[FFmpeg-devel] [PATCH] strict-aliasing-safe aes.c

Michael Niedermayer michaelni
Wed Jun 30 00:17:06 CEST 2010


On Wed, Jun 30, 2010 at 12:05:26AM +0200, Reimar D?ffinger wrote:
> On Tue, Jun 29, 2010 at 11:29:42PM +0200, Michael Niedermayer wrote:
> > On Tue, Jun 29, 2010 at 07:27:44PM +0200, Reimar D?ffinger wrote:
> > > On Tue, Jun 29, 2010 at 06:16:03PM +0200, Reimar D?ffinger wrote:
> > > > On Tue, Jun 29, 2010 at 05:10:09PM +0100, M?ns Rullg?rd wrote:
> > > > > Reimar D?ffinger <Reimar.Doeffinger at gmx.de> writes:
> > > > > >> Are these API changes necessary?  Can't the rest of the fixes be done
> > > > > >> without this?
> > > > > >
> > > > > > No, they are not necessary, however I am in favour of at least having
> > > > > > av_aes_block available to allow for easy allocation of data
> > > > > > with suitable alignment.
> > > > > 
> > > > > Exposing internals like that is less than ideal.  Simply documenting
> > > > > the 8-byte alignment requirement should be enough.
> > > > 
> > > > I guess I agree, I'll make the change if the idea is ok'd in principle.
> > > 
> > > Oh well, I did it anyway already since I noticed I broke the more-than-one
> > > block case.
> > 
> > >  aes.c |   74 +++++++++++++++++++++++++++++++++++++-----------------------------
> > >  1 file changed, 42 insertions(+), 32 deletions(-)
> > > 172e32357bec3f033440078a57b908ff696472d7  aesalias.diff
> > 
> > if its confirmed that there is no speedloss and the object file does not
> > become larger than i wont object to the patch. But its still ugly
> 
> Well, I had to test with gcc-4.3, since 4.4 does not produce working code...
> Pure assembler size increases from 3957 bytes to 4025, for what I can tell
> mostly because gcc randomly decides to use some extra bytes for the local
> variables, which just makes it having to use 4-byte instead of 1-byte offsets
> to address function parameters, which in turn then causes some alignment
> to increase as well.
> As for speed, with aes-test the results seem to indicate a surprising
> 2% performance advantage for the new code (I'd expect due to some
> other random code or data re-shuffling):
> old:
> 13377 dezicycles in aes, 8191 runs, 1 skips
> 13387 dezicycles in aes, 8192 runs, 0 skips
> 13368 dezicycles in aes, 8191 runs, 1 skips
> 13360 dezicycles in aes, 8191 runs, 1 skips
> 14279 dezicycles in aes, 8191 runs, 1 skips
> 
> new:
> 12929 dezicycles in aes, 8191 runs, 1 skips
> 12926 dezicycles in aes, 8191 runs, 1 skips
> 12916 dezicycles in aes, 8191 runs, 1 skips
> 12913 dezicycles in aes, 8191 runs, 1 skips
> 12973 dezicycles in aes, 8191 runs, 1 skips
> 
> I'd welcome someone else to test.
> Does this seem good enough to you?

commit it before i change my mind

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

I have never wished to cater to the crowd; for what I know they do not
approve, and what they approve I do not know. -- Epicurus
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: Digital signature
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20100630/228bd327/attachment.pgp>



More information about the ffmpeg-devel mailing list