[FFmpeg-devel] [PATCH] SSE optimization for DCA decoder

Michael Niedermayer michaelni
Mon Sep 1 05:03:41 CEST 2008


On Thu, Aug 28, 2008 at 10:06:45PM -0400, David Conrad wrote:
> Hi,
>
> Attached gives me about a 45% faster overall DCA decode on my penryn. Name 
> suggestions for the function welcome.
>
> Regression tests pass, and I get bit-identical output.
>
> 81883 dezicycles in ff_dca_qmf_mul_c, 16380 runs, 4 skips
> 81067 dezicycles in ff_dca_qmf_mul_c, 32761 runs, 7 skips
> 82178 dezicycles in ff_dca_qmf_mul_c, 65528 runs, 8 skips
> 82789 dezicycles in ff_dca_qmf_mul_c, 131051 runs, 21 skips
>
> 11990 dezicycles in ff_dca_qmf_mul_sse, 16270 runs, 114 skips
> 12518 dezicycles in ff_dca_qmf_mul_sse, 32538 runs, 230 skips
> 12260 dezicycles in ff_dca_qmf_mul_sse, 65126 runs, 410 skips
> 12254 dezicycles in ff_dca_qmf_mul_sse, 130235 runs, 837 skips

Nice, but as you probably already know, my high-level optimizations
broke your patch.

If you want to update it, also look at ff_mpa_synth_filter(), which performs
the same windowing operation but with a quite different implementation. I
do not know which way is more efficient in SIMD; actually, I don't know
which is better for C either ...
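
For reference, the windowing step can be sketched in plain C as below. This is
only a simplified illustration, assuming a 512-tap window split into 32 outputs
of 16 taps each; the name window_accumulate_ref and the flat history layout are
hypothetical, and neither the dca code nor ff_mpa_synth_filter() is written
this way (both fold signs and buffer offsets into the loop).

```c
#include <stddef.h>

#define SUBBANDS 32   /* outputs produced per call (assumed) */
#define TAPS     16   /* window taps accumulated per output (assumed) */

/* Hypothetical reference helper, not FFmpeg code:
 * out[j] = sum over t of window[j + 32*t] * hist[j + 32*t]. */
static void window_accumulate_ref(float out[SUBBANDS],
                                  const float window[SUBBANDS * TAPS],
                                  const float hist[SUBBANDS * TAPS])
{
    for (int j = 0; j < SUBBANDS; j++) {
        float sum = 0.0f;
        for (int t = 0; t < TAPS; t++) {
            /* taps for one output are spaced SUBBANDS samples apart */
            size_t i = (size_t)j + (size_t)SUBBANDS * t;
            sum += window[i] * hist[i];
        }
        out[j] = sum;
    }
}
```

In this layout, neighbouring outputs j, j+1, ... read neighbouring memory, so
a SIMD version would presumably process several j per iteration instead of
vectorizing the strided inner loop.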

Also, it would be interesting to add a float "ff_mpa_synth_filter", which
would probably make our mp3 decoder faster on normal desktop systems.
We are just missing a float 32-point (type II) DCT for that; otherwise the
code from dca could be shared as is.
Basically, what I am suggesting is that our mp3 decoder should get a
complete float decoding path in addition to the fixed-point path ...

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Old school: Use the lowest level language in which you can solve the problem
            conveniently.
New school: Use the highest level language in which the latest supercomputer
            can solve the problem without the user falling asleep waiting.