[FFmpeg-devel] [PATCH] Move MLP's dot product to DSPContext

Michael Niedermayer michaelni
Mon Apr 20 14:40:32 CEST 2009


On Mon, Apr 20, 2009 at 02:29:09AM -0300, Ramiro Polla wrote:
> Hi,
> 
> On Mon, Apr 20, 2009 at 12:14 AM, Michael Niedermayer <michaelni at gmx.at> wrote:
> > On Sun, Apr 19, 2009 at 10:10:05PM -0300, Ramiro Polla wrote:
> >> The attached file moves MLP's dot product to DSPContext. The filter order
> >> is a maximum of 8, and in the rematrix stage it's a maximum of 5+2
> >> channels for MLP and 7+0 channels for TrueHD, so it all fits in 8
> >> (hopefully) optimized functions.
> >
> > the functions are too small, the call overhead is too much
> > 1-8 multiplications and 1-8 additions is not enough ...
> 
> I thought that would happen too, but strangely there was a speedup.

you wrote the whole function in asm() and that was slower?
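
For readers following along, the trade-off in question (eight tiny
fixed-order functions reached through a function pointer) can be sketched
roughly as below. This is only an illustration, not the attached patch:
SketchDSP, sketch_dsp_init(), sketch_dot() and the dot_orderN names are
hypothetical, and int32_t for both state and coeffs is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Generates one fixed-order dot product. The constant trip count lets the
 * compiler unroll the loop completely. */
#define DEFINE_DOT(N)                                                     \
static int64_t dot_order##N(const int32_t *state, const int32_t *coeffs) \
{                                                                         \
    int64_t accum = 0;                                                    \
    for (int i = 0; i < N; i++)                                           \
        accum += (int64_t) state[i] * coeffs[i];                          \
    return accum;                                                         \
}

DEFINE_DOT(1) DEFINE_DOT(2) DEFINE_DOT(3) DEFINE_DOT(4)
DEFINE_DOT(5) DEFINE_DOT(6) DEFINE_DOT(7) DEFINE_DOT(8)

typedef int64_t (*dot_fn)(const int32_t *state, const int32_t *coeffs);

/* Hypothetical stand-in for a DSPContext-style table; not FFmpeg's struct. */
typedef struct SketchDSP {
    dot_fn dot[8];              /* dot[n - 1] handles order n */
} SketchDSP;

static void sketch_dsp_init(SketchDSP *c)
{
    c->dot[0] = dot_order1; c->dot[1] = dot_order2;
    c->dot[2] = dot_order3; c->dot[3] = dot_order4;
    c->dot[4] = dot_order5; c->dot[5] = dot_order6;
    c->dot[6] = dot_order7; c->dot[7] = dot_order8;
}

/* The call goes through a pointer, so it is generally not inlined at the
 * call site; that indirect call is the overhead being discussed. */
static int64_t sketch_dot(const SketchDSP *c, const int32_t *state,
                          const int32_t *coeffs, int order)
{
    assert(order >= 0 && order <= 8);
    return order ? c->dot[order - 1](state, coeffs) : 0;
}
```

Whether the indirect call costs more than the unrolling saves depends on
the compiler and CPU, so it has to be measured rather than assumed.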


[...]
> >> {
> >>     int64_t accum = 0;
> >>
> >>     while (order--)
> >>         accum += (int64_t) *state++ * *coeffs++;
> >
> > switch(order){
> > case 8: accum  = (int64_t) *state++ * *coeffs++;
> > case 7: accum += (int64_t) *state++ * *coeffs++;
> > case 6: accum += (int64_t) *state++ * *coeffs++;
> > case 5: accum += (int64_t) *state++ * *coeffs++;
> > case 4: accum += (int64_t) *state++ * *coeffs++;
> > case 3: accum += (int64_t) *state++ * *coeffs++;
> > case 2: accum += (int64_t) *state++ * *coeffs++;
> > case 1: accum += (int64_t) *state   * *coeffs  ;
> > case 0:
> > }
> 
> This makes it 5.4% slower than the current code.

try asm()
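
To make the two C variants compared above concrete, here is a minimal
standalone sketch of the loop form and the fall-through switch form. The
function names dot_loop() and dot_switch() are made up for illustration,
int32_t operands are an assumption, and "case 0" gets an explicit break
because a label needs a statement after it in C before C23.

```c
#include <assert.h>
#include <stdint.h>

/* Loop form: one multiply-accumulate per tap, widened to 64 bits. */
static int64_t dot_loop(const int32_t *state, const int32_t *coeffs, int order)
{
    int64_t accum = 0;

    assert(order >= 0);
    while (order--)
        accum += (int64_t) *state++ * *coeffs++;
    return accum;
}

/* Unrolled form: jump into the chain at "order" and fall through to 0,
 * so exactly "order" products are accumulated starting at state[0]. */
static int64_t dot_switch(const int32_t *state, const int32_t *coeffs, int order)
{
    int64_t accum = 0;

    assert(order >= 0 && order <= 8);
    switch (order) {
    case 8: accum += (int64_t) *state++ * *coeffs++; /* fall through */
    case 7: accum += (int64_t) *state++ * *coeffs++; /* fall through */
    case 6: accum += (int64_t) *state++ * *coeffs++; /* fall through */
    case 5: accum += (int64_t) *state++ * *coeffs++; /* fall through */
    case 4: accum += (int64_t) *state++ * *coeffs++; /* fall through */
    case 3: accum += (int64_t) *state++ * *coeffs++; /* fall through */
    case 2: accum += (int64_t) *state++ * *coeffs++; /* fall through */
    case 1: accum += (int64_t) *state   * *coeffs;   /* fall through */
    case 0: break;
    }
    return accum;
}
```

Both forms return the same sum for every order from 0 to 8; only the
generated code differs, which is what the benchmark figures are about.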


> 
> > also state[i] * coeffs[i]; i++
> > could be tried
> 
> 8.1% faster than current code, but a bit slower than the
> while(order--) code, which gives 10.8%.

again try asm
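
For reference, the indexed variant suggested above would look roughly like
this in plain C. It is a sketch only: dot_indexed() is a made-up name and
int32_t operands are an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Indexed form: the pointers stay fixed and a single counter is advanced,
 * instead of incrementing state and coeffs separately. */
static int64_t dot_indexed(const int32_t *state, const int32_t *coeffs, int order)
{
    int64_t accum = 0;

    assert(order >= 0);
    for (int i = 0; i < order; i++)
        accum += (int64_t) state[i] * coeffs[i];
    return accum;
}
```

The cast on the first operand matters: without it the product of two
32-bit values would be computed in 32 bits and could overflow before
being added to the 64-bit accumulator.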

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

When the tyrant has disposed of foreign enemies by conquest or treaty, and
there is nothing more to fear from them, then he is always stirring up
some war or other, in order that the people may require a leader. -- Plato