[FFmpeg-devel] [RFC] ac3dec: use dsputil.clear_block

Måns Rullgård mans
Wed Jan 13 23:15:21 CET 2010


Reimar Döffinger <Reimar.Doeffinger at gmx.de> writes:

> On Wed, Jan 13, 2010 at 09:03:32PM +0000, Måns Rullgård wrote:
>> Reimar Döffinger <Reimar.Doeffinger at gmx.de> writes:
>> 
>> > Hello,
>> > this gives an overall speedup of about 1.1% on Intel Atom with my sample.
>> > Testing with other CPUs and samples is very welcome; I suspect a slowdown
>> > may be possible, besides it being a bit ugly.
>> > Index: libavcodec/ac3dec.c
>> > ===================================================================
>> > --- libavcodec/ac3dec.c	(revision 21191)
>> > +++ libavcodec/ac3dec.c	(working copy)
>> > @@ -565,6 +566,7 @@
>> >   */
>> >  static void decode_transform_coeffs(AC3DecodeContext *s, int blk)
>> >  {
>> > +    const int clearsize = 64 * sizeof(DCTELEM) / sizeof(s->fixed_coeffs[0][0]);
>> >      int ch, end;
>> >      int got_cplchan = 0;
>> >      mant_groups m;
>> > @@ -586,9 +588,12 @@
>> >          } else {
>> >              end = s->end_freq[ch];
>> >          }
>> > -        do
>> > -            s->fixed_coeffs[ch][end] = 0;
>> > -        while(++end < 256);
>> > +        while (end & (clearsize - 1))
>> > +            s->fixed_coeffs[ch][end++] = 0;
>> > +        while (end < 256) {
>> > +            s->dsp.clear_block((DCTELEM *)s->fixed_coeffs[ch] + end);
>> > +            end += clearsize;
>> > +        }
>> >      }
>> 
>> Did you try a simple memset()?
>
> In the variants I tried, it was horribly slow.

Maybe we need a more generic memset-like function.
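
A rough sketch of what such a function could look like, assuming 32-bit
coefficients and a fixed 128-byte clear primitive (the size clear_block
handles for 64 16-bit DCTELEMs). The names clear_block_128 and
clear_tail_i32 are hypothetical and not part of dsputil; the plain memset()
inside clear_block_128 only stands in for an optimised routine, which would
typically also impose alignment requirements that this sketch ignores.

```c
#include <stdint.h>
#include <string.h>

#define CLEAR_BLOCK_BYTES 128   /* 64 coefficients of 16 bits each */

/* Hypothetical stand-in for an optimised fixed-size clear routine. */
static void clear_block_128(void *p)
{
    memset(p, 0, CLEAR_BLOCK_BYTES);
}

/* Hypothetical generic helper: zero buf[start] .. buf[len - 1]. */
static void clear_tail_i32(int32_t *buf, int start, int len)
{
    /* Number of elements covered by one block clear (32 here). */
    const int step = CLEAR_BLOCK_BYTES / sizeof(*buf);
    int i = start;

    /* Scalar head: advance to the next multiple of step. */
    while (i < len && (i & (step - 1)))
        buf[i++] = 0;

    /* Bulk: clear whole blocks while a full one still fits. */
    while (i + step <= len) {
        clear_block_128(buf + i);
        i += step;
    }

    /* Scalar tail: only needed when len is not a multiple of step. */
    while (i < len)
        buf[i++] = 0;
}
```

With len = 256 the tail loop never runs, so this reduces to the two loops in
the patch; the extra bounds checks are what a generic version would cost.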

-- 
Måns Rullgård
mans at mansr.com


