[FFmpeg-devel] [PATCH] AAC decoder

Robert Swain robert.swain
Sun May 25 15:55:07 CEST 2008


2008/5/24 Michael Niedermayer <michaelni at gmx.at>:
> On Sat, May 24, 2008 at 06:35:37PM +0100, Robert Swain wrote:
>> 2008/5/23 Michael Niedermayer <michaelni at gmx.at>:
>> > On Fri, May 23, 2008 at 01:59:41PM +0100, Robert Swain wrote:
>> >> Index: aac.c
>> >> ===================================================================
>> >> --- aac.c     (revision 2185)
>> >> +++ aac.c     (working copy)
>> >> @@ -366,7 +366,7 @@
>> >>      DECLARE_ALIGNED_16(float, sine_short_128[128]);
>> >>      DECLARE_ALIGNED_16(float, pow2sf_tab[256]);
>> >>      DECLARE_ALIGNED_16(float, intensity_tab[256]);
>> >> -    DECLARE_ALIGNED_16(float, ivquant_tab[256]);
>> >> +    DECLARE_ALIGNED_16(float, ivquant_tab[128]);
>> >>      MDCTContext mdct;
>> >>      MDCTContext mdct_small;
>> >>      MDCTContext *mdct_ltp;
>> >> @@ -890,8 +890,11 @@
>> >>      // BIAS method instead needs values -1<x<1
>> >>      for (i = 0; i < 256; i++)
>> >>          ac->intensity_tab[i] = pow(0.5, (i - 100) / 4.);
>> >> -    for (i = 0; i < sizeof(ac->ivquant_tab)/sizeof(ac->ivquant_tab[0]); i++)
>> >> -        ac->ivquant_tab[i] = pow(i, 4./3);
>> >> +    for (i = 0; i < sizeof(ac->ivquant_tab)/(sizeof(ac->ivquant_tab[0])<<1); i++) {
>> >> +        int idx = i<<1;
>> >> +        ac->ivquant_tab[idx]     =  pow(i, 4./3);
>> >> +        ac->ivquant_tab[idx + 1] = -ac->ivquant_tab[idx];
>> >> +    }
>> >>
>> >>      if(ac->dsp.float_to_int16 == ff_float_to_int16_c) {
>> >>          ac->add_bias = 385.0f;
>> >
>> >> @@ -1035,13 +1038,12 @@
>> >>  }
>> >>
>> >>  static inline float ivquant(AACContext * ac, int a) {
>> >
>> >> -    static const float sign[2] = { -1., 1. };
>> >>      int tmp = (a>>31);
>> >>      int abs_a = (a^tmp)-tmp;
>> >> -    if (abs_a < sizeof(ac->ivquant_tab)/sizeof(ac->ivquant_tab[0]))
>> >> -        return sign[tmp+1] * ac->ivquant_tab[abs_a];
>> >> +    if (abs_a < sizeof(ac->ivquant_tab)/(sizeof(ac->ivquant_tab[0])<<1))
>> >> +        return ac->ivquant_tab[(abs_a<<1) + !!tmp];
>> >
>> > ehh... this should be:
>> >
>> > if(a + 127U < 255U)
>> >    return ivquant_tab[a + 127U];
>> >
>> > (or other constants depending on what table size is best ...)
>> >
>> >
>> >>      else
>> >> -        return sign[tmp+1] * pow(abs_a, 4./3);
>> >> +        return (2 * tmp + 1) * pow(abs_a, 4./3);
>> >
>> > pow(fabs(a), 1./3) * a;
>>
>> With those suggestions it is much faster. The alternating sign
>> construction for the table wasn't my idea, but I won't name names. :)
>> Anyway, see attached. Benchmarks on the same FAAC encoded South Park
>> episode:
>>
>> old size 256
> [...]
>> 3956 dezicycles in ivquant, 2096816 runs, 336 skips
>>
>> new size 8
> [...]
>> 4840 dezicycles in ivquant, 2066668 runs, 30484 skips
>>
>> new size 16
> [...]
>> 3650 dezicycles in ivquant, 2093424 runs, 3728 skips
>>
>> new size 32
> [...]
>> 3438 dezicycles in ivquant, 2096888 runs, 264 skips
>>
>> new size 64
> [...]
>> 3447 dezicycles in ivquant, 2096915 runs, 237 skips
>>
>> new size 128
> [...]
>> 3431 dezicycles in ivquant, 2096918 runs, 234 skips
>>
>> new size 256
> [...]
>> 3431 dezicycles in ivquant, 2096953 runs, 199 skips
>>
>> new size 512
> [...]
>> 3438 dezicycles in ivquant, 2097093 runs, 59 skips
>>
>> It looks to me like there's little difference in performance when the
>> table is of size 32 or larger. Should I use size 32?
>
> From the numbers i see, yes 32 seems the best choice.
>
> What bitrate did your test file have? High bitrate files might be faster
> with larger tables, so if it was low bitrate then it might be worth retrying
> with some higher bitrate.

This time I used the same audio source, but encoded at 320 kbps with
QuickTime. I included the full listings because some table sizes seem
to behave strangely depending on the number of calls.

size 32

25410 dezicycles in ivquant, 1 runs, 0 skips
42845 dezicycles in ivquant, 2 runs, 0 skips
22742 dezicycles in ivquant, 4 runs, 0 skips
17655 dezicycles in ivquant, 8 runs, 0 skips
10126 dezicycles in ivquant, 16 runs, 0 skips
8335 dezicycles in ivquant, 32 runs, 0 skips
6180 dezicycles in ivquant, 64 runs, 0 skips
5603 dezicycles in ivquant, 128 runs, 0 skips
4851 dezicycles in ivquant, 256 runs, 0 skips
4273 dezicycles in ivquant, 512 runs, 0 skips
3965 dezicycles in ivquant, 1024 runs, 0 skips
3766 dezicycles in ivquant, 2048 runs, 0 skips
3672 dezicycles in ivquant, 4095 runs, 1 skips
3562 dezicycles in ivquant, 8191 runs, 1 skips
3645 dezicycles in ivquant, 16380 runs, 4 skips
4902 dezicycles in ivquant, 30660 runs, 2108 skips
7330 dezicycles in ivquant, 58326 runs, 7210 skips
9221 dezicycles in ivquant, 117693 runs, 13379 skips
11341 dezicycles in ivquant, 245042 runs, 17102 skips
13377 dezicycles in ivquant, 503166 runs, 21122 skips
14854 dezicycles in ivquant, 1026615 runs, 21961 skips
15771 dezicycles in ivquant, 2074387 runs, 22765 skips
16429 dezicycles in ivquant, 4169262 runs, 25042 skips

size 64

3960 dezicycles in ivquant, 1 runs, 0 skips
20075 dezicycles in ivquant, 2 runs, 0 skips
11137 dezicycles in ivquant, 4 runs, 0 skips
6737 dezicycles in ivquant, 8 runs, 0 skips
6710 dezicycles in ivquant, 16 runs, 0 skips
5221 dezicycles in ivquant, 32 runs, 0 skips
4439 dezicycles in ivquant, 64 runs, 0 skips
4341 dezicycles in ivquant, 127 runs, 1 skips
4037 dezicycles in ivquant, 255 runs, 1 skips
3828 dezicycles in ivquant, 511 runs, 1 skips
3722 dezicycles in ivquant, 1022 runs, 2 skips
3649 dezicycles in ivquant, 2046 runs, 2 skips
3555 dezicycles in ivquant, 4094 runs, 2 skips
3481 dezicycles in ivquant, 8189 runs, 3 skips
3521 dezicycles in ivquant, 16380 runs, 4 skips
4443 dezicycles in ivquant, 30725 runs, 2043 skips
6640 dezicycles in ivquant, 58957 runs, 6579 skips
8893 dezicycles in ivquant, 118646 runs, 12426 skips
9859 dezicycles in ivquant, 246858 runs, 15286 skips
10662 dezicycles in ivquant, 504245 runs, 20043 skips
11167 dezicycles in ivquant, 1022300 runs, 26276 skips
11471 dezicycles in ivquant, 2062355 runs, 34797 skips
11718 dezicycles in ivquant, 4147408 runs, 46896 skips

size 128

5610 dezicycles in ivquant, 1 runs, 0 skips
3850 dezicycles in ivquant, 2 runs, 0 skips
2805 dezicycles in ivquant, 4 runs, 0 skips
2227 dezicycles in ivquant, 8 runs, 0 skips
2014 dezicycles in ivquant, 16 runs, 0 skips
2547 dezicycles in ivquant, 32 runs, 0 skips
2750 dezicycles in ivquant, 64 runs, 0 skips
2854 dezicycles in ivquant, 128 runs, 0 skips
2917 dezicycles in ivquant, 256 runs, 0 skips
2904 dezicycles in ivquant, 512 runs, 0 skips
2932 dezicycles in ivquant, 1024 runs, 0 skips
2933 dezicycles in ivquant, 2048 runs, 0 skips
2927 dezicycles in ivquant, 4096 runs, 0 skips
2927 dezicycles in ivquant, 8191 runs, 1 skips
2928 dezicycles in ivquant, 16373 runs, 11 skips
3949 dezicycles in ivquant, 31295 runs, 1473 skips
6272 dezicycles in ivquant, 61122 runs, 4414 skips
8573 dezicycles in ivquant, 124286 runs, 6786 skips
8344 dezicycles in ivquant, 254207 runs, 7937 skips
8218 dezicycles in ivquant, 513267 runs, 11021 skips
8234 dezicycles in ivquant, 1031575 runs, 17001 skips
7929 dezicycles in ivquant, 2070136 runs, 27016 skips
7687 dezicycles in ivquant, 4148129 runs, 46175 skips

size 256

7590 dezicycles in ivquant, 1 runs, 0 skips
770880 dezicycles in ivquant, 2 runs, 0 skips
394762 dezicycles in ivquant, 4 runs, 0 skips
203293 dezicycles in ivquant, 8 runs, 0 skips
102822 dezicycles in ivquant, 16 runs, 0 skips
53284 dezicycles in ivquant, 32 runs, 0 skips
29033 dezicycles in ivquant, 64 runs, 0 skips
16310 dezicycles in ivquant, 128 runs, 0 skips
9976 dezicycles in ivquant, 256 runs, 0 skips
6775 dezicycles in ivquant, 511 runs, 1 skips
5193 dezicycles in ivquant, 1023 runs, 1 skips
4383 dezicycles in ivquant, 2047 runs, 1 skips
3898 dezicycles in ivquant, 4094 runs, 2 skips
3434 dezicycles in ivquant, 8188 runs, 4 skips
3535 dezicycles in ivquant, 16379 runs, 5 skips
4326 dezicycles in ivquant, 32484 runs, 284 skips
5910 dezicycles in ivquant, 64603 runs, 933 skips
6857 dezicycles in ivquant, 129428 runs, 1644 skips
6261 dezicycles in ivquant, 259904 runs, 2240 skips
5899 dezicycles in ivquant, 520268 runs, 4020 skips
5739 dezicycles in ivquant, 1041312 runs, 7264 skips
5409 dezicycles in ivquant, 2083386 runs, 13766 skips
5174 dezicycles in ivquant, 4166995 runs, 27309 skips

size 512

9570 dezicycles in ivquant, 1 runs, 0 skips
6215 dezicycles in ivquant, 2 runs, 0 skips
4235 dezicycles in ivquant, 4 runs, 0 skips
3093 dezicycles in ivquant, 8 runs, 0 skips
2715 dezicycles in ivquant, 16 runs, 0 skips
3238 dezicycles in ivquant, 32 runs, 0 skips
3406 dezicycles in ivquant, 64 runs, 0 skips
3488 dezicycles in ivquant, 128 runs, 0 skips
3564 dezicycles in ivquant, 256 runs, 0 skips
3551 dezicycles in ivquant, 512 runs, 0 skips
3578 dezicycles in ivquant, 1024 runs, 0 skips
3570 dezicycles in ivquant, 2047 runs, 1 skips
3568 dezicycles in ivquant, 4094 runs, 2 skips
3494 dezicycles in ivquant, 8188 runs, 4 skips
3608 dezicycles in ivquant, 16376 runs, 8 skips
3832 dezicycles in ivquant, 32692 runs, 76 skips
4276 dezicycles in ivquant, 65271 runs, 265 skips
4544 dezicycles in ivquant, 130527 runs, 545 skips
4271 dezicycles in ivquant, 261270 runs, 874 skips
4118 dezicycles in ivquant, 522639 runs, 1649 skips
4104 dezicycles in ivquant, 1045559 runs, 3017 skips
3973 dezicycles in ivquant, 2091108 runs, 6044 skips
3826 dezicycles in ivquant, 4183674 runs, 10630 skips

size 1024

4290 dezicycles in ivquant, 1 runs, 0 skips
3300 dezicycles in ivquant, 2 runs, 0 skips
2475 dezicycles in ivquant, 4 runs, 0 skips
2117 dezicycles in ivquant, 8 runs, 0 skips
2014 dezicycles in ivquant, 16 runs, 0 skips
2533 dezicycles in ivquant, 32 runs, 0 skips
2724 dezicycles in ivquant, 64 runs, 0 skips
2825 dezicycles in ivquant, 128 runs, 0 skips
2899 dezicycles in ivquant, 256 runs, 0 skips
2901 dezicycles in ivquant, 512 runs, 0 skips
2922 dezicycles in ivquant, 1024 runs, 0 skips
3251 dezicycles in ivquant, 2048 runs, 0 skips
4046 dezicycles in ivquant, 4096 runs, 0 skips
3485 dezicycles in ivquant, 8188 runs, 4 skips
3395 dezicycles in ivquant, 16378 runs, 6 skips
3528 dezicycles in ivquant, 32748 runs, 20 skips
3511 dezicycles in ivquant, 65459 runs, 77 skips
3497 dezicycles in ivquant, 130912 runs, 160 skips
3426 dezicycles in ivquant, 261868 runs, 276 skips
3401 dezicycles in ivquant, 523754 runs, 534 skips
3363 dezicycles in ivquant, 1047665 runs, 911 skips
3290 dezicycles in ivquant, 2095548 runs, 1604 skips
3250 dezicycles in ivquant, 4191225 runs, 3079 skips

size 2048

4950 dezicycles in ivquant, 1 runs, 0 skips
18535 dezicycles in ivquant, 2 runs, 0 skips
10175 dezicycles in ivquant, 4 runs, 0 skips
9638 dezicycles in ivquant, 8 runs, 0 skips
5781 dezicycles in ivquant, 16 runs, 0 skips
4410 dezicycles in ivquant, 32 runs, 0 skips
3741 dezicycles in ivquant, 64 runs, 0 skips
3343 dezicycles in ivquant, 127 runs, 1 skips
3157 dezicycles in ivquant, 255 runs, 1 skips
3027 dezicycles in ivquant, 511 runs, 1 skips
2989 dezicycles in ivquant, 1023 runs, 1 skips
3345 dezicycles in ivquant, 2046 runs, 2 skips
3258 dezicycles in ivquant, 4094 runs, 2 skips
3580 dezicycles in ivquant, 8188 runs, 4 skips
3401 dezicycles in ivquant, 16380 runs, 4 skips
3246 dezicycles in ivquant, 32761 runs, 7 skips
3184 dezicycles in ivquant, 65513 runs, 23 skips
3149 dezicycles in ivquant, 131034 runs, 38 skips
3125 dezicycles in ivquant, 262075 runs, 69 skips
3113 dezicycles in ivquant, 524152 runs, 136 skips
3124 dezicycles in ivquant, 1048300 runs, 276 skips
3124 dezicycles in ivquant, 2096582 runs, 570 skips
3109 dezicycles in ivquant, 4193283 runs, 1021 skips
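
A note on reading the listings above: runs plus skips is a power of two
in these lines (for example 4095 + 1 = 4096 and 30660 + 2108 = 32768),
and the cycle figure looks like a running average over the accepted
runs. The sketch below shows one way such counters can work, assuming
behaviour similar to FFmpeg's START_TIMER/STOP_TIMER macros: a sample at
least eight times the current average is counted as a skip instead of
being averaged in. TimerStats, timer_record and timer_dezicycles are
names made up for this illustration, and the factor 8 is an assumption,
not something that can be read off the listings.

```c
#include <stdint.h>

typedef struct {
    uint64_t sum;   /* total cycles over the accepted samples */
    int      runs;  /* number of accepted samples */
    int      skips; /* number of samples rejected as outliers */
} TimerStats;

/* Record one measurement. Returns 1 when runs + skips is a power of
 * two, which is when a summary line would be printed. */
static int timer_record(TimerStats *s, uint64_t elapsed)
{
    int total;

    /* Assumed outlier rule: always accept the first two samples, then
     * reject anything at least 8x the current average. */
    if (s->runs < 2 || elapsed < 8 * s->sum / s->runs) {
        s->sum += elapsed;
        s->runs++;
    } else {
        s->skips++;
    }

    total = s->runs + s->skips;
    return (total & (total - 1)) == 0;
}

/* Average cost per accepted run, in tenths of a cycle. */
static uint64_t timer_dezicycles(const TimerStats *s)
{
    return s->sum * 10 / s->runs;
}
```

Under that assumption, the skips would mostly be calls that were much
slower than average, which may be why their count falls as the table
grows.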

> [...]
>> +    for (i = 1; i < IVQUANT_SIZE/2; i++) {
>> +        ac->ivquant_tab[IVQUANT_SIZE/2 - 1 + i] =  pow(i, 4./3);
>> +        ac->ivquant_tab[IVQUANT_SIZE/2 - 1 - i] = -ac->ivquant_tab[IVQUANT_SIZE/2 - 1 + i];
>> +    }
>
> cant that be simplified with pow(fabs(i), 1./3) * i as well?

Done. I had some issues mixing unsigned and signed values in the loop
limits, so I introduced IVQUANT_SIZE_U for the unsigned size and
IVQUANT_SIZE_S for a signed version of the same. I don't know if this
is the best approach, but it seems OK to me. See attached.

Rob

PS - I love that quote...

> Many that live deserve death. And some that die deserve life. Can you give
> it to them? Then do not be too eager to deal out death in judgement. For
> even the very wise cannot see all ends. -- Gandalf
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 20080525-1317-merge_sign_into_ivquant.diff
Type: text/x-diff
Size: 1713 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20080525/08087961/attachment.diff>


