[FFmpeg-devel] [PATCH 2/3] avcodec/aacsbr: Add comment about possibly optimization in sbr_dequant()

Fri Dec 11 18:27:16 CET 2015

On 11.12.2015 18:09, Ganesh Ajjanagadde wrote:
> On Fri, Dec 11, 2015 at 11:36 AM, Andreas Cadhalpun
> <andreas.cadhalpun at googlemail.com> wrote:
>> On 11.12.2015 17:21, Ganesh Ajjanagadde wrote:
>>> On Fri, Dec 11, 2015 at 11:16 AM, Andreas Cadhalpun
>>> <andreas.cadhalpun at googlemail.com> wrote:
>>>> On 19.11.2015 14:17, Michael Niedermayer wrote:
>>>>> From: Michael Niedermayer <michael at niedermayer.cc>
>>>>>
>>>>> Signed-off-by: Michael Niedermayer <michael at niedermayer.cc>
>>>>> ---
>>>>>  libavcodec/aacsbr.c |    1 +
>>>>>  1 file changed, 1 insertion(+)
>>>>>
>>>>> diff --git a/libavcodec/aacsbr.c b/libavcodec/aacsbr.c
>>>>> index d1e3a91..e014646 100644
>>>>> --- a/libavcodec/aacsbr.c
>>>>> +++ b/libavcodec/aacsbr.c
>>>>> @@ -73,6 +73,7 @@ static void sbr_dequant(SpectralBandReplication *sbr, int id_aac)
>>>>>  {
>>>>>      int k, e;
>>>>>      int ch;
>>>>> +    //TODO: Replace exp2f(0.5*x) by a LUT, the inputs are all integer and have a small range
>>>>>
>>>>>      if (id_aac == TYPE_CPE && sbr->bs_coupling) {
>>>>>          float alpha      = sbr->data[0].bs_amp_res ?  1.0f :  0.5f;
>>>>>
>>>>
>>>> This shouldn't hurt, with or without the clarification requested by Ganesh.
>>>
>>> I am doing related work cleaning up and optimizing usages of slow libm
>>> functions such as pow and exp2. Do you know the exact possible range
>>> of the inputs x, and if so, can it be added to the comment? That will
>>> be very helpful for me to come up with a patch. Thanks.
>>
>> The exp2f expressions are:
>> exp2f(sbr->data[0].env_facs_q[e][k] * alpha + 7.0f);
>> exp2f((pan_offset - sbr->data[1].env_facs_q[e][k]) * alpha);
>> exp2f(NOISE_FLOOR_OFFSET - sbr->data[0].noise_facs_q[e][k] + 1);
>> exp2f(12 - sbr->data[1].noise_facs_q[e][k]);
>> exp2f(alpha * sbr->data[ch].env_facs_q[e][k] + 6.0f);
>> exp2f(NOISE_FLOOR_OFFSET - sbr->data[ch].noise_facs_q[e][k]);
>>
>> Here alpha is 1 or 0.5, pan_offset 12 or 24 and NOISE_FLOOR_OFFSET is 6.
>> After patch 3 of this series, env_facs_q is in the range from 0 to 127 and
>> noise_facs_q is already limited to the range from 0 to 30.
>>
>> So x should always be in the range -300..300, or so.
> 
> Very good, thanks a lot.
> 
> Based on the above range, my idea is to not even use a LUT, but use
> something like exp2fi followed by multiplication by M_SQRT2 depending
> on even or odd. This will not bloat the binary, but is still very fast
> and avoids huge variability in performance. That should provide a good
> baseline (see jpeg2000 for this idea); further tweaks can be done (e.g.
> using an exp2i, i.e. double precision, to avoid possible branching for
> the overflow/underflow cases).

That sounds good. :)

> Maybe exp2i and/or exp2fi could be moved to avutil/internal or more
> appropriately avcodec/internal as they have utility in this, jpeg2000,
> and at least one other place in avcodec (which I can't recall).

Moving to avcodec/internal should be fine.

> Will be addressed in a week or so, unless someone does it before then.
> This is very quick to do, and so this patch may not be needed.

Indeed, there is no need to add the comment if it's removed a week later.

Best regards,
Andreas