[FFmpeg-devel] [PATCH] iirfilter: Use local variables for state in loop for FILTER_O2().

Justin Ruggles justin.ruggles
Sun Jan 30 20:05:57 CET 2011


On 01/30/2011 01:51 PM, M?ns Rullg?rd wrote:

> Justin Ruggles <justin.ruggles at gmail.com> writes:
> 
>> 4% faster 2nd order ff_iir_filter_flt().
>> ---
>>  libavcodec/iirfilter.c |   12 +++++++-----
>>  1 files changed, 7 insertions(+), 5 deletions(-)
>>
>> I tried most of yesterday and this morning trying to make an asm
>> version of the float biquad filter, but nothing I came up with was
>> faster than what gcc did with the C version.  I did, however manage
>> to speed up the C version by about 4% by adding local variables
>> inside the inner loop for the 2 states.
>>
>> diff --git a/libavcodec/iirfilter.c b/libavcodec/iirfilter.c
>> index bc63c39..dd593dd 100644
>> --- a/libavcodec/iirfilter.c
>> +++ b/libavcodec/iirfilter.c
>> @@ -261,11 +261,13 @@ av_cold struct FFIIRFilterState* ff_iir_filter_init_state(int order)
>>      const type *src0 = src;                                             \
>>      type       *dst0 = dst;                                             \
>>      for (i = 0; i < size; i++) {                                        \
>> -        float in = *src0   * c->gain  +                                 \
>> -                   s->x[0] * c->cy[0] +                                 \
>> -                   s->x[1] * c->cy[1];                                  \
>> -        CONV_##fmt(*dst0, s->x[0] + in + s->x[1] * c->cx[1])            \
>> -        s->x[0] = s->x[1];                                              \
>> +        float s0 = s->x[0];                                             \
>> +        float s1 = s->x[1];                                             \
>> +        float in = *src0 * c->gain  +                                   \
>> +                   s0    * c->cy[0] +                                   \
>> +                   s1    * c->cy[1];                                    \
>> +        CONV_##fmt(*dst0, in + s0 + s1 * c->cx[1])                      \
>> +        s->x[0] = s1;                                                   \
>>          s->x[1] = in;                                                   \
>>          src0 += sstep;                                                  \
>>          dst0 += dstep;                                                  \
> 
> Why do you do load/store the struct values in the loop?  Wouldn't it
> be better to load the x[] values to locals before the loop and write
> them back after?  You might try doing the same with c[xy] as well.


Yes, that was my first thought, and I tried with state and with coefs
outside the loop.  They were slower.

-Justin



More information about the ffmpeg-devel mailing list