[FFmpeg-devel] [PATCH 3/3] Use DSPContext.vector_fmul() and DSPContext.vector_fmul_reverse() in floating-point version of apply_window(). 46% faster in function apply_window().

Justin Ruggles justin.ruggles
Sun Jan 2 04:30:32 CET 2011


On 01/01/2011 10:09 PM, Michael Niedermayer wrote:

> On Fri, Dec 31, 2010 at 03:11:40PM -0500, Justin Ruggles wrote:
>> diff --git libavcodec/ac3enc_float.c libavcodec/ac3enc_float.c
>> index 6a061d6..addc84f 100644
>> --- libavcodec/ac3enc_float.c
>> +++ libavcodec/ac3enc_float.c
>> @@ -77,16 +77,13 @@ static void mdct512(AC3MDCTContext *mdct, float *out, float *in)
>>  /**
>>   * Apply KBD window to input samples prior to MDCT.
>>   */
>> -static void apply_window(float *output, const float *input,
>> +static void apply_window(DSPContext *dsp, float *output, const float *input,
>>                           const float *window, int n)
>>  {
>> -    int i;
>>      int n2 = n >> 1;
>> -
>> -    for (i = 0; i < n2; i++) {
>> -        output[i]     = input[i]     * window[i];
>> -        output[n-i-1] = input[n-i-1] * window[i];
>> -    }
>> +    memcpy(output, input, n2 * sizeof(*input));
>> +    dsp->vector_fmul(output, window, n2);
>> +    dsp->vector_fmul_reverse(output+n2, input+n2, window, n2);
> 
> The memcpy is ugly


Yeah, I know. I'll see if I can implement a new version of
vector_fmul that takes separate input and output buffers, and compare
the speed.

-Justin
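
For illustration, here is a minimal scalar sketch of what a vector_fmul with separate input and output could look like, and how it would remove the memcpy from apply_window(). The names vector_fmul_3op, vector_fmul_reverse_ref and apply_window_sketch are hypothetical and are not part of the posted patch or of DSPContext. The reversed multiply is modeled from the way the patch calls it (the window is read back to front), and the functions are plain C loops, not the optimized DSP versions.

```c
#include <assert.h>

/* Hypothetical three-operand multiply: dst[i] = src0[i] * src1[i].
 * The in-place call in the patch is dsp->vector_fmul(output, window, n2),
 * which is why the input had to be copied into output first. */
static void vector_fmul_3op(float *dst, const float *src0,
                            const float *src1, int len)
{
    for (int i = 0; i < len; i++)
        dst[i] = src0[i] * src1[i];
}

/* Scalar model of the reversed multiply as it is used in the patch:
 * dst[i] = src0[i] * src1[len - 1 - i]. */
static void vector_fmul_reverse_ref(float *dst, const float *src0,
                                    const float *src1, int len)
{
    for (int i = 0; i < len; i++)
        dst[i] = src0[i] * src1[len - 1 - i];
}

/* apply_window() without the memcpy: the first half is windowed
 * front to back, the second half with the window mirrored. */
static void apply_window_sketch(float *output, const float *input,
                                const float *window, int n)
{
    int n2 = n >> 1;

    assert((n & 1) == 0); /* the mirrored indexing assumes an even n */
    vector_fmul_3op(output, input, window, n2);
    vector_fmul_reverse_ref(output + n2, input + n2, window, n2);
}
```

With an even n this produces the same values as the original loop, since output[n-i-1] = input[n-i-1] * window[i] is the same as output[n2+j] = input[n2+j] * window[n2-1-j]. Whether it is actually faster than memcpy plus the in-place multiply would still have to be measured.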


