[FFmpeg-devel] [PATCH] x86/flacdsp: add SSE2 and AVX decorrelate functions

James Almer jamrial at gmail.com
Sun Nov 2 23:55:35 CET 2014


On 02/11/14 7:43 PM, Clément Bœsch wrote:
> On Sun, Nov 02, 2014 at 07:31:48PM -0300, James Almer wrote:
>> Two to four times faster depending on instruction set, block size and channel count.
>>
>> Signed-off-by: James Almer <jamrial at gmail.com>
>> ---
>> TODO: 16 bits indep for 4, 6 and 8 channels. 24/32 bits indep for 8 channels.
>>       AVX2 and maybe MMX versions.
>>       Planar?
>>
>>  libavcodec/arm/flacdsp_init_arm.c |   2 +-
>>  libavcodec/flacdec.c              |   6 +-
>>  libavcodec/flacdsp.c              |   6 +-
>>  libavcodec/flacdsp.h              |   6 +-
>>  libavcodec/flacenc.c              |   2 +-
>>  libavcodec/x86/flacdsp.asm        | 206 ++++++++++++++++++++++++++++++++++++++
>>  libavcodec/x86/flacdsp_init.c     |  48 ++++++++-
>>  7 files changed, 264 insertions(+), 12 deletions(-)
> [...]
>> +    mova       m0, [in0q]
>> +    mova       m1, [in0q+in1q]
>> +%if %1 > 2
>> +    mova       m2, [in0q+in2q]
>> +    mova       m3, [in0q+in3q]
>> +%if %1 > 4
>> +    mova       m4, [in0q+in4q]
>> +    mova       m5, [in0q+in5q]
>> +%endif
>> +%endif
>> +    pslld      m0, m%2
>> +    pslld      m1, m%2
>> +%if %1 > 2
>> +    pslld      m2, m%2
>> +    pslld      m3, m%2
>> +%if %1 > 4
>> +    pslld      m4, m%2
>> +    pslld      m5, m%2
>> +%endif
>> +%endif
> 
> Can't you do something like this? (untested)
>     pslld      m0, [in0q], m%2
>     %assign i 0
>     %rep %1
>     pslld      m%i, [in0q+in%iq], m%2
>     %assign    i i+1
>     %endrep

YASM    libavcodec/x86/flacdsp.o
D:/MinGW/msys/1.0/ffmpeg/src/libavcodec/x86/flacdsp.asm:271: error: undefined symbol `m' (first use)
D:/MinGW/msys/1.0/ffmpeg/src/libavcodec/x86/flacdsp.asm:271: error: undefined symbol `i' (first use)
D:/MinGW/msys/1.0/ffmpeg/src/libavcodec/x86/flacdsp.asm:271: error: undefined symbol `in' (first use)
D:/MinGW/msys/1.0/ffmpeg/src/libavcodec/x86/flacdsp.asm:271: error: undefined symbol `iq' (first use)
D:/MinGW/msys/1.0/ffmpeg/src/libavcodec/x86/flacdsp.asm:271: error:  (Each undefined symbol is reported only once.)
make: *** [libavcodec/x86/flacdsp.o] Error 1

A %rep like that is only four lines shorter. Do you consider it enough of a readability
improvement over the alternative to justify getting it to work?
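
For reference, the four "undefined symbol" errors above suggest that yasm did not treat `%i` as a preprocessor substitution. It appears to have parsed `%` as the modulo operator, leaving `m`, `i`, `in` and `iq` as plain symbols. In NASM syntax, the value of an %assign'ed counter is normally pasted into a name with the `%+` operator. Below is a minimal standalone sketch of that idiom, not the code from the patch. The %defines stand in for what x86inc.asm would provide, and the macro name LOAD_AND_SHIFT, the register assignments and the use of m7 for the shift count are all assumptions made for illustration.

```nasm
; Standalone sketch (no x86inc.asm): every name below is illustrative.
BITS 64
%define m0   xmm0
%define m1   xmm1
%define m2   xmm2
%define m3   xmm3
%define m7   xmm7              ; hypothetical register holding the shift count
%define in0q rdi               ; base pointer of channel 0
%define in1q rsi               ; offsets of the other channels (assumed layout)
%define in2q rdx
%define in3q rcx

%macro LOAD_AND_SHIFT 1        ; %1 = channel count (2..4 in this sketch)
    movdqa m0, [in0q]          ; channel 0 has no offset register
%assign i 1
%rep %1-1
    ; "m %+ i" pastes to m1, m2, ...; "in %+ i %+ q" pastes to in1q, in2q, ...
    movdqa m %+ i, [in0q + in %+ i %+ q]
%assign i i+1
%endrep
%assign i 0
%rep %1
    pslld  m %+ i, m7          ; shift every loaded channel left by the same count
%assign i i+1
%endrep
%endmacro

SECTION .text
LOAD_AND_SHIFT 4
```

Note that the loads and shifts stay separate instructions here, because the SSE2 form of pslld takes no memory source with a separate destination.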

