[FFmpeg-devel] [PATCH] 'vorbis_residue_decode' optimizations

Loren Merritt lorenm
Tue Sep 9 13:47:28 CEST 2008


On Tue, 9 Sep 2008, Siarhei Siamashka wrote:
> On Wednesday 03 September 2008, Michael Niedermayer wrote:
>>
>> This could be added as a SHOW_CONST_UBITS.
>> Also, gcc should be able to build the mask itself at compile time as long as
>> no asm shift tricks are used.
>
> Sure. The only problem is that it would be nice to use the same macro for both
> constant and non-constant expressions. Adding one more macro does not add much
> convenience, because the compiler can't automatically choose between inserting
> a constant and using the asm shift trick. Or can it?

__builtin_constant_p
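
A minimal sketch of what that could look like, assuming GCC (or a compiler
with the same builtin). LOW_BITS, LOW_BITS_CONST and low_bits_var are
hypothetical names for illustration, not the actual get_bits macros, and the
run-time path is plain C standing in for wherever an asm shift trick would go:

```c
#include <assert.h>
#include <stdint.h>

/* Run-time path: hypothetical placeholder for the variant that would
 * use an asm shift trick. Valid for 1 <= n <= 31. */
static inline uint32_t low_bits_var(uint32_t x, unsigned n)
{
    assert(n >= 1 && n <= 31);
    return x & ((1u << n) - 1u);
}

/* Constant path: the mask is a plain constant expression, so the
 * compiler can fold it at compile time. Also only valid for n <= 31. */
#define LOW_BITS_CONST(x, n) ((uint32_t)(x) & ((1u << (n)) - 1u))

/* One macro for both cases: __builtin_constant_p(n) does not evaluate n,
 * it only reports whether the compiler knows n to be a constant. */
#define LOW_BITS(x, n)                          \
    (__builtin_constant_p(n)                    \
         ? LOW_BITS_CONST(x, n)                 \
         : low_bits_var((uint32_t)(x), (n)))
```

With a literal count, e.g. LOW_BITS(cache, 8), the first branch is selected
and the mask folds to 0xff; with a variable count the call goes to
low_bits_var(). Both branches return the same value, so callers do not need
to care which one was picked.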

> Some basic SSE optimizations are added, most likely they still can be
> improved.

You could try decoding residual in channel-interleaved order, so that
consecutive codebook entries are consecutive in decoded memory. The SIMD
savings might be worth an extra copy to deinterleave afterward.
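
Roughly, in scalar C (a sketch only; add_vector_interleaved and deinterleave
are made-up names, not functions from the decoder, and the real loops would
be the SIMD candidates):

```c
#include <assert.h>
#include <stddef.h>

/* Add one decoded codebook vector of `dim` floats to the interleaved
 * residue buffer at offset `pos`. The writes are contiguous, which is
 * what makes this loop easy to vectorize. Returns the next offset. */
static size_t add_vector_interleaved(float *interleaved, size_t pos,
                                     const float *vec, size_t dim)
{
    assert(interleaved && vec);
    for (size_t i = 0; i < dim; i++)
        interleaved[pos + i] += vec[i];
    return pos + dim;
}

/* The extra copy: split [c0 c1 c0 c1 ...] into one array per channel.
 * `samples` is the number of samples per channel. */
static void deinterleave(float *const *dst, const float *src,
                         size_t channels, size_t samples)
{
    assert(dst && src && channels > 0);
    for (size_t s = 0; s < samples; s++)
        for (size_t c = 0; c < channels; c++)
            dst[c][s] = src[s * channels + c];
}
```

The decode loop then only ever appends to one flat buffer, and the
per-channel scatter happens once at the end instead of once per codebook
entry.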

Better yet, but more complex, would be to decode residual in
channel-interleaved order and not deinterleave at all. That would reduce the
number of shuffles in MDCT/FFT (for 2 or 4 channels), but would require new
FFT asm.

--Loren Merritt



