[FFmpeg-devel] [PATCH 09/10] lavc/flacenc: add sse4 version of the 32-bit lpc encoder

James Almer jamrial at gmail.com
Wed Feb 12 23:33:20 CET 2014


On 12/02/14 11:55 AM, James Darnley wrote:
> On 2014-02-12 07:49, Clément Bœsch wrote:
>> On Wed, Feb 12, 2014 at 12:11:21AM +0100, James Darnley wrote:
>>> From 1.3 to 2.5 times faster.  Runtime reduced by 4 to 58%.  As with the
>>> 16-bit version the speed-up generally increases with compression_level.
>>>
>>> Also like the 16-bit version, it is not used with levels less than 3.
>>> ---
>>>  libavcodec/x86/flac_dsp_gpl.asm |   97 +++++++++++++++++++++++++++++++++++++++
>>>  libavcodec/x86/flacdsp_init.c   |    5 ++
>>>  2 files changed, 102 insertions(+), 0 deletions(-)
>>>
>>> diff --git a/libavcodec/x86/flac_dsp_gpl.asm b/libavcodec/x86/flac_dsp_gpl.asm
>>> index 9e9249a..e36c76b 100644
>>> --- a/libavcodec/x86/flac_dsp_gpl.asm
>>> +++ b/libavcodec/x86/flac_dsp_gpl.asm
>>> @@ -22,6 +22,14 @@
>>>  
>>>  %include "libavutil/x86/x86util.asm"
>>>  
>>> +SECTION_RODATA
>>> +
>>> +pd_0_int_min: times  2 dd 0, -2147483648
>>> +pq_int_min:   times  2 dq -2147483648
>>> +pq_int_max:   times  2 dq  2147483647
>>> +
>>> +SECTION .text
>>> +
>>>  INIT_XMM sse4
>>>  %if ARCH_X86_64
>>>      cglobal flac_enc_lpc_16, 6, 8, 8, 0, res, smp, len, order, coefs, shift
>>> @@ -89,3 +97,92 @@ movd m3, shiftmp
>>>      sub lenmp, (3*mmsize)/4
>>>  jg .looplen
>>>  RET
>>> +
>>> +%macro PMINSQ 3
>>> +    mova    %3, %2
>>> +    pcmpgtq %3, %1
>>
>> pcmpgtq %3, %2, %1
> 
> I can certainly change that but it won't have any useful effect without
> a version of the function that allows instructions with 3 operands.
> 
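
For reference, here is a minimal scalar sketch in C of the point being discussed. It models one 64-bit lane only. The two compare helpers mirror the quoted two-operand sequence (mova followed by pcmpgtq) and the suggested three-operand spelling; in a non-AVX build both should produce the same mask. The final select step in model_pminsq is an assumption, because the rest of the PMINSQ macro is cut off in the quote above. All function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* pcmpgtq on a single 64-bit lane: all ones if a > b (signed), else zero. */
static uint64_t model_pcmpgtq(int64_t a, int64_t b)
{
    return a > b ? UINT64_MAX : 0;
}

/* Two-operand form as quoted in the patch: copy first, then compare in place. */
static uint64_t cmp_two_operand(int64_t x1, int64_t x2)
{
    int64_t tmp = x2;               /* mova    %3, %2 */
    return model_pcmpgtq(tmp, x1);  /* pcmpgtq %3, %1 */
}

/* Three-operand spelling suggested in the review: pcmpgtq %3, %2, %1 */
static uint64_t cmp_three_operand(int64_t x1, int64_t x2)
{
    return model_pcmpgtq(x2, x1);
}

/* ASSUMPTION: the macro finishes with a mask-based select (and/andnot/or).
 * The quoted diff does not show this part. */
static int64_t model_pminsq(int64_t x1, int64_t x2)
{
    uint64_t mask = cmp_two_operand(x1, x2);   /* ones where x2 > x1 */

    /* Both spellings must yield the same mask. */
    assert(mask == cmp_three_operand(x1, x2));

    /* Keep x1 where x2 is larger, otherwise keep x2. */
    return (int64_t)(((uint64_t)x1 & mask) | ((uint64_t)x2 & ~mask));
}
```

For example, model_pminsq(-5, 3) and model_pminsq(3, -5) both select -5 in this model.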

You have a pmuldq + paddq pair that can be replaced with a single vpmacsdql
once my FMA_INSTR is committed. That will give a big boost on BD/PD/SR
(Bulldozer/Piledriver/Steamroller) if an XOP version is added.
Such a version will benefit from any three-operand SSE* instruction as well.
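
To illustrate the replacement, here is a minimal scalar sketch in C, not the actual assembly. It models a 128-bit register as two 64-bit lanes and shows that pmuldq followed by paddq gives the same result as one multiply-accumulate of the low signed dwords into qwords, which is what vpmacsdql does. The type vec2q and all function names are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of one 128-bit register viewed as two 64-bit lanes. */
typedef struct { int64_t q[2]; } vec2q;

/* Sign-extended low 32 bits of a 64-bit lane. */
static int64_t low32(int64_t q)
{
    return (int32_t)(uint32_t)q;
}

/* pmuldq: multiply the low signed dwords of each lane, 64-bit result. */
static vec2q model_pmuldq(vec2q a, vec2q b)
{
    vec2q r;
    for (int i = 0; i < 2; i++)
        r.q[i] = low32(a.q[i]) * low32(b.q[i]);
    return r;
}

/* paddq: wrapping 64-bit add per lane (done in unsigned to avoid UB). */
static vec2q model_paddq(vec2q a, vec2q b)
{
    vec2q r;
    for (int i = 0; i < 2; i++)
        r.q[i] = (int64_t)((uint64_t)a.q[i] + (uint64_t)b.q[i]);
    return r;
}

/* vpmacsdql: low signed dword multiply plus 64-bit accumulate, one step. */
static vec2q model_vpmacsdql(vec2q a, vec2q b, vec2q acc)
{
    vec2q r;
    for (int i = 0; i < 2; i++)
        r.q[i] = (int64_t)((uint64_t)(low32(a.q[i]) * low32(b.q[i])) +
                           (uint64_t)acc.q[i]);
    return r;
}

/* Check that the two-instruction and one-instruction paths agree. */
static void check_equivalence(vec2q a, vec2q b, vec2q acc)
{
    vec2q two = model_paddq(model_pmuldq(a, b), acc);
    vec2q one = model_vpmacsdql(a, b, acc);
    for (int i = 0; i < 2; i++)
        assert(two.q[i] == one.q[i]);
}
```

In this model the saving is one instruction per multiply-accumulate; how much that helps in practice depends on the CPU and is not something the sketch can show.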

> I'm sure there are a few other places that could benefit from this and
> maybe "new" instructions as well.  I just need to grok the instructions
> and then identify where they might be useful.
> 