[FFmpeg-devel] [PATCH] Fix SSE code to not use SSE2.

Jason Garrett-Glaser jason at x264.com
Wed Mar 7 20:13:08 CET 2012


On Wed, Mar 7, 2012 at 10:47 AM, Reimar Döffinger
<Reimar.Doeffinger at gmx.de> wrote:
> On Wed, Mar 07, 2012 at 08:13:55AM -0800, Jason Garrett-Glaser wrote:
>> On Tue, Mar 6, 2012 at 11:27 PM, Reimar Döffinger
>> <Reimar.Doeffinger at gmx.de> wrote:
>> > On 6 Mar 2012, at 22:49, Jason Garrett-Glaser <jason at x264.com> wrote:
>> >> On Tue, Mar 6, 2012 at 1:11 PM, Reimar Döffinger
>> >> <Reimar.Doeffinger at gmx.de> wrote:
>> >>> movq from SSE register _to_ memory is an SSE2 instruction.
>> >>> Use the SSE movlps instruction instead, which does the same thing.
>> >>>
>> >>> Signed-off-by: Reimar Döffinger <Reimar.Doeffinger at gmx.de>
>> >>> ---
>> >>>  libavcodec/x86/sbrdsp.asm |    2 +-
>> >>>  1 files changed, 1 insertions(+), 1 deletions(-)
>> >>>
>> >>> diff --git a/libavcodec/x86/sbrdsp.asm b/libavcodec/x86/sbrdsp.asm
>> >>> index c165c52..c3b559b 100644
>> >>> --- a/libavcodec/x86/sbrdsp.asm
>> >>> +++ b/libavcodec/x86/sbrdsp.asm
>> >>> @@ -104,7 +104,7 @@ cglobal sbr_hf_g_filt, 5, 6, 5
>> >>>     movq        m2, [r1]
>> >>>     punpckldq   m0, m0
>> >>
>> >> These look pretty SSE2 to me, too.
>> >
>> > Unfortunately that depends on the specific opcode chosen; they all have SSE equivalents after all.
>>
>> What are you talking about?  punpckldq does not have an "SSE equivalent".
>
> unpcklps seems to me to do the same thing?
> While differing in the wording, the descriptions of both instructions
> seem to say exactly the same thing (going after AMD manual).

Oh, that's true, but then why don't you use the right instruction?
punpckldq will not be magically converted by the compiler into
unpcklps.

Using integer instructions on float data is bad to begin with when it
can be avoided; it adds latency on many chips.
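
For reference, the equivalence discussed above comes down to lane movement:
in their 128-bit forms, punpckldq (SSE2, integer domain) and unpcklps (SSE,
float domain) both interleave the low two 32-bit lanes of destination and
source. A minimal scalar sketch of that in C follows. The function names are
hypothetical and not part of FFmpeg, and the model says nothing about which
execution domain an instruction runs in.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Scalar model of the 128-bit low unpack performed by both punpckldq and
 * unpcklps: result = { a0, b0, a1, b1 }. Lanes are opaque 32-bit patterns. */
void unpack_low_32(uint32_t dst[4], const uint32_t a[4], const uint32_t b[4])
{
    /* Build in a temporary so dst may alias a or b. */
    uint32_t r[4] = { a[0], b[0], a[1], b[1] };
    memcpy(dst, r, sizeof r);
}

/* The same shuffle applied to floats: bits are moved, never converted. */
void unpack_low_ps(float dst[4], const float a[4], const float b[4])
{
    uint32_t ia[4], ib[4], ir[4];
    memcpy(ia, a, sizeof ia);
    memcpy(ib, b, sizeof ib);
    unpack_low_32(ir, ia, ib);
    memcpy(dst, ir, sizeof ir);
}

/* Models the "m0, m0" case from the patch context: with both operands the
 * same register, the low pair of floats is duplicated lane by lane. */
void check_duplicate_low_pair(void)
{
    const float m0[4] = { 1.0f, 2.0f, 3.0f, 4.0f };
    float out[4];
    unpack_low_ps(out, m0, m0);
    assert(out[0] == 1.0f && out[1] == 1.0f);
    assert(out[2] == 2.0f && out[3] == 2.0f);
}
```

Because the result is the same bit pattern either way, choosing between the
two encodings is a question of instruction-set requirement and execution
domain, not of output.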

Jason

