[FFmpeg-devel] [PATCH] av_filter/x86/idet: MMX/SSE2 implementation of 16bits filter_line()

James Almer jamrial at gmail.com
Tue Sep 9 19:31:32 CEST 2014


On 09/09/14 9:52 AM, Pascal Massimino wrote:
> +    mova      m2, m_sum
> +%if mmsize == 16
> +    psrldq    m2, 4
> +    paddd     m_sum, m2
> +    psrldq    m2, 4
> +    paddd     m_sum, m2
> +    psrldq    m2, 4
> +    paddd     m_sum, m2
> +%else
> +    psrlq     m2, 32
> +    paddd     m_sum, m2
> +%endif

The SSE2 version uses three more instructions than necessary here.
You could replace the code above with the HADDD macro, which expands
to a shorter SSE2 sequence.
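
To illustrate the difference, here is a rough C sketch with SSE2 intrinsics.
It is not the asm from the patch, and the second function is only my
assumption of the kind of fold a shorter horizontal add performs (high half
onto low half, then lane 1 onto lane 0), not the literal HADDD expansion.
The names hsum_shift and hsum_fold are made up for this example.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Same shape as the quoted SSE2 branch: one copy, then three
 * byte-shift/add pairs, so lane 0 ends up holding s0+s1+s2+s3. */
static int32_t hsum_shift(__m128i sum)
{
    __m128i t = sum;                    /* mova   m2, m_sum */
    t   = _mm_srli_si128(t, 4);         /* psrldq m2, 4     */
    sum = _mm_add_epi32(sum, t);        /* paddd  m_sum, m2 */
    t   = _mm_srli_si128(t, 4);
    sum = _mm_add_epi32(sum, t);
    t   = _mm_srli_si128(t, 4);
    sum = _mm_add_epi32(sum, t);
    return _mm_cvtsi128_si32(sum);      /* read lane 0 */
}

/* Assumed shorter reduction: two shuffle/add pairs instead of
 * a copy plus three shift/add pairs. */
static int32_t hsum_fold(__m128i sum)
{
    /* Bring the high 64 bits (s2, s3) down next to (s0, s1). */
    __m128i t = _mm_unpackhi_epi64(sum, sum);
    sum = _mm_add_epi32(sum, t);        /* lanes 0,1 = s0+s2, s1+s3 */
    /* Swap the two 32-bit lanes of the low half (pshuflw). */
    t   = _mm_shufflelo_epi16(sum, _MM_SHUFFLE(1, 0, 3, 2));
    sum = _mm_add_epi32(sum, t);        /* lane 0 = s0+s1+s2+s3 */
    return _mm_cvtsi128_si32(sum);
}
```

Both functions return the same lane-0 total for any input; the second one
just gets there with four operations instead of seven.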

And now that I look at the old code again, you could also use it in the
IDET_FILTER_LINE macro. That would save one instruction in the mmxext
version.

