[FFmpeg-devel] [PATCH] av_filter/x86/idet: MMX/SSE2 implementation of 16bits filter_line()

Pascal Massimino pascal.massimino at gmail.com
Tue Sep 9 21:58:31 CEST 2014


James,

On Tue, Sep 9, 2014 at 10:31 AM, James Almer <jamrial at gmail.com> wrote:

> On 09/09/14 9:52 AM, Pascal Massimino wrote:
> > +    mova      m2, m_sum
> > +%if mmsize == 16
> > +    psrldq    m2, 4
> > +    paddd     m_sum, m2
> > +    psrldq    m2, 4
> > +    paddd     m_sum, m2
> > +    psrldq    m2, 4
> > +    paddd     m_sum, m2
> > +%else
> > +    psrlq     m2, 32
> > +    paddd     m_sum, m2
> > +%endif
>
> The SSE2 version is using three instructions more than necessary here.
> You could use the HADDD macro to replace the code above, which expands
> to a more optimized SSE2 version.
>
> And now that I check the old stuff again, you could also use it in the
> IDET_FILTER_LINE macro. That would save one instruction in the mmxext
> version.
>

Oh, right! Let me send you a patch for that...
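
For readers following the thread, here is a rough scalar model in C of the two ways to sum the four dwords of an XMM register. It is only a sketch: the type vec128 and the helpers shift_right_dwords, add_dwords, hsum_linear and hsum_log are names made up for this illustration, not FFmpeg code. The exact expansion of HADDD is not reproduced here; the second function just shows the general fold-in-halves idea that lets a horizontal add finish in two shift/add steps instead of three.

```c
#include <stdint.h>

/* Hypothetical model of one 128-bit register as four dwords, d[0] lowest. */
typedef struct { uint32_t d[4]; } vec128;

/* Models psrldq by 4*n bytes: shift right by n dwords, filling with zeros. */
static vec128 shift_right_dwords(vec128 v, int n)
{
    vec128 r = {{0, 0, 0, 0}};
    for (int i = 0; i + n < 4; i++)
        r.d[i] = v.d[i + n];
    return r;
}

/* Models paddd: lane-wise 32-bit addition with wraparound. */
static vec128 add_dwords(vec128 a, vec128 b)
{
    vec128 r;
    for (int i = 0; i < 4; i++)
        r.d[i] = a.d[i] + b.d[i];
    return r;
}

/* The sequence quoted from the patch: one copy, then three shift/add pairs. */
static uint32_t hsum_linear(vec128 sum)
{
    vec128 t = sum;                    /* mova   m2, m_sum */
    for (int k = 0; k < 3; k++) {
        t   = shift_right_dwords(t, 1); /* psrldq m2, 4     */
        sum = add_dwords(sum, t);       /* paddd  m_sum, m2 */
    }
    return sum.d[0];                   /* total ends up in the low dword */
}

/* Fold-in-halves reduction: add the high half onto the low half,
 * then add dword 1 onto dword 0. Two shift/add steps instead of three. */
static uint32_t hsum_log(vec128 sum)
{
    vec128 t = shift_right_dwords(sum, 2);
    sum = add_dwords(sum, t);
    t   = shift_right_dwords(sum, 1);
    sum = add_dwords(sum, t);
    return sum.d[0];
}
```

Both functions leave the same total in the low dword; only the number of steps differs. The upper lanes hold partial sums in either case and are ignored.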

