[FFmpeg-devel] [PATCH] JPEG2000: SSE optimisation of DWT decoding

maxime taisant maximetaisant at hotmail.fr
Fri Aug 11 15:47:24 EEST 2017


> From: Maxime Taisant <maximetaisant at hotmail.fr>
> 
> > From: Ivan Kalvachev <ikalvachev at gmail.com>
> >
> > On 8/8/17, maxime taisant <maximetaisant at hotmail.fr> wrote:
> > > From: Maxime Taisant <maximetaisant at hotmail.fr>
> > >
> > > +    movups m2, [lineq+2*j0q-24]
> > > +    movups m5, [lineq+2*j0q-8]
> > > +    shufps m2, m5, 0xDD
> > > +    addps m2, m1
> > > +    mulps m2, m3
> > > +    subps m0, m2
> > > +    movups m4, m1
> > > +    shufps m1, m0, 0x44 ; 0100'0100 q1010
> > Is that movlhps m1, m0 ?
> 
> No, this instruction places the first two values of m1 in the last two
> doublewords of m1, and the first two values of m0 in the first two
> doublewords of m1.
> Movhlps would simply replace the first two values of m1 with those
> of m0.
> 

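Ok, so everything I said here is wrong...
You were right, that IS movlhps m1, m0: with the immediate 0x44, shufps
keeps the two low floats of m1 and copies the two low floats of m0 into
the two high floats of m1, which is exactly what movlhps does.
Will change that, sorry.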
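
For anyone who wants to check the equivalence without a debugger, here is a
minimal scalar sketch in C. It only models the lane selection of the two
instructions as documented for SSE; the names vec4, model_shufps,
model_movlhps and check_movlhps_equivalence are hypothetical and do not come
from the patch or from FFmpeg.

```c
#include <assert.h>
#include <string.h>

/* Scalar stand-in for one 128-bit SSE register: four 32-bit floats,
 * f[0] being the lowest lane. (Hypothetical type, for illustration only.) */
typedef struct { float f[4]; } vec4;

/* Model of `shufps dst, src, imm8`: the two low result lanes are selected
 * from dst and the two high result lanes from src, two selector bits each. */
static vec4 model_shufps(vec4 dst, vec4 src, unsigned imm8)
{
    vec4 r;
    r.f[0] = dst.f[(imm8 >> 0) & 3];
    r.f[1] = dst.f[(imm8 >> 2) & 3];
    r.f[2] = src.f[(imm8 >> 4) & 3];
    r.f[3] = src.f[(imm8 >> 6) & 3];
    return r;
}

/* Model of `movlhps dst, src`: the low half of src is copied into the
 * high half of dst; the low half of dst is left untouched. */
static vec4 model_movlhps(vec4 dst, vec4 src)
{
    dst.f[2] = src.f[0];
    dst.f[3] = src.f[1];
    return dst;
}

/* Bitwise comparison of two modelled registers. */
static int vec4_equal(vec4 a, vec4 b)
{
    return memcmp(a.f, b.f, sizeof a.f) == 0;
}

/* Checks that imm8 = 0x44 (binary 01'00'01'00) makes shufps behave
 * like movlhps for one sample pair of inputs. */
static void check_movlhps_equivalence(void)
{
    vec4 m1 = {{1.0f, 2.0f, 3.0f, 4.0f}};
    vec4 m0 = {{5.0f, 6.0f, 7.0f, 8.0f}};

    assert(vec4_equal(model_shufps(m1, m0, 0x44), model_movlhps(m1, m0)));
}
```

Calling check_movlhps_equivalence() from a small main() should pass silently.
Since 0x44 selects lanes 0 and 1 from each operand, the result does not
depend on the sample values chosen here.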


