[FFmpeg-devel] [PATCH 1/9] SBR DSP x86: implement SSE qmf_pre_shuffle
christophe.gisquet at gmail.com
Sat Apr 6 15:09:19 CEST 2013
But I can understand the reasoning. I have already seen complex addressing
slow things down noticeably. I am not knowledgeable enough to predict such
cases, so I just test them at random.
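
For illustration only, the two addressing styles compared in the quoted
discussion below can be sketched in C. This is not the code from the patch:
the function names and the scaling operation are hypothetical stand-ins, and
a C compiler is free to turn either form into the other, so the sketch only
shows the idea that the hand-written assembly controls explicitly.

```c
#include <stddef.h>

/* Hypothetical helper: one shared index drives both arrays. In assembly
 * this corresponds to base+index addressing, e.g. [dst + i*4] and
 * [src + i*4], with a single counter to update per iteration. */
static void scale_indexed(float *dst, const float *src, size_t n, float k)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i] * k;
}

/* Hypothetical helper: each pointer is advanced on its own. In assembly
 * this corresponds to plain [reg] addressing, at the cost of one extra
 * increment per pointer in every iteration. */
static void scale_pointers(float *dst, const float *src, size_t n, float k)
{
    const float *end = src + n;
    while (src < end)
        *dst++ = *src++ * k;
}
```

Both functions produce the same result; which one is faster in assembly
depends on the CPU and the instruction encoding, which is why measuring is
the only reliable way to decide.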
On 6 Apr 2013 14:49, "Michael Niedermayer" <michaelni at gmx.at> wrote:
> On Sat, Apr 06, 2013 at 11:26:54AM +0200, Christophe Gisquet wrote:
> > 2013/4/5 Michael Niedermayer <michaelni at gmx.at>:
> > > using simpler memory indexing ([r2q + n*mmsize] and [zq])
> > > and incrementing them separately seems 1-2 CPU cycles faster here
> > In general, and most particularly here, could you provide whatever
> > form (except machine code ;) of the code you tested?
> it seems the speed gain of this change depends on using the slower
> SSE variants of the instructions.
> I suspect the extra complexity of 2 separate ways of indexing isn't
> worth the gain
> Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
> You can kill me, but you cannot change the truth.