[FFmpeg-devel] [PATCH] SPARC VIS simple_idct try#7
Wed Aug 29 23:37:19 CEST 2007
On Wednesday 29 August 2007 at 00:13, Michael Niedermayer wrote:
> On Tue, Aug 28, 2007 at 10:38:23PM +0200, Balatoni Denes wrote:
> > > > you are forgetting that theres also 25% between the horizontal and
> > > > vertical idcts which can be reused with no store/load and no changes
> > > > to the registers
> > >
> > > Indeed, I didn't take that into account. So if I fix that 25% and the
> > > clamping part, will you accept the patch?
> > Better yet: that would be 4 instructions. How about I gain 4 clocks in
> > some other way instead - how, let it be my secret. Okay?
> hmm no but you have to do that secret optimization too now at minimum for
> it to be considered for svn
> let me remind you, code has to be optimal to be accepted
> ill investigate the register shortage vs. avoidable load/stores vs. latency
> after (the unlikely) case that you do correct the undisputed
Here is a new patch. I fixed all "undisputed suboptimalities". I also
eliminated many adds, as you suggested before, because I found that gcc
optimized away all unneeded prologue and epilogue code around the asm block.
I also eliminated the temporary 128 byte storage where it is not needed.
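
To illustrate the point about the temporary storage: an 8x8 block of 16-bit
coefficients is 64 * 2 = 128 bytes, which is the size of a scratch buffer
placed between the row pass and the column pass. The following is only a
minimal C sketch of that idea, not the code from the patch. The names
transform_1d, transform_2d_temp and transform_2d_inplace are hypothetical,
and the 1-D step is a plain 8-point butterfly standing in for the real IDCT.
The VIS version keeps intermediate data in registers, which plain C cannot
express directly.

```c
#include <stdint.h>

/* Hypothetical stand-in for the real 1-D IDCT: an unnormalized 8-point
   butterfly, chosen only because it is short. It copies its input into a
   local array first, so src and dst may point to the same memory. */
static void transform_1d(const int16_t *src, int16_t *dst, int stride)
{
    int16_t v[8];

    for (int i = 0; i < 8; i++)
        v[i] = src[i * stride];

    for (int len = 1; len < 8; len *= 2)
        for (int i = 0; i < 8; i += 2 * len)
            for (int j = i; j < i + len; j++) {
                int16_t a = v[j], b = v[j + len];
                v[j]       = (int16_t)(a + b);
                v[j + len] = (int16_t)(a - b);
            }

    for (int i = 0; i < 8; i++)
        dst[i * stride] = v[i];
}

/* Variant A: the row pass writes into a separate scratch block of
   64 x int16_t = 128 bytes, and the column pass reads it back. */
static void transform_2d_temp(int16_t block[64])
{
    int16_t tmp[64];

    for (int r = 0; r < 8; r++)
        transform_1d(block + 8 * r, tmp + 8 * r, 1);
    for (int c = 0; c < 8; c++)
        transform_1d(tmp + c, block + c, 8);
}

/* Variant B: both passes work in place on the block, so the 128-byte
   scratch block is not needed. */
static void transform_2d_inplace(int16_t block[64])
{
    for (int r = 0; r < 8; r++)
        transform_1d(block + 8 * r, block + 8 * r, 1);
    for (int c = 0; c < 8; c++)
        transform_1d(block + c, block + c, 8);
}
```

Both variants give the same result for the same input; the second one just
avoids the extra store and load of the whole block.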
-------------- next part --------------
A non-text attachment was scrubbed...
Size: 21337 bytes
Desc: not available