[FFmpeg-devel] [PATCH] SPARC VIS simple_idct try#7

Balatoni Denes dbalatoni
Wed Aug 29 23:37:19 CEST 2007


Hi!

On Wednesday 29 August 2007 at 00:13, Michael Niedermayer wrote:
> On Tue, Aug 28, 2007 at 10:38:23PM +0200, Balatoni Denes wrote:
> > > > you are forgetting that there's also 25% between the horizontal and
> > > > vertical idcts which can be reused with no store/load and no changes
> > > > to the registers
> > >
> > > Indeed, I didn't take that into account. So if I fix that 25% and the
> > > clamping part, will you accept the patch?
> >
> > Better yet: that would be 4 instructions. How about I gain 4 clocks in
> > some other way instead - how, let it be my secret. Okay?
>
> hmm no but you have to do that secret optimization too now at minimum for
> it to be considered for svn
>
> let me remind you, code has to be optimal to be accepted
>
> I'll investigate the register shortage vs. avoidable load/stores vs. latency
> after (the unlikely) case that you do correct the undisputed
> suboptimalities

Here is a new patch. I fixed all the "undisputed suboptimalities". I also
eliminated many adds, as you suggested before, because I found that gcc
optimizes away all the unneeded prologue and epilogue code around the asm block.
I also eliminated the temporary 128-byte storage where it is not needed.
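
To illustrate the temporary-storage point: an 8x8 block of 16-bit
coefficients is 64 * 2 = 128 bytes, which is presumably the buffer size
meant above. The following is only a minimal sketch in plain C, not the VIS
code from the patch. The names pass_1d, transform_with_temp and
transform_in_place are hypothetical, and a toy sum/difference butterfly
stands in for the real 1-D IDCT kernel. It shows a row pass followed by a
column pass, once through a 128-byte temporary and once writing back into
the block itself.

```c
#include <stdint.h>

/* Toy 1-D pass (hypothetical stand-in for a real IDCT row/column kernel).
 * All 8 inputs are read into locals before anything is written, so src
 * and dst may point to the same memory. */
static void pass_1d(const int16_t *src, int16_t *dst, int stride)
{
    int16_t t[8];
    for (int i = 0; i < 4; i++) {
        int a = src[i * stride];
        int b = src[(7 - i) * stride];
        t[i]     = (int16_t)(a + b);
        t[7 - i] = (int16_t)(a - b);
    }
    for (int i = 0; i < 8; i++)
        dst[i * stride] = t[i];
}

/* Row pass into a temporary, then column pass back into the block. */
static void transform_with_temp(int16_t block[64])
{
    int16_t tmp[64];                      /* 64 * 2 bytes = 128 bytes */
    for (int r = 0; r < 8; r++)
        pass_1d(block + 8 * r, tmp + 8 * r, 1);
    for (int c = 0; c < 8; c++)
        pass_1d(tmp + c, block + c, 8);
}

/* Same two passes, but each pass writes over its own input,
 * so no separate 128-byte buffer is needed. */
static void transform_in_place(int16_t block[64])
{
    for (int r = 0; r < 8; r++)
        pass_1d(block + 8 * r, block + 8 * r, 1);
    for (int c = 0; c < 8; c++)
        pass_1d(block + c, block + c, 8);
}
```

Both variants compute the same result; the second only avoids the extra
buffer. In an asm implementation the intermediate values could also stay in
registers between the two passes, which this C sketch does not model.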

> [...]

bye
Denes
-------------- next part --------------
A non-text attachment was scrubbed...
Name: simple_idct_vis_try7.diff
Type: text/x-diff
Size: 21337 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20070829/d16cd646/attachment.diff>


