[FFmpeg-devel] [PATCH] unscaled float 2 int conversion
Thu May 15 23:25:30 CEST 2008
On Thu, May 15, 2008 at 11:17:40PM +0200, Michael Niedermayer wrote:
> On Thu, May 15, 2008 at 09:14:15PM +0200, Benjamin Larsson wrote:
> > Michael Niedermayer wrote:
> > >> Well, when I tried the last time I didn't get it to work; there was some
> > >> overlap issue that wasn't trivial to sort out.
> > >
> > > You just add 384 or what it was after the windowing/overlap.
> > >
> > Just to be clear, this bias scale thing is about not having to use the
> > fistp fpu instruction, or whatever it is called on other cpus. To perform it
> > you first scale your samples down to the range -1 to 1. This scaling
> > operation is most often performed for free by scaling a suitable table
> > somewhere. Then you add 384 so that the low bits of the float's bit pattern
> > can be read directly as an integer.
> > So you trade a float add against fistp, and the add must have been faster on
> > some cpu (or else they wouldn't have used it).
> > In FFmpeg we also have 3dnow, sse and altivec code that can do float to
> > int16 conversion. I think we can agree that the simd code is faster than
> > the bias trick on all processors that support the simd code. Then we
> > are left with Intel cpus before P3, the Motorola G3 and various other
> > cpus with only fpus and no simd unit. I'm pretty sure that this trick is
> > the best when we are dealing with P2 cpus and lower but I'm not sure it
> > is for the G3.
> > So then we come to the matter of performance, you want benchmarks to
> > justify changing or adding a new scaling method. As I don't have access
> > to any machine that doesn't have a simd unit I can't do any usable
> > benchmarks. But I'm quite sure that if I had access it would show that
> > doing the bias trick would be faster. So one could argue that, well, ok,
> > then we keep the code as it is. But my opinion is that we should scrap
> > this anyway: it makes the code complex, it slows down the simd code
> > (very little though) for no good reason, and it complicates the development
> > of a proper audio api and filter system. Cpus with slow fpus should use
> > fixed point code instead.
> > So I propose that we start cleaning this out.
> Ohh well, why do i always have to do the work? You could have saved me
> some time by just saying that you won't do the benchmarks.
> PS: yes i don't give a damn what you or anyone else thinks, either
> i see benchmarks or people can go talking to their next wall.
> It would have taken you less time to disable MMX*/SSE* and write
> a benchmark than explaining why it's better not to.
all benchmarks were from a duron, not a P3. i'll go and get a new brain ...
and retry ...
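
For readers unfamiliar with the bias trick discussed in the quoted text, here
is a minimal sketch of the idea, not the code FFmpeg actually ships. It assumes
IEEE-754 single precision and an input already scaled to [-1.0, 1.0). The
function name bias_float_to_int16 is hypothetical. Adding 384 pins the exponent
at 2^8, so one unit in the last place equals 1/32768 and the low 16 bits of the
float's bit pattern hold the sample as a two's-complement int16.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper illustrating the bias trick; not FFmpeg's actual code.
 * Assumes IEEE-754 single precision and x in [-1.0, 1.0). */
static int16_t bias_float_to_int16(float x)
{
    float biased;
    uint32_t bits;

    /* Out-of-range input would change the exponent or wrap around;
     * real code would need a clamp here instead of an assert. */
    assert(x >= -1.0f && x < 1.0f);

    /* [-1, 1) + 384 lands in [383, 385), where every float has exponent 2^8
     * and the mantissa step is 2^-15 = 1/32768. The add also rounds to the
     * nearest representable step. */
    biased = x + 384.0f;

    /* Reinterpret the bit pattern; memcpy avoids strict-aliasing problems. */
    memcpy(&bits, &biased, sizeof bits);

    /* 384.0f is 0x43C00000, so the low 16 bits are x * 32768 modulo 65536.
     * The narrowing cast relies on the usual two's-complement wraparound. */
    return (int16_t)(bits & 0xFFFFu);
}
```

For example, 0.5f becomes 384.5f, whose bit pattern is 0x43C04000, and the low
16 bits give 16384. Inputs that round up to exactly 385.0 would wrap to -32768,
which is one reason a real implementation needs clipping.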
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
Republics decline into democracies and democracies degenerate into
despotisms. -- Aristotle