[FFmpeg-devel] Review request - ra288.{c,h} ra144.{c,h}

Michael Niedermayer michaelni
Wed Sep 17 02:04:52 CEST 2008


On Wed, Sep 17, 2008 at 02:48:41AM +0300, Siarhei Siamashka wrote:
> On Wednesday 17 September 2008, Michael Niedermayer wrote:
> > On Wed, Sep 17, 2008 at 12:06:31AM +0300, Siarhei Siamashka wrote:
> > > On Tuesday 16 September 2008, Vitor Sessak wrote:
> > > > Siarhei Siamashka wrote:
> >
> > [...]
> >
> > > > > You can try experimenting with compression of a simple signal,
> > > > > something like sine and check if you can see similarities in the
> > > > > input and output files. And then change this signal to something more
> > > > > complicated until the differences start getting more visible.
> > > >
> > > > I'm not sure that it is worth the trouble just to decide on using
> > > > lrintf() or not...
> > >
> > > I see. Anyway, I don't like lrintf, it's slow, not quite portable
> > > (depends on
> >
> > Do you have a benchmark comparing lrintf() with an int cast that confirms
> > this? (with -O3 -fno-math-errno, of course)
> > Because last time I checked, lrintf() was faster, but of course that's just
> > on x86.
> 
> Where did I suggest to use int cast instead of lrintf? That's basically all
> the answer.

Then I misunderstood you. The code was using an int cast, and I felt you
objected to my suggestion of changing it to lrintf().


> 
> > > global rounding settings) and it is only useful on very old x86 systems
> > > (for which 'ff_float_to_int16_c' exists). Also as far as I know, SIMD
> > > instructions at least from 3DNOW and ARM NEON only efficiently support
> > > conversion with rounding to zero (please correct me if I'm wrong). And
> > > conversion to int with
> >
> > Which compiler generates 3DNOW or NEON instructions for an int cast?
> > If none does, then I am not sure how this could be an argument for
> > preferring an int cast.
> >
> > > rounding to zero should be supported well on any hardware designed to be
> > > C language friendly.
> >
> > Well, it is not on pre-SSE(2) x86, and on later CPUs it requires the
> > compiler to generate pure SSE/SSE2 code and not use the x87 unit. Also,
> > binaries compiled with SSE2 will not run on pre-SSE2 (before P4) CPUs.
> 
> Well, the summary of my message was the following: "if you want to find an
> easy optimization target, grep ffmpeg sources for lrintf". One of such targets
> is WMA decoder, using 'float_to_int16_interleave' in it is quite trivial and
> provides a very noticeable performance improvement, there were even several
> patches floating around which can be used with minor changes.

Yes and no.
Yes, lrintf() is an easy optimization target, but it's not
float_to_int16_interleave() that should be used; instead, the floats
themselves should be returned.


> 
> It would not be very nice to use lrintf in new code and dsputil functions
> should be preferred. For the targets (if such targets exist) where lrintf is
> the fastest way to convert float to integer, the dsputil function should be
> implemented using lrintf. Otherwise it should use SIMD instructions available
> on the target system, or unrolled/pipelined sequence of instructions.

As said above, IMO the floats should be returned, but for some reason that
didn't work when Vitor tried it, so I suggested lrintf() until floats can be
returned. dsputil would IMHO be overkill, as the code is not supposed to stay
like that once float output is fixed.
If it were supposed to stay, then yes, I'd agree with you that dsputil should
be used.

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

I count him braver who overcomes his desires than him who conquers his
enemies for the hardest victory is over self. -- Aristotle