[FFmpeg-devel] [PATCH 3/3] Use DSPContext.vector_fmul() and DSPContext.vector_fmul_reverse() in floating-point version of apply_window(). 46% faster in function apply_window().

Michael Niedermayer michaelni
Tue Jan 18 16:42:08 CET 2011

On Wed, Jan 05, 2011 at 04:32:40PM -0500, Justin Ruggles wrote:
> On 01/05/2011 04:06 PM, Loren Merritt wrote:
> > On Tue, 4 Jan 2011, Justin Ruggles wrote:
> > 
> >> Currently we have vector_fmul() for: C, neon, vfp, altivec, 3dnow, sse
> >>
> >> I implemented vector_fmul_copy() for C, altivec, 3dnow, and sse to use 2
> >> src and 1 dst. The Altivec version of vector_fmul_copy() has not been
> >> tested, but I implemented it in the hope that someone else will test and
> >> review it.  Here are some benchmarks on my Athlon64.  Benchmark numbers
> >> are in dezicycles.
> >>
> >> I also tried to rewrite the current C version in SSE.  It was faster
> >> than the fmul_copy+fmul_reverse since it basically merges the 2 loops,
> >> but it was slower than vector_fmul_copy(512).  I left that out of the
> >> patch.  If anyone is interested I can send it...
> > 
> > I predict that all of the vector_fmul_* mentioned here are memory-bound on 
> > intel and arithmetic-bound on amd.
> > 
> > Is there any reason to keep both the 2-arg and 3-arg version of 
> > vector_fmul?
> 
> I tested using vector_fmul_copy with the same value for src0 and dst and it
> ended up being slower.  I thought it was weird, so I kept both versions.
> Maybe I did something wrong in my tests though...
> Also, I'll try benchmarking these on my laptop (Intel Atom 330, 64-bit
> Ubuntu).
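
For reference, the variants discussed above can be sketched in plain C as
follows. This is only an illustration, not the code from the patch: the
function names with a _sketch suffix are hypothetical, the parameter order
(dst, src0, src1, len) is an assumption based on the "2 src and 1 dst"
description, and the window is assumed to be symmetric with only its first
half stored.

```c
#include <assert.h>
#include <stddef.h>

/* 2-argument form: multiply dst in place by src (dst[i] *= src[i]). */
static void vector_fmul_inplace_sketch(float *dst, const float *src, int len)
{
    for (int i = 0; i < len; i++)
        dst[i] *= src[i];
}

/* 3-argument form (assumed signature): dst[i] = src0[i] * src1[i].
 * Passing the same pointer for dst and src0 gives the in-place behaviour. */
static void vector_fmul_copy_sketch(float *dst, const float *src0,
                                    const float *src1, int len)
{
    for (int i = 0; i < len; i++)
        dst[i] = src0[i] * src1[i];
}

/* Reversed form: src1 is read back to front,
 * dst[i] = src0[i] * src1[len - 1 - i]. */
static void vector_fmul_reverse_sketch(float *dst, const float *src0,
                                       const float *src1, int len)
{
    for (int i = 0; i < len; i++)
        dst[i] = src0[i] * src1[len - 1 - i];
}

/* Window 2*len input samples with a symmetric window of which only the
 * first len coefficients are stored (an assumption of this sketch):
 * the first half uses the window as-is, the second half uses it mirrored. */
static void apply_window_sketch(float *output, const float *input,
                                const float *window, int len)
{
    assert(output != NULL && input != NULL && window != NULL && len >= 0);
    vector_fmul_copy_sketch(output, input, window, len);
    vector_fmul_reverse_sketch(output + len, input + len, window, len);
}
```

Splitting the work into two simple loops like this is what lets each half be
handed to an existing SIMD routine. Whether that beats a single merged loop
depends on the CPU, as the benchmark discussion above suggests.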

Is there a patch left in this thread that I should review, or should I wait
for a new one?

Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Asymptotically faster algorithms should always be preferred if you have
asymptotical amounts of data