[FFmpeg-devel] [PATCH 0/3] synth filter float ASM
christophe.gisquet at gmail.com
Sat Mar 1 16:07:19 CET 2014
2014-03-01 9:35 GMT+01:00 James Almer <jamrial at gmail.com>:
> I didn't notice a performance hit from those extra movaps, but if you or others do
> then maybe it's better to keep both versions.
I have never experimented with this myself, but I've seen comments to
the effect that mixing float and integer ops can cause some delays. In
my experience, on the contrary, integer ops seemed to be speedier (more
ports?). I haven't benchmarked your code, and won't be able to until
next week. But that's not worth holding the patch up for; it's good to
go imo.
> Personally i was expecting a bigger boost than that, considering the main loop is
> being run only once in x64 and twice in x86, compared to two and four times
> respectively with SSE2. But i guess things aren't as linear as i thought.
There is an awful amount of ugly book-keeping for the loops. Brighter
people than me may have simpler solutions to it.
I don't know if this book-keeping (the pseudo variable/reg i) is still
entirely needed for 256b regs, but nothing seems to be ifdef'ed out by
your changes. So I guess a non-negligible part of the time is still
wasted on it.