[FFmpeg-devel] [PATCH] lavu/x86/lls: add fma3 optimizations for update_lls

Ganesh Ajjanagadde gajjanag at mit.edu
Thu Jan 14 15:12:18 CET 2016


On Thu, Jan 14, 2016 at 5:02 AM, Henrik Gramner <henrik at gramner.com> wrote:
> Use the x86inc syntax for FMA instructions (basically FMA4 syntax that
> gets assembled as FMA3) since normal FMA3 opcodes are horrible to
> read, nobody ever remembers the ordering of operands.

1. It is very easy to remember: take fmadd231pd x, y, z for instance.
The digits are operand positions: operand 2 times operand 3, plus
operand 1, so x = y*z+x. How the macro is more readable is beyond me,
especially with some side cases that are undocumented; see below.
2. If anything, the macro is harder: since it is not Intel supported,
I can't look it up at
https://www-ssl.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-instruction-set-reference-manual-325383.pdf.
3. The macro does not seem to take care of the movs (if any), so the
programmer still has to think about them explicitly.
4. The macro lacks documentation. In particular, it is not a thorough
fma4 emulation in the spirit of
https://gist.github.com/rygorous/22180ced9c7a00bd68dd.
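
To make points 1 and 3 concrete, here is a rough scalar sketch in C. It
is not code from the patch or from x86inc: the function names and
pick_fma3_form are hypothetical, memory operands are ignored, and plain
a*b+c is used, so the single rounding of a real FMA is not modeled.

```c
/* Scalar model of the three FMA3 operand orderings. Each function returns
 * the value that would be written to the first (destination) operand.
 * Assumption: plain a*b+c stands in for the fused operation. */
static double fmadd132(double op1, double op2, double op3) { return op1 * op3 + op2; }
static double fmadd213(double op1, double op2, double op3) { return op2 * op1 + op3; }
static double fmadd231(double op1, double op2, double op3) { return op2 * op3 + op1; }

/* Hypothetical helper (not part of x86inc): for a four-operand request
 * dst = a*b + c, given register numbers, return an FMA3 form that needs
 * no extra mov, or 0 when dst aliases no source and a mov is required. */
static int pick_fma3_form(int dst, int a, int b, int c)
{
    if (dst == c) return 231;   /* fmadd231 dst, a, b : dst = a*b + dst */
    if (dst == a) return 213;   /* fmadd213 dst, b, c : dst = b*dst + c */
    if (dst == b) return 213;   /* fmadd213 dst, a, c : dst = a*dst + c */
    return 0;                   /* caller must emit a mov first */
}
```

With (op1, op2, op3) = (2, 3, 4), the 231 form gives 3*4+2 = 14, the 213
form gives 3*2+4 = 10, and the 132 form gives 2*4+3 = 11. The 0 return
is the case where the programmer still has to place a mov by hand.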

Or, put another way: IMO not good.


