[FFmpeg-devel] [patch][OpenHEVC]added ASM functions for epel + qpel
Ronald S. Bultje
rsbultje at gmail.com
Thu Mar 6 03:48:54 CET 2014
On Wed, Mar 5, 2014 at 5:11 AM, Pierre Edouard Lepere <
Pierre-Edouard.Lepere at insa-rennes.fr> wrote:
> >Or just use a C wrapper and avoid all this extra cruft in the assembly. I
> >think for fullpel MC, it's fine to do fullwidth in assembly, but do use it
> >to its fullest extent.
> I think it's more about the limit of 128 registers. Once I can use AVX2,
> I'll be able to use bigger steps. what do you suggest in fullpel MC to use
> the assembly "to the fullest extent" ?
> I've also tried adding the shift into the macro, but to no avail. :/
What I meant is that for this particular piece of code, I sort-of got the
feeling you were trying to make the assembly fit the macros or dsp function
interface, and as a result, you get extra instructions that can be prevented
and - eventually - slower code. Instead, try to generate a raw piece of
assembly that is as small, compact and fast as possible. Then write the
macros to generate exactly that code - not the other way around.
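A rough sketch of the C-wrapper idea mentioned in the quoted text, under stated assumptions: 8-bit samples, a width that is a multiple of 8, and a 14-bit intermediate (hence the shift by 6). The names pel_pixels8 and pel_pixels_w are hypothetical, not the actual dsp interface, and the fixed-width kernel is plain C standing in for the hand-written assembly:

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for a fixed-width assembly kernel (hypothetical name).
 * Handles exactly 8 pixels per row and nothing else, so the real
 * assembly version would need no width checks or tail handling. */
static void pel_pixels8(int16_t *dst, ptrdiff_t dst_stride,
                        const uint8_t *src, ptrdiff_t src_stride,
                        int height)
{
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < 8; x++)
            dst[x] = (int16_t)(src[x] << 6); /* assumes 8-bit input */
        dst += dst_stride;
        src += src_stride;
    }
}

/* C wrapper (hypothetical name): walks the block in 8-pixel columns
 * and leaves all per-pixel work to the kernel.
 * Assumes width is a multiple of 8. */
static void pel_pixels_w(int16_t *dst, ptrdiff_t dst_stride,
                         const uint8_t *src, ptrdiff_t src_stride,
                         int width, int height)
{
    for (int x = 0; x < width; x += 8)
        pel_pixels8(dst + x, dst_stride, src + x, src_stride, height);
}
```

The point of the split is that the kernel stays as small as possible, and the glue that adapts it to a wider interface lives in C rather than in extra assembly instructions.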
(Also, can you merge all the follow-up patches with the original one into a
single one? It'd make for much simpler review.)