[FFmpeg-devel] port mplayer eq filter to libavfilter

Ronald S. Bultje rsbultje
Mon Nov 29 17:16:05 CET 2010


Hi,

On Mon, Nov 29, 2010 at 10:57 AM, William Yu <genwillyu at gmail.com> wrote:
> 2010/11/26 Ronald S. Bultje <rsbultje at gmail.com>:
>> On Fri, Nov 26, 2010 at 9:38 AM, William Yu <genwillyu at gmail.com> wrote:
>>> 2010/11/25 Ronald S. Bultje <rsbultje at gmail.com> wrote:
>>>> On Thu, Nov 25, 2010 at 4:27 AM, William Yu <genwillyu at gmail.com> wrote:
>>>>> +        for (i = w&7; i; i--) {
>>>>> +            pel = ((*line * contrast)>>12) + brightness;
>>>>> +            if (pel&768) pel = (-pel)>>31;
>>>>> +            *line++ = pel;
>>>>> +        }
>>>>
>>>> Please don't mix and match C and ASM, this takes about 10-20 lines in
>>>> asm, if you want you can even compile it using gcc and directly copy
>>>> it (see above, use FASTDIV also). That will prevent it from using eax
>>>> and then the function gets faster.
> I have replaced the C code with assembly. Anything else?

Better...

> +    brvec[0] = brvec[1] = brvec[2] = brvec[3] = brightness;
> +    contvec[0] = contvec[1] = contvec[2] = contvec[3] = contrast;
> +
> +    __asm__ volatile (
> +        "movq (%4), %%mm3 \n\t"
> +        "movq (%5), %%mm4 \n\t"

movd %4, %%mm3
movd %5, %%mm4

where 4=brightness and 5=contrast

Then (mmx2; ignore this and I'll do it for you if you don't see how to):

pshufw $0x0, %%mm3, %%mm3
pshufw $0x0, %%mm4, %%mm4

or (mmx):

punpcklwd %%mm3, %%mm3
punpcklwd %%mm4, %%mm4
punpckldq %%mm3, %%mm3
punpckldq %%mm4, %%mm4

This is several cycles faster, and you no longer need brvec/contvec.
That saves you two asm arguments, which makes it more likely to
compile on systems such as OSX.
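
To make the mmx variant concrete, here is a rough portable C model of
what movd followed by punpcklwd and punpckldq does to the register.
The function names are made up for illustration and are not part of
the patch; it only shows how the low 16-bit word ends up in all four
lanes.

```c
#include <stdint.h>

/* Model of "punpcklwd %%mmN, %%mmN": interleave the two low 16-bit words
 * with themselves, so words [w0, w1, x, x] become [w0, w0, w1, w1]
 * (lowest word listed first). Hypothetical helper, for illustration only. */
static uint64_t model_punpcklwd_self(uint64_t v)
{
    uint64_t w0 = v & 0xFFFF;
    uint64_t w1 = (v >> 16) & 0xFFFF;
    return w0 | (w0 << 16) | (w1 << 32) | (w1 << 48);
}

/* Model of "punpckldq %%mmN, %%mmN": duplicate the low 32-bit dword. */
static uint64_t model_punpckldq_self(uint64_t v)
{
    uint64_t d0 = v & 0xFFFFFFFFu;
    return d0 | (d0 << 32);
}

/* movd + punpcklwd + punpckldq: broadcast the low 16 bits of x
 * into all four 16-bit lanes of a 64-bit value. */
static uint64_t model_broadcast_word(int32_t x)
{
    uint64_t v = (uint32_t)x;      /* movd zero-extends into the register */
    v = model_punpcklwd_self(v);   /* [w0, w0, w1, w1] */
    v = model_punpckldq_self(v);   /* [w0, w0, w0, w0] */
    return v;
}
```

The pshufw form gets the same result with a single shuffle per
register, which is why it is preferable where mmx2 is available.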

> +        "movl %3, %%ecx \n\t"
> +        "andl $7, %%ecx \n\t"
> +        "cmpl $0, %%ecx \n\t"

andl already sets ZF, so you don't need the cmpl.

> +        "addl %7, %%eax \n\t"
> +        "movl %%eax, %%edx \n\t"
> +        "andl $768, %%eax \n\t"
> +        "testl %%eax, %%eax \n\t"

Same here: the andl already sets ZF, so the testl is redundant.
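
For reference, the per-pixel work that this scalar tail has to do can
be sketched in plain C as below, based on the C loop quoted at the top.
This is only a sketch: adjust_pixel and adjust_tail are hypothetical
names, it assumes contrast is non-negative 12-bit fixed point (4096
means 1.0, which follows from the >>12), and it spells out the clamp
explicitly instead of using the (-pel)>>31 trick, since right-shifting
a negative int is implementation-defined in C.

```c
#include <stdint.h>

/* Scale one pixel by contrast (12-bit fixed point), add brightness, and
 * clamp to 0..255. As in the quoted code, bits 8 and 9 are used as the
 * out-of-range test, which assumes the result stays within -1023..1023. */
static uint8_t adjust_pixel(uint8_t in, int contrast, int brightness)
{
    int pel = ((in * contrast) >> 12) + brightness;
    if (pel & 768)                  /* bit 8 or 9 set: outside 0..255 */
        pel = pel < 0 ? 0 : 255;    /* same result as storing (-pel)>>31 */
    return (uint8_t)pel;
}

/* Process the w & 7 leftover pixels; line must already point at the
 * first pixel that the 8-pixel vector loop did not handle. */
static void adjust_tail(uint8_t *line, int w, int contrast, int brightness)
{
    int i;
    for (i = w & 7; i; i--) {
        *line = adjust_pixel(*line, contrast, brightness);
        line++;
    }
}
```

The "if (pel & 768)" branch is the part the andl $768 implements, and
the "i" loop condition is the part the andl $7 implements; in both
cases the branch can use the flags from the andl directly.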

> +        : "=r" (line), "=m" (h)
> +        : "0" (line), "r" (w), "r" (brvec), "r" (contvec), "m" (step), "m" (brightness), "m" (contrast)

%2 is unused, you should remove it.

Ronald


