[FFmpeg-devel] [PATCH] avfilter/vf_interlace: add complex vertcal lowpassfilter

Sat Mar 11 17:39:07 EET 2017

> James Almer <jamrial at gmail.com> schrieb am Fr, 10.3.2017:
>> On 3/8/2017 1:58 PM, Thomas Mundt wrote:
>> Hi,
>> 
>> attached patch adds a complex (-1 2 6 2 -1) vertcal lowpassfilter to vf_interlace. This will better retain detail and reduce blurring compared to the existing (1 2 1) filter.
>> 
>> Please comment.
>> diff --git a/libavfilter/interlace.h b/libavfilter/interlace.h
>> index da073ae..7ad457e 100644
>> --- a/libavfilter/interlace.h
>> +++ b/libavfilter/interlace.h
>> @@ -51,6 +51,8 @@ typedef struct InterlaceContext {
>>      AVFrame *cur, *next;   // the two frames from which the new one is obtained
>>      void (*lowpass_line)(uint8_t *dstp, ptrdiff_t linesize, const uint8_t *srcp,
>>                           const uint8_t *srcp_above, const uint8_t *srcp_below);
>> +    void (*lowpass_line_complex)(uint8_t *dstp, ptrdiff_t linesize,
>> +                                 const uint8_t *srcp, int mref, int pref);
> 
> Why not keep a single prototype, passing mref and pref for both linear
> and complex? You can calculate srcp_above and srcp_below for linear like
> you're doing it for complex in both the c and asm versions.
> 

This would make sense. I just didn´t wanted to change more than necessary in one patch and I´m not such an expert in simd programming.
This is my second attempt ever.
Also there is the same function in tinterlace filter that also uses this asm. So I would also need to change the function in tinterlace.
I could do all this as a subsequent patch. Do you have a suggestion how to deal with the option name of complex filter in tinterlace?
There it´s a flag with the names "low_pass_filter" and "vlpf", so I could add "complex_low_pass_filter" and "vclpf"?

> In any case, mref and pref should be ptrdiff_t and not int.

OK, I´ll change this in patch v2.

> 
>>  } InterlaceContext;
>>  
>>  void ff_interlace_init_x86(InterlaceContext *interlace);
>> diff --git a/libavfilter/vf_interlace.c b/libavfilter/vf_interlace.c
>> index efa3128..e8d5de4 100644
>> --- a/libavfilter/vf_interlace.c
>> +++ b/libavfilter/vf_interlace.c
>> @@ -47,7 +47,7 @@ static const AVOption interlace_options[] = {
>>      { "bff", "bottom field first", 0,
>>          AV_OPT_TYPE_CONST, {.i64 = MODE_BFF }, INT_MIN, INT_MAX, .flags = FLAGS, .unit = "scan" },
>>      { "lowpass", "set vertical low-pass filter", OFFSET(lowpass),
>> -        AV_OPT_TYPE_BOOL,  {.i64 = 1 },        0, 1, .flags = FLAGS },
>> +        AV_OPT_TYPE_INT,   {.i64 = 1 },        0, 2, .flags = FLAGS },
> 
> Maybe add AV_OPT_TYPE_CONST values "off", "linear" and "complex".

OK

>>      { NULL }
>>  };
>>  
>> @@ -67,6 +67,24 @@ static void lowpass_line_c(uint8_t *dstp, ptrdiff_t linesize,
>>      }
>> +%endmacro
>> +
>> +%macro LOWPASS_LINE_COMPLEX 0
>> +cglobal lowpass_line_complex, 3, 5, 7, dst, h, src, mref, pref
> 
> Make this 5, 5, 7 and remove the two movs below.

OK, I´ll change this in patch v2.

>> +    mov r3, mrefmp
>> +    mov r4, prefmp
>> +
>> +    pxor m6, m6
>> +.loop:
>> +    mova m0, [srcq+r3]
> 
> [srcq+mrefq]

OK

>> +    mova m2, [srcq+r4]
>
> [srcq+prefq]

OK

>> +    mova m1, m0
>> +    mova m3, m2
>> +    punpcklbw m0, m6
>> +    punpcklbw m2, m6
>> +    punpckhbw m1, m6
>> +    punpckhbw m3, m6
>> +    paddw m0, m2
>> +    paddw m1, m3
>> +    mova m2, [srcq+r3*2]
> 
> [srcq+mrefq*2]

OK

>> +    mova m4, [srcq+r4*2]

> [srcq+prefq*2]

OK

>> +    add dstq, mmsize
>> +    add srcq, mmsize
>> +    sub DWORD hm, mmsize
> 
> sub hd, mmsize

OK

>> +
>> +INIT_XMM sse2
>> +LOWPASS_LINE_COMPLEX
>> +
>> +INIT_XMM avx
>> +LOWPASS_LINE_COMPLEX
> 
> There's no reason to add an AVX version. You're not taking advantage of
> the 3 operand instruction format, or any of the new instructions, or the
> 32 byte regs.
> 
> If you merge some of the movas with punpck* instructions and can notice
> a difference when benching, then AVX would make sense.
> 

OK. Speed improvement is marginal so I´ll drop AVX version in patch v2

>> diff --git a/libavfilter/x86/vf_interlace_init.c b/libavfilter/x86/vf_interlace_init.c
>> index 52a22f8..947ff9a 100644
>> --- a/libavfilter/x86/vf_interlace_init.c
>> +++ b/libavfilter/x86/vf_interlace_init.c
>> @@ -35,12 +35,21 @@ void ff_lowpass_line_avx (uint8_t *dstp, ptrdiff_t linesize,
>>                            const uint8_t *srcp_above,
>>                            const uint8_t *srcp_below);
>>