[FFmpeg-devel] [PATCH] avfilter/vf_interlace: add complex vertcal lowpassfilter
Thomas Mundt
loudmax at yahoo.de
Sat Mar 11 17:39:07 EET 2017
> James Almer <jamrial at gmail.com> schrieb am Fr, 10.3.2017:
>> On 3/8/2017 1:58 PM, Thomas Mundt wrote:
>> Hi,
>>
>> attached patch adds a complex (-1 2 6 2 -1) vertcal lowpassfilter to vf_interlace. This will better retain detail and reduce blurring compared to the existing (1 2 1) filter.
>>
>> Please comment.
>> diff --git a/libavfilter/interlace.h b/libavfilter/interlace.h
>> index da073ae..7ad457e 100644
>> --- a/libavfilter/interlace.h
>> +++ b/libavfilter/interlace.h
>> @@ -51,6 +51,8 @@ typedef struct InterlaceContext {
>> AVFrame *cur, *next; // the two frames from which the new one is obtained
>> void (*lowpass_line)(uint8_t *dstp, ptrdiff_t linesize, const uint8_t *srcp,
>> const uint8_t *srcp_above, const uint8_t *srcp_below);
>> + void (*lowpass_line_complex)(uint8_t *dstp, ptrdiff_t linesize,
>> + const uint8_t *srcp, int mref, int pref);
>
> Why not keep a single prototype, passing mref and pref for both linear
> and complex? You can calculate srcp_above and srcp_below for linear like
> you're doing it for complex in both the c and asm versions.
>
This would make sense. I just didn´t wanted to change more than necessary in one patch and I´m not such an expert in simd programming.
This is my second attempt ever.
Also there is the same function in tinterlace filter that also uses this asm. So I would also need to change the function in tinterlace.
I could do all this as a subsequent patch. Do you have a suggestion how to deal with the option name of complex filter in tinterlace?
There it´s a flag with the names "low_pass_filter" and "vlpf", so I could add "complex_low_pass_filter" and "vclpf"?
> In any case, mref and pref should be ptrdiff_t and not int.
OK, I´ll change this in patch v2.
>
>> } InterlaceContext;
>>
>> void ff_interlace_init_x86(InterlaceContext *interlace);
>> diff --git a/libavfilter/vf_interlace.c b/libavfilter/vf_interlace.c
>> index efa3128..e8d5de4 100644
>> --- a/libavfilter/vf_interlace.c
>> +++ b/libavfilter/vf_interlace.c
>> @@ -47,7 +47,7 @@ static const AVOption interlace_options[] = {
>> { "bff", "bottom field first", 0,
>> AV_OPT_TYPE_CONST, {.i64 = MODE_BFF }, INT_MIN, INT_MAX, .flags = FLAGS, .unit = "scan" },
>> { "lowpass", "set vertical low-pass filter", OFFSET(lowpass),
>> - AV_OPT_TYPE_BOOL, {.i64 = 1 }, 0, 1, .flags = FLAGS },
>> + AV_OPT_TYPE_INT, {.i64 = 1 }, 0, 2, .flags = FLAGS },
>
> Maybe add AV_OPT_TYPE_CONST values "off", "linear" and "complex".
OK
>> { NULL }
>> };
>>
>> @@ -67,6 +67,24 @@ static void lowpass_line_c(uint8_t *dstp, ptrdiff_t linesize,
>> }
>> +%endmacro
>> +
>> +%macro LOWPASS_LINE_COMPLEX 0
>> +cglobal lowpass_line_complex, 3, 5, 7, dst, h, src, mref, pref
>
> Make this 5, 5, 7 and remove the two movs below.
OK, I´ll change this in patch v2.
>> + mov r3, mrefmp
>> + mov r4, prefmp
>> +
>> + pxor m6, m6
>> +.loop:
>> + mova m0, [srcq+r3]
>
> [srcq+mrefq]
OK
>> + mova m2, [srcq+r4]
>
> [srcq+prefq]
OK
>> + mova m1, m0
>> + mova m3, m2
>> + punpcklbw m0, m6
>> + punpcklbw m2, m6
>> + punpckhbw m1, m6
>> + punpckhbw m3, m6
>> + paddw m0, m2
>> + paddw m1, m3
>> + mova m2, [srcq+r3*2]
>
> [srcq+mrefq*2]
OK
>> + mova m4, [srcq+r4*2]
> [srcq+prefq*2]
OK
>> + add dstq, mmsize
>> + add srcq, mmsize
>> + sub DWORD hm, mmsize
>
> sub hd, mmsize
OK
>> +
>> +INIT_XMM sse2
>> +LOWPASS_LINE_COMPLEX
>> +
>> +INIT_XMM avx
>> +LOWPASS_LINE_COMPLEX
>
> There's no reason to add an AVX version. You're not taking advantage of
> the 3 operand instruction format, or any of the new instructions, or the
> 32 byte regs.
>
> If you merge some of the movas with punpck* instructions and can notice
> a difference when benching, then AVX would make sense.
>
OK. Speed improvement is marginal so I´ll drop AVX version in patch v2
>> diff --git a/libavfilter/x86/vf_interlace_init.c b/libavfilter/x86/vf_interlace_init.c
>> index 52a22f8..947ff9a 100644
>> --- a/libavfilter/x86/vf_interlace_init.c
>> +++ b/libavfilter/x86/vf_interlace_init.c
>> @@ -35,12 +35,21 @@ void ff_lowpass_line_avx (uint8_t *dstp, ptrdiff_t linesize,
>> const uint8_t *srcp_above,
>> const uint8_t *srcp_below);
>>
More information about the ffmpeg-devel
mailing list