[FFmpeg-devel] [PATCH v2 2/2] avfilter/interlace: add complex vertical low-pass filter

James Almer jamrial at gmail.com
Thu Apr 13 05:53:56 EEST 2017


On 4/12/2017 9:39 PM, Thomas Mundt wrote:
>>>> Michael Niedermayer <michael at niedermayer.cc> wrote on Wed, 12.4.2017:
> On Thu, Mar 30, 2017 at 12:21:58AM +0000, Thomas Mundt wrote:
>>>>> Lou Logan <lou at lrcd.com> wrote on Thu, 30.3.2017:
>>>> On Mon, 13 Mar 2017 16:23:46 +0000 (UTC)
>>>> Thomas Mundt <loudmax-at-yahoo.de at ffmpeg.org> wrote:
>>>>
>>>> [...]
>>>>> index 09ca4d3..0b5b858 100644
>>>>> --- a/libavfilter/vf_tinterlace.c
>>>>> +++ b/libavfilter/vf_tinterlace.c
>>>> [...]
>>>>> +static void lowpass_line_complex_c(uint8_t *dstp, ptrdiff_t width, const uint8_t *srcp,
>>>>> +                                   ptrdiff_t mref, ptrdiff_t pref)         
>>>>
>>>> Trailing whitespace should be avoided. It prevents the patch from being
>>>> applied.
>>>
>>> Oh, didn't notice. Thanks.
>>> New patch set attached.
>>
>> [...]
>>> --- a/libavfilter/x86/vf_interlace.asm
>>> +++ b/libavfilter/x86/vf_interlace.asm
>>> @@ -28,33 +28,28 @@ SECTION_RODATA
>>>  SECTION .text
>>>  
>>>  %macro LOWPASS_LINE 0
>>> -cglobal lowpass_line, 5, 5, 7
>>> -    add r0, r1
>>> -    add r2, r1
> 
> [...]
>>> -    add r1, 2*mmsize
>>> -    jl .loop
>>> +    add dstq, 2*mmsize
>>> +    add srcq, 2*mmsize
>>> +    sub hd, 2*mmsize
>>> +    jg .loop
>>
>> this increases the number of instructions in the inner loop by 2
> 
> James Almer suggested changing the function prototype. That was easy in C, but for SIMD this is the best I can do.

I didn't check, but I think the reason I told you to change the prototype here
was to share the function pointer with lowpass_line_complex, so you can do
something like

if (tinterlace->flags & TINTERLACE_FLAG_VLPF)
    tinterlace->lowpass_line = lowpass_line_c;
else if (tinterlace->flags & TINTERLACE_FLAG_CVLPF)
    tinterlace->lowpass_line = lowpass_line_complex_c;

instead of adding a new pointer to InterlaceContext and TInterlaceContext.
Otherwise you wouldn't really gain much by changing the prototype of the
linear filter here.
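
A minimal, self-contained sketch of that idea in plain C, not taken from the
patch: every sketch_/SKETCH_ name and both flag values are hypothetical
stand-ins for the real context and flags. The complex filter assumes the
-1 2 6 2 -1 taps named in the review, with an assumed divisor of 8. Because
both functions have the same prototype, one pointer in the context is enough.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag values and context struct, for illustration only. */
#define SKETCH_FLAG_VLPF  (1 << 0)
#define SKETCH_FLAG_CVLPF (1 << 1)

typedef void (*sketch_lowpass_fn)(uint8_t *dstp, ptrdiff_t width,
                                  const uint8_t *srcp,
                                  ptrdiff_t mref, ptrdiff_t pref);

typedef struct SketchContext {
    int flags;
    sketch_lowpass_fn lowpass_line; /* one pointer serves both filters */
} SketchContext;

/* Linear filter: (above + 2 * cur + below + 1) / 4. */
static void sketch_lowpass_line(uint8_t *dstp, ptrdiff_t width,
                                const uint8_t *srcp,
                                ptrdiff_t mref, ptrdiff_t pref)
{
    const uint8_t *above = srcp + mref;
    const uint8_t *below = srcp + pref;
    for (ptrdiff_t i = 0; i < width; i++)
        dstp[i] = (uint8_t)((1 + 2 * srcp[i] + above[i] + below[i]) >> 2);
}

/* Complex filter, assuming taps -1 2 6 2 -1 and a divisor of 8. */
static void sketch_lowpass_line_complex(uint8_t *dstp, ptrdiff_t width,
                                        const uint8_t *srcp,
                                        ptrdiff_t mref, ptrdiff_t pref)
{
    const uint8_t *above  = srcp + mref;
    const uint8_t *below  = srcp + pref;
    const uint8_t *above2 = srcp + 2 * mref;
    const uint8_t *below2 = srcp + 2 * pref;
    for (ptrdiff_t i = 0; i < width; i++) {
        int v = 4 + 6 * srcp[i] + 2 * (above[i] + below[i])
                  - above2[i] - below2[i];
        if (v < 0)
            v = 0;                      /* clamp before shifting */
        v >>= 3;
        dstp[i] = (uint8_t)(v > 255 ? 255 : v);
    }
}

/* Pick the filter once; callers then only use s->lowpass_line. */
static void sketch_select_lowpass(SketchContext *s)
{
    assert(s != NULL);
    if (s->flags & SKETCH_FLAG_VLPF)
        s->lowpass_line = sketch_lowpass_line;
    else if (s->flags & SKETCH_FLAG_CVLPF)
        s->lowpass_line = sketch_lowpass_line_complex;
    else
        s->lowpass_line = NULL;
}
```

A caller would run sketch_select_lowpass() once at init time and then invoke
s->lowpass_line(dst, width, src, -linesize, linesize) per line, without caring
which filter is active.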

> I asked for help a month ago but got no reply. Can you tell me how to avoid this?

Yes, sorry, I kind of lost track of this because, for some reason, each of
your emails starts a new thread instead of showing up as a reply.
You just need to turn mref and pref into the equivalent of the old srcp_above
and srcp_below pointers, like so:

diff --git a/libavfilter/x86/vf_interlace.asm b/libavfilter/x86/vf_interlace.asm
index f70c700965..8a0dd3bdea 100644
--- a/libavfilter/x86/vf_interlace.asm
+++ b/libavfilter/x86/vf_interlace.asm
@@ -28,32 +28,32 @@ SECTION_RODATA
 SECTION .text
 
 %macro LOWPASS_LINE 0
-cglobal lowpass_line, 5, 5, 7
-    add r0, r1
-    add r2, r1
-    add r3, r1
-    add r4, r1
-    neg r1
+cglobal lowpass_line, 5, 5, 7, dst, h, src, mref, pref
+    add dstq, hq
+    add srcq, hq
+    add mrefq, srcq
+    add prefq, srcq
+    neg hq
 
     pcmpeqb m6, m6
 
 .loop:
-    mova m0, [r3+r1]
-    mova m1, [r3+r1+mmsize]
-    pavgb m0, [r4+r1]
-    pavgb m1, [r4+r1+mmsize]
+    mova m0, [mrefq+hq]
+    mova m1, [mrefq+hq+mmsize]
+    pavgb m0, [prefq+hq]
+    pavgb m1, [prefq+hq+mmsize]
     pxor m0, m6
     pxor m1, m6
-    pxor m2, m6, [r2+r1]
-    pxor m3, m6, [r2+r1+mmsize]
+    pxor m2, m6, [srcq+hq]
+    pxor m3, m6, [srcq+hq+mmsize]
     pavgb m0, m2
     pavgb m1, m3
     pxor m0, m6
     pxor m1, m6
-    mova [r0+r1], m0
-    mova [r0+r1+mmsize], m1
+    mova [dstq+hq], m0
+    mova [dstq+hq+mmsize], m1
 
-    add r1, 2*mmsize
+    add hq, 2*mmsize
     jl .loop
 REP_RET
 %endmacro
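
For readers who do not read x86 assembly, the pointer setup in the diff above
can be modeled in scalar C roughly as follows. This is only a sketch: the
function and helper names are hypothetical, the loop handles one byte per
iteration where the assembly handles 2*mmsize bytes, and the assembly loop
always runs at least once. The argument the diff names h carries the line
width from the C prototype.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of PAVGB: byte average, rounded up. */
static uint8_t sketch_pavgb(uint8_t a, uint8_t b)
{
    return (uint8_t)((a + b + 1) >> 1);
}

/* Hypothetical scalar model of the LOWPASS_LINE macro after the change. */
static void sketch_lowpass_line_negidx(uint8_t *dst, ptrdiff_t h,
                                       const uint8_t *src,
                                       ptrdiff_t mref, ptrdiff_t pref)
{
    const uint8_t *above, *below;

    assert(h >= 0);

    dst  += h;           /* add dstq, hq   */
    src  += h;           /* add srcq, hq   */
    above = src + mref;  /* add mrefq, srcq: offset becomes a pointer */
    below = src + pref;  /* add prefq, srcq */
    h     = -h;          /* neg hq         */

    /* One negative index walks all four pointers up to their end, so the
     * loop needs a single add and a single conditional jump. */
    for (; h < 0; h++) {
        uint8_t t = sketch_pavgb(above[h], below[h]) ^ 0xFF; /* pavgb, pxor */
        t = sketch_pavgb(t, src[h] ^ 0xFF);                  /* pxor, pavgb */
        dst[h] = t ^ 0xFF;                                   /* pxor, mova  */
    }
}
```

Since mrefq and prefq end up holding the same addresses the old srcp_above and
srcp_below arguments did, the loop body is unchanged apart from register names.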

> 
>> Also, can you add a FATE test for the -1 2 6 2 -1 filter?
> 
> Sure. I have never written a FATE test and I'm off for a couple of days, so this could take some time. Can you give me a hint or an example?
> 
> Regards,
> Thomas
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel at ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> 