[FFmpeg-devel] [PATCH] yadif port to libavfitler

Michael Niedermayer michaelni
Sun May 23 12:53:19 CEST 2010


On Thu, May 20, 2010 at 07:52:31PM -0700, Baptiste Coudurier wrote:
> On 05/20/2010 01:37 AM, Baptiste Coudurier wrote:
>> Hi guys,
>>
>> Here is my first attempt to port yadif to libavfilter.
>>
>> I chose the request_frame paradigm since the filter has delay.
>> What happens when no prev frame is available ? Original code seems to
>> read from

read from?
(it obviously should use spatial/directional interpolation for the very
 first frame)


>>
>> I'd like extensive review if possible and especially instructions on how
>> to retrieve cpu flags at runtime for MMX.

hmm, i guess we need to move libavcodec/x86/cpuid.c to libavutil


>>
>> Also Michael, can you please check the calls to filter() ?

what do you want me to check exactly?


>> I'm also not sure how the field output mode should be supported.

for every 2nd request_frame() we would just call our sources request_frame()
(the variant with calling 2 start/slice/end in one request_frame() might work
 too but i thinks its less ideal)


>
> Patch updated, offseting the buffers as needed (please double check), and 
> memleak free :>
> Questions still stands :)
> The filter works without copying any input picture which is nice.

does md5 match to mplayers ?


[...]
> +static void filter_line_mmx2(AVFilterContext *ctx, uint8_t *dst,
> +                             uint8_t *prev, uint8_t *cur, uint8_t *next,
> +                             int w, int refs, int parity)
> +{
> +    YADIFContext *yadif = ctx->priv;
> +    static const uint64_t pw_1 = 0x0001000100010001ULL;
> +    static const uint64_t pb_1 = 0x0101010101010101ULL;
> +    const int mode = yadif->mode;
> +    uint64_t tmp0, tmp1, tmp2, tmp3;
> +    int x;
> +
> +#define FILTER\
> +    for(x=0; x<w; x+=4){\
> +        __asm__ volatile(\
> +            "pxor      %%mm7, %%mm7 \n\t"\
> +            LOAD4("(%[cur],%[mrefs])", %%mm0) /* c = cur[x-refs] */\
> +            LOAD4("(%[cur],%[prefs])", %%mm1) /* e = cur[x+refs] */\
> +            LOAD4("(%["prev2"])", %%mm2) /* prev2[x] */\
> +            LOAD4("(%["next2"])", %%mm3) /* next2[x] */\
> +            "movq      %%mm3, %%mm4 \n\t"\
> +            "paddw     %%mm2, %%mm3 \n\t"\
> +            "psraw     $1,    %%mm3 \n\t" /* d = (prev2[x] + next2[x])>>1 */\
> +            "movq      %%mm0, %[tmp0] \n\t" /* c */\
> +            "movq      %%mm3, %[tmp1] \n\t" /* d */\
> +            "movq      %%mm1, %[tmp2] \n\t" /* e */\
> +            "psubw     %%mm4, %%mm2 \n\t"\
> +            PABS(      %%mm4, %%mm2) /* temporal_diff0 */\
> +            LOAD4("(%[prev],%[mrefs])", %%mm3) /* prev[x-refs] */\
> +            LOAD4("(%[prev],%[prefs])", %%mm4) /* prev[x+refs] */\
> +            "psubw     %%mm0, %%mm3 \n\t"\
> +            "psubw     %%mm1, %%mm4 \n\t"\
> +            PABS(      %%mm5, %%mm3)\
> +            PABS(      %%mm5, %%mm4)\
> +            "paddw     %%mm4, %%mm3 \n\t" /* temporal_diff1 */\
> +            "psrlw     $1,    %%mm2 \n\t"\
> +            "psrlw     $1,    %%mm3 \n\t"\
> +            "pmaxsw    %%mm3, %%mm2 \n\t"\
> +            LOAD4("(%[next],%[mrefs])", %%mm3) /* next[x-refs] */\
> +            LOAD4("(%[next],%[prefs])", %%mm4) /* next[x+refs] */\
> +            "psubw     %%mm0, %%mm3 \n\t"\
> +            "psubw     %%mm1, %%mm4 \n\t"\
> +            PABS(      %%mm5, %%mm3)\
> +            PABS(      %%mm5, %%mm4)\
> +            "paddw     %%mm4, %%mm3 \n\t" /* temporal_diff2 */\
> +            "psrlw     $1,    %%mm3 \n\t"\
> +            "pmaxsw    %%mm3, %%mm2 \n\t"\
> +            "movq      %%mm2, %[tmp3] \n\t" /* diff */\
                                 ^^^^^^
code like this does not wor with all gcc versions, (it needs a equivalent of
the NAMED_ASM_ARGS from mplayer


[...]
> +
> +static void filter(AVFilterContext *ctx, AVFilterPicRef *dstpic,
> +                   int parity, int tff)
> +{
> +    YADIFContext *yadif = ctx->priv;
> +    int y, i;
> +
> +    for (i = 0; i < 3; i++) {
> +        int is_chroma = !!i;
> +        int w = dstpic->w >> is_chroma;
> +        int h = dstpic->h >> is_chroma;
> +        int refs = yadif->cur->linesize[i];
> +
> +        for (y = 0; y < h; y++) {
> +            if ((y ^ parity) & 1) {
> +                uint8_t *prev = &yadif->prev->data[i][y*refs];
> +                uint8_t *cur  = &yadif->cur ->data[i][y*refs];
> +                uint8_t *next = &yadif->next->data[i][y*refs];
> +                uint8_t *dst  = &dstpic->data[i][y*dstpic->linesize[i]];
> +                yadif->filter_line(ctx, dst, prev, cur, next, w, refs, parity ^ tff);
> +            } else {
> +                memcpy(&dstpic->data[i][y*dstpic->linesize[i]],
> +                       &yadif->cur->data[i][y*refs], w);

switching from fast_memcpy() (that works with SIMD) to memcpy() possibly looses speed


[...]
> +static int query_formats(AVFilterContext *ctx)
> +{
> +    static const enum PixelFormat pix_fmts[] = {
> +        PIX_FMT_YUV420P,
> +        PIX_FMT_GRAY8,
> +        PIX_FMT_NONE
> +    };

it should be trivial to support other yuv formats

other improvments that could be tried once its all in svn are to
try to pass draw_slice() calls through from the source and maybe to
pass get_video_buffer() through to avoid the every 2nd line memcpy.

also if bobby could take a look at this patch it would be quite good
i likely miss some issues.

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Freedom in capitalist society always remains about the same as it was in
ancient Greek republics: Freedom for slave owners. -- Vladimir Lenin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: Digital signature
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20100523/33efbb19/attachment.pgp>



More information about the ffmpeg-devel mailing list