[FFmpeg-devel] [PATCH] Add alphaextract, alphamerge filters

Stefano Sabatini stefasab at gmail.com
Wed Jul 11 15:43:09 CEST 2012


On date Tuesday 2012-07-10 22:41:06 -0700, Steven Robertson encoded:
> Hi all,
> 
> I need to transmit a bunch of frames with a real-valued alpha channel
> (not just a static mask) over a relatively slow link, and since the
> recipient is on Windows, I'd like the relative simplicity of having
> FFmpeg be the only tool that he needs to recover the content. A bit of
> searching seems to indicate I'm not the first to want this, so I wrote
> up a simple pair of filters to make this possible. From the patch
> description:
> 
> "These filters are designed for storing and transmitting video sequences
> with alpha using higher-efficiency codecs such as x264 which don't
> natively support an alpha channel. 'alphaextract' takes an input stream
> with an alpha channel and returns a video containing just the alpha
> component as a grayscale value; 'alphamerge' takes an RGB or YUV stream
> and adds an alpha channel recovered from a second grayscale stream."
> 
> Happy to address any review comments.
> 
> Thanks,
> Steve

> From c2709fecad9bbbad927f679fa2af6b17f9b2e4fa Mon Sep 17 00:00:00 2001
> From: Steven Robertson <steven at strobe.cc>
> Date: Tue, 10 Jul 2012 22:14:57 -0700
> Subject: [PATCH] Add alphaextract, alphamerge filters.
> 
> These filters are designed for storing and transmitting video sequences
> with alpha using higher-efficiency codecs such as x264 which don't
> natively support an alpha channel. 'alphaextract' takes an input stream
> with an alpha channel and returns a video containing just the alpha
> component as a grayscale value; 'alphamerge' takes an RGB or YUV stream
> and adds an alpha channel recovered from a second grayscale stream.
> 
> Signed-off-by: Steven Robertson <steven at strobe.cc>
> ---
>  doc/filters.texi                 |  24 +++++
>  libavfilter/Makefile             |   2 +
>  libavfilter/allfilters.c         |   2 +
>  libavfilter/vf_alphaextract.c    | 108 +++++++++++++++++++++++
>  libavfilter/vf_alphamerge.c      | 186 +++++++++++++++++++++++++++++++++++++++
>  tests/lavfi-regression.sh        |  15 +++-
>  tests/ref/lavfi/alphaextract_rgb |   1 +
>  tests/ref/lavfi/alphaextract_yuv |   1 +
>  tests/ref/lavfi/alphamerge_rgb   |   1 +
>  tests/ref/lavfi/alphamerge_yuv   |   1 +
>  10 files changed, 338 insertions(+), 3 deletions(-)
>  create mode 100644 libavfilter/vf_alphaextract.c
>  create mode 100644 libavfilter/vf_alphamerge.c
>  create mode 100644 tests/ref/lavfi/alphaextract_rgb
>  create mode 100644 tests/ref/lavfi/alphaextract_yuv
>  create mode 100644 tests/ref/lavfi/alphamerge_rgb
>  create mode 100644 tests/ref/lavfi/alphamerge_yuv
> 
> diff --git a/doc/filters.texi b/doc/filters.texi
> index 0d94eba..d9473b4 100644
> --- a/doc/filters.texi
> +++ b/doc/filters.texi
> @@ -1016,6 +1016,30 @@ build.
>  
>  Below is a description of the currently available video filters.
>  
> +@section alphaextract
> +
> +Extract the alpha component from the input as a grayscale video. This
> +is especially useful with the @var{alphamerge} filter.

I suppose this could be made more general (compextract?), but I don't
object to including this.
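
(For illustration only, with made-up syntax, a generalized filter could
take the plane as an option, e.g.:

    compextract=a

for the alpha plane, or =y/=u/=v for the others.)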

> +
> +@section alphamerge
> +
> +Add or replace the alpha component of the primary input with the
> +grayscale value of a second input. This is intended for use with
> +@var{alphaextract} to allow the transmission or storage of frame
> +sequences that have alpha in a format that doesn't support an alpha
> +channel.
> +
> +For example, to reconstruct full frames from a normal YUV-encoded video
> +and a separate video created with @var{alphaextract}, you might use:
> +@example
> +movie=in_alpha.mkv [alpha]; [in][alpha] alphamerge [out]
> +@end example
> +
> +Since this filter is designed for reconstruction, it operates on frame
> +sequences without considering timestamps, and terminates when either
> +input reaches end of stream. If you're trying to apply an image as an
> +overlay to a video stream, consider the @var{overlay} filter instead.
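
Maybe worth showing a complete round trip in the docs, roughly along
these lines (an untested sketch; file names and codec choices are
arbitrary):

    # store the color and the alpha as two separate streams
    ffmpeg -i in.mov -an -c:v libx264 color.mkv
    ffmpeg -i in.mov -vf alphaextract -an -c:v libx264 alpha.mkv

    # later, put the alpha back (same graph as the @example above)
    ffmpeg -i color.mkv -vf "movie=alpha.mkv [alpha]; [in][alpha] alphamerge [out]" \
        -c:v qtrle reconstructed.mov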
> +
>  @section ass
>  
>  Draw ASS (Advanced Substation Alpha) subtitles on top of input video
> diff --git a/libavfilter/Makefile b/libavfilter/Makefile
> index b094f59..a177752 100644
> --- a/libavfilter/Makefile
> +++ b/libavfilter/Makefile
> @@ -76,6 +76,8 @@ OBJS-$(CONFIG_ABUFFERSINK_FILTER)            += sink_buffer.o
>  OBJS-$(CONFIG_ANULLSINK_FILTER)              += asink_anullsink.o
>  
>  OBJS-$(CONFIG_ASS_FILTER)                    += vf_ass.o
> +OBJS-$(CONFIG_ALPHAEXTRACT_FILTER)           += vf_alphaextract.o
> +OBJS-$(CONFIG_ALPHAMERGE_FILTER)             += vf_alphamerge.o
>  OBJS-$(CONFIG_BBOX_FILTER)                   += bbox.o vf_bbox.o
>  OBJS-$(CONFIG_BLACKDETECT_FILTER)            += vf_blackdetect.o
>  OBJS-$(CONFIG_BLACKFRAME_FILTER)             += vf_blackframe.o
> diff --git a/libavfilter/allfilters.c b/libavfilter/allfilters.c
> index 706405e..aad4534 100644
> --- a/libavfilter/allfilters.c
> +++ b/libavfilter/allfilters.c
> @@ -64,6 +64,8 @@ void avfilter_register_all(void)
>      REGISTER_FILTER (ABUFFERSINK, abuffersink, asink);
>      REGISTER_FILTER (ANULLSINK,   anullsink,   asink);
>  
> +    REGISTER_FILTER (ALPHAEXTRACT, alphaextract, vf);
> +    REGISTER_FILTER (ALPHAMERGE,  alphamerge,  vf);
>      REGISTER_FILTER (ASS,         ass,         vf);
>      REGISTER_FILTER (BBOX,        bbox,        vf);
>      REGISTER_FILTER (BLACKDETECT, blackdetect, vf);
> diff --git a/libavfilter/vf_alphaextract.c b/libavfilter/vf_alphaextract.c
> new file mode 100644
> index 0000000..249c75d
> --- /dev/null
> +++ b/libavfilter/vf_alphaextract.c
> @@ -0,0 +1,108 @@
> +/*
> + * Copyright (c) 2012 Steven Robertson
> + *
> + * This file is part of FFmpeg.
> + *
> + * FFmpeg is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2.1 of the License, or (at your option) any later version.
> + *
> + * FFmpeg is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with FFmpeg; if not, write to the Free Software
> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
> + */
> +
> +/**
> + * @file
> + * simple channel-swapping filter to get at the alpha component
> + */
> +
> +#include <string.h>
> +
> +#include "libavutil/avassert.h"
> +#include "libavutil/pixfmt.h"
> +#include "avfilter.h"
> +#include "drawutils.h"
> +#include "formats.h"
> +#include "video.h"
> +
> +enum { Y, U, V, A };
> +
> +typedef struct {
> +    int is_packed_rgb;
> +    uint8_t rgba_map[4];
> +} AlphaExtractContext;
> +
> +static int query_formats(AVFilterContext *ctx)
> +{
> +    enum PixelFormat in_fmts[] = {
> +        PIX_FMT_YUVA444P, PIX_FMT_YUVA422P, PIX_FMT_YUVA420P,
> +        PIX_FMT_RGBA, PIX_FMT_BGRA, PIX_FMT_ARGB, PIX_FMT_ABGR,
> +        PIX_FMT_NONE
> +    };
> +    enum PixelFormat out_fmts[] = { PIX_FMT_GRAY8, PIX_FMT_NONE };
> +    ff_formats_ref(ff_make_format_list(in_fmts), &ctx->inputs[0]->out_formats);
> +    ff_formats_ref(ff_make_format_list(out_fmts), &ctx->outputs[0]->in_formats);
> +    return 0;
> +}
> +
> +static int config_input(AVFilterLink *link) {

Nit (here and below): the form
static int config_input(AVFilterLink *link)
{
...

is favored (but I can fix it when committing, so no need to fix it
yourself if you don't care).

> +    AlphaExtractContext *extract = link->dst->priv;
> +    extract->is_packed_rgb =
> +        ff_fill_rgba_map(extract->rgba_map, link->format) >= 0;
> +    return 0;
> +}
> +
> +static void draw_slice(AVFilterLink *link, int y0, int h, int slice_dir)
> +{
> +    AlphaExtractContext *extract = link->dst->priv;
> +    AVFilterBufferRef *cur_buf = link->cur_buf;
> +    AVFilterBufferRef *out_buf = link->dst->outputs[0]->out_buf;
> +
> +    if (extract->is_packed_rgb) {
> +        int x, y, pin, pout;
> +        for (y = y0; y < (y0 + h); y++) {
> +            for (x = 0; x < out_buf->video->w; x++) {
> +                pin = y * cur_buf->linesize[0] + x * 4 + extract->rgba_map[A];
> +                pout = y * out_buf->linesize[0] + x;
> +                out_buf->data[0][pout] = cur_buf->data[0][pin];
> +            }
> +        }
> +    } else {
> +        const int linesize = cur_buf->linesize[A];
> +        av_assert0(linesize == out_buf->linesize[Y]);
> +
> +        memmove(out_buf->data[Y] + y0 * linesize,
> +                cur_buf->data[A] + y0 * linesize,
> +                linesize * h);
> +    }
> +    ff_draw_slice(link->dst->outputs[0], y0, h, slice_dir);
> +}
> +
> +AVFilter avfilter_vf_alphaextract = {
> +    .name           = "alphaextract",
> +    .description    = NULL_IF_CONFIG_SMALL("Extract an alpha channel as a "
> +                      "grayscale image component."),
> +    .priv_size      = sizeof(AlphaExtractContext),
> +    .query_formats  = query_formats,
> +
> +    .inputs    = (const AVFilterPad[]) {
> +        { .name             = "default",
> +          .type             = AVMEDIA_TYPE_VIDEO,
> +          .config_props     = config_input,
> +          .draw_slice       = draw_slice,
> +          .min_perms        = AV_PERM_READ },
> +        { .name = NULL }
> +    },
> +    .outputs   = (const AVFilterPad[]) {
> +      { .name               = "default",
> +        .type               = AVMEDIA_TYPE_VIDEO, },
> +      { .name = NULL }
> +    },
> +};
> diff --git a/libavfilter/vf_alphamerge.c b/libavfilter/vf_alphamerge.c
> new file mode 100644
> index 0000000..05f9f27
> --- /dev/null
> +++ b/libavfilter/vf_alphamerge.c
> @@ -0,0 +1,186 @@
> +/*
> + * Copyright (c) 2012 Steven Robertson
> + *
> + * This file is part of FFmpeg.
> + *
> + * FFmpeg is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2.1 of the License, or (at your option) any later version.
> + *
> + * FFmpeg is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with FFmpeg; if not, write to the Free Software
> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
> + */
> +
> +/**
> + * @file
> + * copy an alpha component from another video's luma
> + */
> +
> +#include <string.h>
> +
> +#include "libavutil/pixfmt.h"
> +#include "avfilter.h"
> +#include "bufferqueue.h"
> +#include "drawutils.h"
> +#include "formats.h"
> +#include "internal.h"
> +#include "video.h"
> +
> +enum { Y, U, V, A };
> +
> +typedef struct {
> +    int is_packed_rgb;
> +    uint8_t rgba_map[4];
> +    struct FFBufQueue queue_main;
> +    struct FFBufQueue queue_alpha;
> +} AlphaMergeContext;
> +
> +static av_cold void uninit(AVFilterContext *ctx)
> +{
> +    AlphaMergeContext *merge = ctx->priv;
> +    ff_bufqueue_discard_all(&merge->queue_main);
> +    ff_bufqueue_discard_all(&merge->queue_alpha);
> +}
> +
> +static int query_formats(AVFilterContext *ctx)
> +{
> +    enum PixelFormat main_fmts[] = {
> +        PIX_FMT_YUVA444P, PIX_FMT_YUVA422P, PIX_FMT_YUVA420P,
> +        PIX_FMT_RGBA, PIX_FMT_BGRA, PIX_FMT_ARGB, PIX_FMT_ABGR,
> +        PIX_FMT_NONE
> +    };
> +    enum PixelFormat alpha_fmts[] = { PIX_FMT_GRAY8, PIX_FMT_NONE };
> +    AVFilterFormats *main_formats = ff_make_format_list(main_fmts);
> +    AVFilterFormats *alpha_formats = ff_make_format_list(alpha_fmts);
> +    ff_formats_ref(main_formats, &ctx->inputs[0]->out_formats);
> +    ff_formats_ref(alpha_formats, &ctx->inputs[1]->out_formats);
> +    ff_formats_ref(main_formats, &ctx->outputs[0]->in_formats);
> +    return 0;
> +}
> +
> +static int config_input_main(AVFilterLink *link) {
> +    AlphaMergeContext *merge = link->dst->priv;
> +    merge->is_packed_rgb =
> +        ff_fill_rgba_map(merge->rgba_map, link->format) >= 0;
> +    return 0;
> +}
> +

> +static int config_output(AVFilterLink *outlink) {
> +    AVFilterLink *inlink = outlink->src->inputs[0];
> +    outlink->w = inlink->w;
> +    outlink->h = inlink->h;
> +    outlink->time_base = inlink->time_base;
> +    outlink->sample_aspect_ratio = inlink->sample_aspect_ratio;
> +    outlink->frame_rate = inlink->frame_rate;
> +    return 0;
> +}

Is this required?

> +
> +static void start_frame(AVFilterLink *link, AVFilterBufferRef *picref) {}
> +static void draw_slice(AVFilterLink *link, int y, int h, int slice_dir) {}
> +
> +static void draw_frame(AVFilterContext *ctx,
> +                       AVFilterBufferRef *main_buf,
> +                       AVFilterBufferRef *alpha_buf)
> +{
> +    AlphaMergeContext *merge = ctx->priv;
> +    int h = main_buf->video->h;
> +
> +    if (merge->is_packed_rgb) {
> +        int x, y, pin, pout;
> +        for (y = 0; y < h && y < alpha_buf->video->h; y++) {
> +            for (x = 0; x < main_buf->video->w && x < alpha_buf->video->w; x++) {

> +                pin = y * alpha_buf->linesize[0] + x;
> +                pout = y * main_buf->linesize[0] + x * 4 + merge->rgba_map[A];
> +                main_buf->data[0][pout] = alpha_buf->data[0][pin];

Nit: pin and pout are usually pointers rather than offsets, so you
could do:
                uint8_t *pin, *pout;
                ...
                pin = alpha_buf->data[0] + y * alpha_buf->linesize[0] + x;
                pout = main_buf->data[0] + y * main_buf->linesize[0] + x * 4 + merge->rgba_map[A];
                *pout = *pin;

or inline the offsets:
                main_buf->data[0][y * main_buf->linesize[0] + x * 4 + merge->rgba_map[A]] =
                    alpha_buf->data[0][y * alpha_buf->linesize[0] + x];
or keep the offsets and just rename them:
        int offin, offout;
        ...

> +            }
> +        }
> +    } else {
> +        int y;
> +        int main_line = main_buf->linesize[A];
> +        int alpha_line = alpha_buf->linesize[Y];
> +
> +        for (y = 0; y < h && y < alpha_buf->video->h; y++) {
> +            memmove(main_buf->data[A] + y * main_line,
> +                    alpha_buf->data[Y] + y * alpha_line,
> +                    FFMIN(main_line, alpha_line));
> +        }
> +    }
> +    ff_draw_slice(ctx->outputs[0], 0, h, 1);
> +}
> +

> +static void end_frame(AVFilterLink *link) {

nit: rename link to inlink; it helps readability/consistency

[...]
> diff --git a/tests/lavfi-regression.sh b/tests/lavfi-regression.sh
> index dd5e2da..224fc8f 100755
> --- a/tests/lavfi-regression.sh
> +++ b/tests/lavfi-regression.sh
> @@ -13,21 +13,25 @@ eval do_$test=y
>  
>  do_video_filter() {
>      label=$1
> -    filters=$2
> +    filters="$2"
>      shift 2
>      printf '%-20s' $label
>      run_avconv $DEC_OPTS -f image2 -vcodec pgmyuv -i $raw_src    \
>          $ENC_OPTS -vf "$filters" -vcodec rawvideo $* -f nut md5:
>  }
>  
> -do_lavfi() {
> -    vfilters="slicify=random,$2"
> +do_lavfi_plain() {
> +    vfilters="$2"
>  
>      if [ $test = $1 ] ; then
>          do_video_filter $test "$vfilters"
>      fi
>  }
>
> +do_lavfi() {
> +    do_lavfi_plain $1 "slicify=random,$2"
> +}
> +
>  do_lavfi_colormatrix() {
>      do_lavfi "${1}1" "$1=$4:$5,$1=$5:$3,$1=$3:$4,$1=$4:$3,$1=$3:$5,$1=$5:$2"
>      do_lavfi "${1}2" "$1=$2:$3,$1=$3:$2,$1=$2:$4,$1=$4:$2,$1=$2:$5,$1=$5:$4"
> @@ -60,6 +64,11 @@ do_lavfi "vflip"              "vflip"
>  do_lavfi "vflip_crop"         "vflip,crop=iw-100:ih-100:100:100"
>  do_lavfi "vflip_vflip"        "vflip,vflip"
>  

> +do_lavfi_plain "alphamerge_rgb"     "[in]slicify=random,format=bgra,split[o1][o2];[o1][o2]alphamerge[out]"
> +do_lavfi_plain "alphamerge_yuv"     "[in]slicify=random,format=yuv420p,split[o1][o2];[o1][o2]alphamerge[out]"


> +do_lavfi_plain "alphaextract_rgb"   "[in]slicify=random,format=bgra,split[o1][o2];[o1][o2]alphamerge,slicify=random,split[o3][o4];[o4]alphaextract[alpha];[o3][alpha]alphamerge[out]"
> +do_lavfi_plain "alphaextract_yuv"   "[in]slicify=random,format=yuv420p,split[o1][o2];[o1][o2]alphamerge,slicify=random,split[o3][o4];[o4]alphaextract[alpha];[o3][alpha]alphamerge[out]"

The first part seems to be a duplicate of the first test; why not simply:

do_lavfi_plain "alphaextract_rgb"   "[in]slicify=random,split[o3][o4];[o4]alphaextract[alpha];[o3][alpha]alphamerge[out]"
?

[...]

Overall nice work.
-- 
FFmpeg = Fabulous and Fundamentalist Merciless Patchable Esoteric Guru

