[FFmpeg-devel] [PATCH] delogo filter: new "uglarm" interpolation mode added

Nicolas George george at nsup.org
Sun Dec 30 15:02:11 EET 2018


Uwe Freese (2018-12-29):
> The downside is that the term "power" (or "strength") would be more generic
> and would probably be better reusable for other modes.
> 
> What do you think? Is "exponent" now ok, or should I change it back to
> "power", "strength" or something alike?

I would say: do not bother. This can be decided when a new mode is
added, and options can have several names.

By the way, you would not be interested in developing a logo-detection
filter, per chance? Something that detects is a certain rectangle is
likely to contain a logo and computes its bounding box.

> Again, I would be glad about comments and reviews. - Let me know what should
> be changed and I'll create a new patch in some days.

Here is a more in-depth review from me:

> >From 83e79abb3311eb2dd92c63e8e15e0476af2f2891 Mon Sep 17 00:00:00 2001
> From: breaker27 <mail at uwe-freese.de>
> Date: Wed, 26 Dec 2018 18:16:48 +0100
> Subject: [PATCH] avfilter/vf_delogo: add uglarm interpolation mode
> 
> ---
>  doc/filters.texi        |  28 +++++++
>  libavfilter/vf_delogo.c | 189 +++++++++++++++++++++++++++++++++++++++++++++---
>  2 files changed, 208 insertions(+), 9 deletions(-)
> 
> diff --git a/doc/filters.texi b/doc/filters.texi
> index 65ce25bc18..792560ad79 100644
> --- a/doc/filters.texi
> +++ b/doc/filters.texi
> @@ -7952,6 +7952,34 @@ Specify the thickness of the fuzzy edge of the rectangle (added to
>  deprecated, setting higher values should no longer be necessary and
>  is not recommended.
>  
> + at item mode
> +Specify the interpolation mode used to remove the logo.
> +It accepts the following values:
> + at table @option
> + at item xy
> +Only the border pixels in straight x and y direction are considered
> +to replace any logo pixel. This mode tends to flicker and to show
> +horizontal and vertical lines.
> + at item uglarm
> +Consider the whole border for every logo pixel to replace. This mode
> +uses the distance raised to the power of the given @var{exponent} as
> +weight that decides to which amount every border pixel is taken into
> +account. This results in a more blurred area, which tends to be less
> +distracting. The uglarm mode was first implemented in VirtualDub's
> +LogoAway filter and is also known from ffdshow and is the
> +abbreviation for "Uwe's Great LogoAway Remove Mode".
> + at end table
> +The default value is @code{xy}.
> +
> + at item exponent
> +Specify the exponent used to calculate the weight out of the
> +distance in @code{uglarm} mode (weight = distance ^ @var{exponent}).
> +The value @code{0} results in a logo area which has
> +the same average color everywhere. The higher the value, the more
> +relevant a near border pixel will get, meaning that the borders of
> +the logo area are more similar to the surrounding pixels. The default
> +value is @code{3}.
> +
>  @item show
>  When set to 1, a green rectangle is drawn on the screen to simplify
>  finding the right @var{x}, @var{y}, @var{w}, and @var{h} parameters.
> diff --git a/libavfilter/vf_delogo.c b/libavfilter/vf_delogo.c
> index 065d093641..dcb0366e63 100644
> --- a/libavfilter/vf_delogo.c
> +++ b/libavfilter/vf_delogo.c
> @@ -2,6 +2,7 @@
>   * Copyright (c) 2002 Jindrich Makovicka <makovick at gmail.com>
>   * Copyright (c) 2011 Stefano Sabatini
>   * Copyright (c) 2013, 2015 Jean Delvare <jdelvare at suse.com>
> + * Copyright (c) 2018 Uwe Freese <mail at uwe-freese.de>
>   *
>   * This file is part of FFmpeg.
>   *
> @@ -25,12 +26,16 @@
>   * A very simple tv station logo remover
>   * Originally imported from MPlayer libmpcodecs/vf_delogo.c,
>   * the algorithm was later improved.
> + * The "UGLARM" mode was first implemented 2001 by Uwe Freese for Virtual
> + * Dub's LogoAway filter (by Krzysztof Wojdon), taken over into ffdshow's
> + * logoaway filter (by Milan Cutka), from where it was ported to ffmpeg.
>   */
>  
>  #include "libavutil/common.h"
>  #include "libavutil/imgutils.h"
>  #include "libavutil/opt.h"
>  #include "libavutil/pixdesc.h"

> +#include "libavutil/avassert.h"

We try to keep the includes in alphabetical order.

>  #include "avfilter.h"
>  #include "formats.h"
>  #include "internal.h"
> @@ -50,6 +55,10 @@
>   * @param logo_w width of the logo
>   * @param logo_h height of the logo
>   * @param band   the size of the band around the processed area
> + * @param *uglarmtable      pointer to weight table in UGLARM interpolation mode,
> + *                          zero when x-y mode is used
> + * @param *uglarmweightsum  pointer to weight sum table in UGLARM interpolation mode,
> + *                          zero when x-y mode is used
>   * @param show   show a rectangle around the processed area, useful for
>   *               parameters tweaking
>   * @param direct if non-zero perform in-place processing
> @@ -58,7 +67,8 @@ static void apply_delogo(uint8_t *dst, int dst_linesize,
>                           uint8_t *src, int src_linesize,
>                           int w, int h, AVRational sar,
>                           int logo_x, int logo_y, int logo_w, int logo_h,
> -                         unsigned int band, int show, int direct)
> +                         unsigned int band, double *uglarmtable,
> +                         double *uglarmweightsum, int show, int direct)
>  {
>      int x, y;
>      uint64_t interp, weightl, weightr, weightt, weightb, weight;
> @@ -89,6 +99,7 @@ static void apply_delogo(uint8_t *dst, int dst_linesize,
>      dst += (logo_y1 + 1) * dst_linesize;
>      src += (logo_y1 + 1) * src_linesize;
>  
> +    if (!uglarmtable) {
>      for (y = logo_y1+1; y < logo_y2; y++) {
>          left_sample = topleft[src_linesize*(y-logo_y1)]   +
>                        topleft[src_linesize*(y-logo_y1-1)] +
> @@ -151,12 +162,125 @@ static void apply_delogo(uint8_t *dst, int dst_linesize,
>          dst += dst_linesize;
>          src += src_linesize;
>      }
> +    } else {
> +        int bx, by;
> +        double interpd;
> +

> +        for (y = logo_y1 + 1; y < logo_y2; y++) {
> +            for (x = logo_x1 + 1,
> +                xdst = dst + logo_x1 + 1,
> +                xsrc = src + logo_x1 + 1; x < logo_x2; x++, xdst++, xsrc++) {

I think a line break after the semicolons would make this easier to
understand.

Also, since x and y always used subtracted to logo_x1 or logo_y1, I
think changing the bounds of the loop would make the rest of the code
more readable:

	for (x = 1; x < logo_w_minus_one; x++)

> +
> +                if (show && (y == logo_y1 + 1 || y == logo_y2 - 1 ||
> +                            x == logo_x1 + 1 || x == logo_x2 - 1)) {

Nit: the alignment is off.

> +                    *xdst = 0;
> +                    continue;
> +                }
> +
> +                interpd = 0;
> +

> +                for (bx = 0; bx < logo_w; bx++) {
> +                    interpd += topleft[bx] *
> +                        uglarmtable[abs(bx - (x - logo_x1)) + (y - logo_y1) * (logo_w - 1)];
> +                    interpd += botleft[bx] *
> +                        uglarmtable[abs(bx - (x - logo_x1)) + (logo_h - (y - logo_y1) - 1) * (logo_w - 1)];
> +                }

This can be optimized, and since it is the inner loop of the filter, I
think it is worth it. You can declare a pointer that will stay the same
for the whole y loop:

	double *xweight = uglarmtable + (y - logo_y1) * (logo_w - 1);

and then use it in that loop:

	interpd += ... * xweight[abs(bx - (x - logo_x1))];

It avoids a lot of multiplications.

> +
> +                for (by = 1; by < logo_h - 1; by++) {
> +                    interpd += topleft[by * src_linesize] *
> +                        uglarmtable[(x - logo_x1) + abs(by - (y - logo_y1)) * (logo_w - 1)];
> +                    interpd += topleft[by * src_linesize + (logo_w - 1)] *
> +                        uglarmtable[logo_w - (x - logo_x1) - 1 + abs(by - (y - logo_y1)) * (logo_w - 1)];
> +                }

This one is more tricky to optimize, because of the abs() within the
multiplication, but it can be done by splitting the loop in two:

	for (by = 1; by < y; by++) {
	    interpd += ... * yweight[x - logo_x1];
	    yweight -= logo_w_minus_one;
	}
	for (; by < logo_h_minus_one; by++) {
	    interpd += ... * yweight[x - logo_x1];
	    yweight += logo_w_minus_one;
	}
	av_assert2(yweight == the_correct_value);

In fact, I think even the x loop would be a little more readable if it
was split in two like that.

> +
> +                interp = (uint64_t)(interpd /
> +                    uglarmweightsum[(x - logo_x1) - 1 + (y - logo_y1 - 1) * (logo_w - 2)]);

The cast to uint64_t seems suspicious. Can you explain?

> +                *xdst = interp;
> +            }
> +
> +            dst += dst_linesize;
> +            src += src_linesize;
> +        }
> +    }
> +}
> +
> +/**
> + * Calculate the lookup tables to be used in UGLARM interpolation mode.
> + *
> + * @param *uglarmtable      Pointer to table containing weights for each possible
> + *                          diagonal distance between a border pixel and an inner
> + *                          logo pixel.
> + * @param *uglarmweightsum  Pointer to a table containing the weight sum to divide
> + *                          by for each pixel within the logo area.
> + * @param sar               The sar to take into account when calculating lookup
> + *                          tables.
> + * @param logo_w            width of the logo
> + * @param logo_h            height of the logo
> + * @param exponent          exponent used in uglarm interpolation
> + */
> +static void calc_uglarm_tables(double *uglarmtable, double *uglarmweightsum,
> +                               AVRational sar, int logo_w, int logo_h,
> +                               float exponent)
> +{

> +    double aspect = (double)sar.num / sar.den;

Tiny optimization:

	double aspect2 = aspect * aspect;

for use later.

> +    int x, y;


> +
> +    /* uglarmtable will contain a weight for each possible diagonal distance
> +     * between a border pixel and an inner logo pixel. The maximum distance in
> +     * each direction between border and an inner pixel can be logo_w - 1. The
> +     * weight of a border pixel which is x,y pixels away is stored at position
> +     * x + y * (logo_w - 1). */
> +    for (y = 0; y < logo_h - 1; y++)
> +        for (x = 0; x < logo_w - 1; x++) {
> +            if (x + y != 0) {

> +                double d = pow(sqrt(x * x * aspect * aspect + y * y), exponent);

	pow(sqrt(x * x * aspect2 + y * y), exponent / 2);

> +                uglarmtable[x + y * (logo_w - 1)] = 1.0 / d;
> +            } else {
> +                uglarmtable[x + y * (logo_w - 1)] = 1.0;
> +            }
> +        }
> +
> +    /* uglarmweightsum will contain the sum of all weights which is used when
> +     * an inner pixel of the logo at position x,y is calculated out of the
> +     * border pixels. The aggregated value has to be divided by that. The value
> +     * to use for the inner 1-based logo position x,y is stored at
> +     * (x - 1) + (y - 1) * (logo_w - 2). */
> +    for (y = 1; y < logo_h - 1; y++)
> +        for (x = 1; x < logo_w - 1; x++) {
> +            double weightsum = 0;
> +

> +            for (int bx = 0; bx < logo_w; bx++) {

Our coding style forbids declaring variables on the fly like that. Same
below.

> +                /* top border */
> +                weightsum += uglarmtable[abs(bx - x) + y * (logo_w - 1)];
> +                /* bottom border */
> +                weightsum += uglarmtable[abs(bx - x) + (logo_h - y - 1) * (logo_w - 1)];
> +            }
> +
> +            for (int by = 1; by < logo_h - 1; by++) {
> +                /* left border */
> +                weightsum += uglarmtable[x + abs(by - y) * (logo_w - 1)];
> +                /* right border */
> +                weightsum += uglarmtable[(logo_w - x - 1) + abs(by - y) * (logo_w - 1)];
> +            }
> +
> +            uglarmweightsum[(x - 1) + (y - 1) * (logo_w - 2)] = weightsum;
> +        }
>  }
>  
> +enum mode {
> +    MODE_XY,
> +    MODE_UGLARM,
> +    MODE_NB
> +};
> +

> +#define MAX_PLANES 10

You could use FF_ARRAY_ELEMS(AVPixFmtDescriptor.comp), and the
consistency would be guaranteed. Well, that syntax is not valid, but it
can be expressed:

#define MAX_PLANES FF_ARRAY_ELEMS(((AVPixFmtDescriptor *)0)->comp)

But I have a better suggestion:

#define MAX_SUB 2

	double uglarmtable[MAX_SUB + 1][MAX_SUB + 1]

and use uglarmtable[hsub][vsub]. That way, for YUVA420P for example, the
table will be computed only once for Y and A and once for U and V.

The assert is still needed with that solution, though.

> +
>  typedef struct DelogoContext {
>      const AVClass *class;
> -    int x, y, w, h, band, show;
> -}  DelogoContext;
> +    int x, y, w, h, band, mode, show;
> +    float exponent;
> +    double *uglarmtable[MAX_PLANES], *uglarmweightsum[MAX_PLANES];
> +} DelogoContext;
>  
>  #define OFFSET(x) offsetof(DelogoContext, x)
>  #define FLAGS AV_OPT_FLAG_FILTERING_PARAM|AV_OPT_FLAG_VIDEO_PARAM
> @@ -171,6 +295,10 @@ static const AVOption delogo_options[]= {
>      { "band", "set delogo area band size", OFFSET(band), AV_OPT_TYPE_INT, { .i64 =  0 },  0, INT_MAX, FLAGS },
>      { "t",    "set delogo area band size", OFFSET(band), AV_OPT_TYPE_INT, { .i64 =  0 },  0, INT_MAX, FLAGS },
>  #endif
> +    { "mode", "set the interpolation mode",OFFSET(mode), AV_OPT_TYPE_INT, { .i64 = MODE_XY }, MODE_XY, MODE_NB-1, FLAGS, "mode" },
> +        { "xy",     "use pixels in straight x and y direction",  OFFSET(mode), AV_OPT_TYPE_CONST, { .i64 = MODE_XY },     0, 0, FLAGS, "mode" },
> +        { "uglarm", "UGLARM mode, use full border",              OFFSET(mode), AV_OPT_TYPE_CONST, { .i64 = MODE_UGLARM }, 0, 0, FLAGS, "mode" },
> +    { "exponent","exponent used for UGLARM interpolation", OFFSET(exponent), AV_OPT_TYPE_FLOAT, { .dbl = 3.0 }, 0, 6, FLAGS },
>      { "show", "show delogo area",          OFFSET(show), AV_OPT_TYPE_BOOL,{ .i64 =  0 },  0, 1,       FLAGS },
>      { NULL }
>  };
> @@ -215,8 +343,8 @@ static av_cold int init(AVFilterContext *ctx)
>  #else
>      s->band = 1;
>  #endif
> -    av_log(ctx, AV_LOG_VERBOSE, "x:%d y:%d, w:%d h:%d band:%d show:%d\n",
> -           s->x, s->y, s->w, s->h, s->band, s->show);
> +    av_log(ctx, AV_LOG_VERBOSE, "x:%d y:%d, w:%d h:%d band:%d mode:%d exponent:%f show:%d\n",
> +           s->x, s->y, s->w, s->h, s->band, s->mode, s->exponent, s->show);
>  
>      s->w += s->band*2;
>      s->h += s->band*2;
> @@ -226,6 +354,18 @@ static av_cold int init(AVFilterContext *ctx)
>      return 0;
>  }
>  
> +static av_cold void uninit(AVFilterContext *ctx)
> +{
> +    DelogoContext *s = ctx->priv;
> +

> +    if (s->mode == MODE_UGLARM) {

No need to test, we can free anyway.

> +        for (int plane = 0; plane < MAX_PLANES; plane++) {

Forbidden variable declaration.

> +            av_freep(&s->uglarmtable[plane]);
> +            av_freep(&s->uglarmweightsum[plane]);
> +        }
> +    }
> +}
> +
>  static int config_input(AVFilterLink *inlink)
>  {
>      DelogoContext *s = inlink->dst->priv;
> @@ -270,20 +410,50 @@ static int filter_frame(AVFilterLink *inlink, AVFrame *in)
>      if (!sar.num)
>          sar.num = sar.den = 1;
>  
> +    if (s->mode == MODE_UGLARM)
> +        av_assert0(desc->nb_components <= MAX_PLANES);
> +
>      for (plane = 0; plane < desc->nb_components; plane++) {
>          int hsub = plane == 1 || plane == 2 ? hsub0 : 0;
>          int vsub = plane == 1 || plane == 2 ? vsub0 : 0;
>  

> +        /* Up and left borders were rounded down, inject lost bits
> +        * into width and height to avoid error accumulation */

Not: the alignment is off.

> +        int logo_w = AV_CEIL_RSHIFT(s->w + (s->x & ((1<<hsub)-1)), hsub);
> +        int logo_h = AV_CEIL_RSHIFT(s->h + (s->y & ((1<<vsub)-1)), vsub);
> +
> +        /* Init lookup tables once */
> +        if (s->mode == MODE_UGLARM) {
> +            if (!s->uglarmtable[plane]) {
> +                s->uglarmtable[plane] =

> +                    (double*)av_malloc((logo_w - 1) * (logo_h - 1) * sizeof(double));

No casting of malloc returns in C, this is a c++ thing. Also, use
av_malloc_array().

> +
> +                if (!s->uglarmtable[plane]) {
> +                    return AVERROR(ENOMEM);
> +                }

The braces could be dropped.

> +
> +                s->uglarmweightsum[plane] =
> +                    (double*)av_malloc((logo_w - 2) * (logo_h - 2) * sizeof(double));
> +
> +                if (!s->uglarmweightsum[plane]) {
> +                    return AVERROR(ENOMEM);
> +                }

Same as above, of course.

> +
> +                calc_uglarm_tables(s->uglarmtable[plane],
> +                                s->uglarmweightsum[plane],
> +                                sar, logo_w, logo_h, s->exponent);
> +            }
> +        }
> +

I think this whole block could and should be moved into config_input().

>          apply_delogo(out->data[plane], out->linesize[plane],
>                       in ->data[plane], in ->linesize[plane],
>                       AV_CEIL_RSHIFT(inlink->w, hsub),
>                       AV_CEIL_RSHIFT(inlink->h, vsub),
>                       sar, s->x>>hsub, s->y>>vsub,
> -                     /* Up and left borders were rounded down, inject lost bits
> -                      * into width and height to avoid error accumulation */
> -                     AV_CEIL_RSHIFT(s->w + (s->x & ((1<<hsub)-1)), hsub),
> -                     AV_CEIL_RSHIFT(s->h + (s->y & ((1<<vsub)-1)), vsub),
> +                     logo_w, logo_h,
>                       s->band>>FFMIN(hsub, vsub),
> +                     s->uglarmtable[plane],
> +                     s->uglarmweightsum[plane],
>                       s->show, direct);
>      }
>  
> @@ -317,6 +487,7 @@ AVFilter ff_vf_delogo = {
>      .priv_size     = sizeof(DelogoContext),
>      .priv_class    = &delogo_class,
>      .init          = init,
> +    .uninit        = uninit,
>      .query_formats = query_formats,
>      .inputs        = avfilter_vf_delogo_inputs,
>      .outputs       = avfilter_vf_delogo_outputs,

Regards,

-- 
  Nicolas George
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: not available
URL: <http://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20181230/aa2afa1a/attachment.sig>


More information about the ffmpeg-devel mailing list