[FFmpeg-devel] [PATCH] delogo filter: new "uglarm" interpolation mode added
Paul B Mahol
onemda at gmail.com
Sun Dec 30 15:53:03 EET 2018
On 12/30/18, Nicolas George <george at nsup.org> wrote:
> Uwe Freese (2018-12-29):
>> The downside is that the term "power" (or "strength") would be more
>> generic
>> and would probably be better reusable for other modes.
>>
>> What do you think? Is "exponent" now ok, or should I change it back to
>> "power", "strength" or something alike?
>
> I would say: do not bother. This can be decided when a new mode is
> added, and options can have several names.
>
> By the way, you would not be interested in developing a logo-detection
> filter, per chance? Something that detects is a certain rectangle is
> likely to contain a logo and computes its bounding box.
>
>> Again, I would be glad about comments and reviews. - Let me know what
>> should
>> be changed and I'll create a new patch in some days.
>
> Here is a more in-depth review from me:
>
>> >From 83e79abb3311eb2dd92c63e8e15e0476af2f2891 Mon Sep 17 00:00:00 2001
>> From: breaker27 <mail at uwe-freese.de>
>> Date: Wed, 26 Dec 2018 18:16:48 +0100
>> Subject: [PATCH] avfilter/vf_delogo: add uglarm interpolation mode
>>
>> ---
>> doc/filters.texi | 28 +++++++
>> libavfilter/vf_delogo.c | 189
>> +++++++++++++++++++++++++++++++++++++++++++++---
>> 2 files changed, 208 insertions(+), 9 deletions(-)
>>
>> diff --git a/doc/filters.texi b/doc/filters.texi
>> index 65ce25bc18..792560ad79 100644
>> --- a/doc/filters.texi
>> +++ b/doc/filters.texi
>> @@ -7952,6 +7952,34 @@ Specify the thickness of the fuzzy edge of the
>> rectangle (added to
>> deprecated, setting higher values should no longer be necessary and
>> is not recommended.
>>
>> + at item mode
>> +Specify the interpolation mode used to remove the logo.
>> +It accepts the following values:
>> + at table @option
>> + at item xy
>> +Only the border pixels in straight x and y direction are considered
>> +to replace any logo pixel. This mode tends to flicker and to show
>> +horizontal and vertical lines.
>> + at item uglarm
>> +Consider the whole border for every logo pixel to replace. This mode
>> +uses the distance raised to the power of the given @var{exponent} as
>> +weight that decides to which amount every border pixel is taken into
>> +account. This results in a more blurred area, which tends to be less
>> +distracting. The uglarm mode was first implemented in VirtualDub's
>> +LogoAway filter and is also known from ffdshow and is the
>> +abbreviation for "Uwe's Great LogoAway Remove Mode".
>> + at end table
>> +The default value is @code{xy}.
>> +
>> + at item exponent
>> +Specify the exponent used to calculate the weight out of the
>> +distance in @code{uglarm} mode (weight = distance ^ @var{exponent}).
>> +The value @code{0} results in a logo area which has
>> +the same average color everywhere. The higher the value, the more
>> +relevant a near border pixel will get, meaning that the borders of
>> +the logo area are more similar to the surrounding pixels. The default
>> +value is @code{3}.
>> +
>> @item show
>> When set to 1, a green rectangle is drawn on the screen to simplify
>> finding the right @var{x}, @var{y}, @var{w}, and @var{h} parameters.
>> diff --git a/libavfilter/vf_delogo.c b/libavfilter/vf_delogo.c
>> index 065d093641..dcb0366e63 100644
>> --- a/libavfilter/vf_delogo.c
>> +++ b/libavfilter/vf_delogo.c
>> @@ -2,6 +2,7 @@
>> * Copyright (c) 2002 Jindrich Makovicka <makovick at gmail.com>
>> * Copyright (c) 2011 Stefano Sabatini
>> * Copyright (c) 2013, 2015 Jean Delvare <jdelvare at suse.com>
>> + * Copyright (c) 2018 Uwe Freese <mail at uwe-freese.de>
>> *
>> * This file is part of FFmpeg.
>> *
>> @@ -25,12 +26,16 @@
>> * A very simple tv station logo remover
>> * Originally imported from MPlayer libmpcodecs/vf_delogo.c,
>> * the algorithm was later improved.
>> + * The "UGLARM" mode was first implemented 2001 by Uwe Freese for
>> Virtual
>> + * Dub's LogoAway filter (by Krzysztof Wojdon), taken over into
>> ffdshow's
>> + * logoaway filter (by Milan Cutka), from where it was ported to ffmpeg.
>> */
>>
>> #include "libavutil/common.h"
>> #include "libavutil/imgutils.h"
>> #include "libavutil/opt.h"
>> #include "libavutil/pixdesc.h"
>
>> +#include "libavutil/avassert.h"
>
> We try to keep the includes in alphabetical order.
>
>> #include "avfilter.h"
>> #include "formats.h"
>> #include "internal.h"
>> @@ -50,6 +55,10 @@
>> * @param logo_w width of the logo
>> * @param logo_h height of the logo
>> * @param band the size of the band around the processed area
>> + * @param *uglarmtable pointer to weight table in UGLARM
>> interpolation mode,
>> + * zero when x-y mode is used
>> + * @param *uglarmweightsum pointer to weight sum table in UGLARM
>> interpolation mode,
>> + * zero when x-y mode is used
>> * @param show show a rectangle around the processed area, useful for
>> * parameters tweaking
>> * @param direct if non-zero perform in-place processing
>> @@ -58,7 +67,8 @@ static void apply_delogo(uint8_t *dst, int
>> dst_linesize,
>> uint8_t *src, int src_linesize,
>> int w, int h, AVRational sar,
>> int logo_x, int logo_y, int logo_w, int logo_h,
>> - unsigned int band, int show, int direct)
>> + unsigned int band, double *uglarmtable,
>> + double *uglarmweightsum, int show, int direct)
>> {
>> int x, y;
>> uint64_t interp, weightl, weightr, weightt, weightb, weight;
>> @@ -89,6 +99,7 @@ static void apply_delogo(uint8_t *dst, int
>> dst_linesize,
>> dst += (logo_y1 + 1) * dst_linesize;
>> src += (logo_y1 + 1) * src_linesize;
>>
>> + if (!uglarmtable) {
>> for (y = logo_y1+1; y < logo_y2; y++) {
>> left_sample = topleft[src_linesize*(y-logo_y1)] +
>> topleft[src_linesize*(y-logo_y1-1)] +
>> @@ -151,12 +162,125 @@ static void apply_delogo(uint8_t *dst, int
>> dst_linesize,
>> dst += dst_linesize;
>> src += src_linesize;
>> }
>> + } else {
>> + int bx, by;
>> + double interpd;
>> +
>
>> + for (y = logo_y1 + 1; y < logo_y2; y++) {
>> + for (x = logo_x1 + 1,
>> + xdst = dst + logo_x1 + 1,
>> + xsrc = src + logo_x1 + 1; x < logo_x2; x++, xdst++,
>> xsrc++) {
>
> I think a line break after the semicolons would make this easier to
> understand.
>
> Also, since x and y always used subtracted to logo_x1 or logo_y1, I
> think changing the bounds of the loop would make the rest of the code
> more readable:
>
> for (x = 1; x < logo_w_minus_one; x++)
>
>> +
>> + if (show && (y == logo_y1 + 1 || y == logo_y2 - 1 ||
>> + x == logo_x1 + 1 || x == logo_x2 - 1)) {
>
> Nit: the alignment is off.
>
>> + *xdst = 0;
>> + continue;
>> + }
>> +
>> + interpd = 0;
>> +
>
>> + for (bx = 0; bx < logo_w; bx++) {
>> + interpd += topleft[bx] *
>> + uglarmtable[abs(bx - (x - logo_x1)) + (y -
>> logo_y1) * (logo_w - 1)];
>> + interpd += botleft[bx] *
>> + uglarmtable[abs(bx - (x - logo_x1)) + (logo_h -
>> (y - logo_y1) - 1) * (logo_w - 1)];
>> + }
>
> This can be optimized, and since it is the inner loop of the filter, I
> think it is worth it. You can declare a pointer that will stay the same
> for the whole y loop:
>
> double *xweight = uglarmtable + (y - logo_y1) * (logo_w - 1);
>
> and then use it in that loop:
>
> interpd += ... * xweight[abs(bx - (x - logo_x1))];
>
> It avoids a lot of multiplications.
>
>> +
>> + for (by = 1; by < logo_h - 1; by++) {
>> + interpd += topleft[by * src_linesize] *
>> + uglarmtable[(x - logo_x1) + abs(by - (y -
>> logo_y1)) * (logo_w - 1)];
>> + interpd += topleft[by * src_linesize + (logo_w - 1)]
>> *
>> + uglarmtable[logo_w - (x - logo_x1) - 1 + abs(by -
>> (y - logo_y1)) * (logo_w - 1)];
>> + }
>
> This one is more tricky to optimize, because of the abs() within the
> multiplication, but it can be done by splitting the loop in two:
>
> for (by = 1; by < y; by++) {
> interpd += ... * yweight[x - logo_x1];
> yweight -= logo_w_minus_one;
> }
> for (; by < logo_h_minus_one; by++) {
> interpd += ... * yweight[x - logo_x1];
> yweight += logo_w_minus_one;
> }
> av_assert2(yweight == the_correct_value);
>
> In fact, I think even the x loop would be a little more readable if it
> was split in two like that.
>
>> +
>> + interp = (uint64_t)(interpd /
>> + uglarmweightsum[(x - logo_x1) - 1 + (y - logo_y1 - 1)
>> * (logo_w - 2)]);
>
> The cast to uint64_t seems suspicious. Can you explain?
>
>> + *xdst = interp;
>> + }
>> +
>> + dst += dst_linesize;
>> + src += src_linesize;
>> + }
>> + }
>> +}
>> +
>> +/**
>> + * Calculate the lookup tables to be used in UGLARM interpolation mode.
>> + *
>> + * @param *uglarmtable Pointer to table containing weights for each
>> possible
>> + * diagonal distance between a border pixel and
>> an inner
>> + * logo pixel.
>> + * @param *uglarmweightsum Pointer to a table containing the weight sum
>> to divide
>> + * by for each pixel within the logo area.
>> + * @param sar The sar to take into account when calculating
>> lookup
>> + * tables.
>> + * @param logo_w width of the logo
>> + * @param logo_h height of the logo
>> + * @param exponent exponent used in uglarm interpolation
>> + */
>> +static void calc_uglarm_tables(double *uglarmtable, double
>> *uglarmweightsum,
>> + AVRational sar, int logo_w, int logo_h,
>> + float exponent)
>> +{
>
>> + double aspect = (double)sar.num / sar.den;
>
> Tiny optimization:
>
> double aspect2 = aspect * aspect;
>
> for use later.
>
>> + int x, y;
>
>
>> +
>> + /* uglarmtable will contain a weight for each possible diagonal
>> distance
>> + * between a border pixel and an inner logo pixel. The maximum
>> distance in
>> + * each direction between border and an inner pixel can be logo_w -
>> 1. The
>> + * weight of a border pixel which is x,y pixels away is stored at
>> position
>> + * x + y * (logo_w - 1). */
>> + for (y = 0; y < logo_h - 1; y++)
>> + for (x = 0; x < logo_w - 1; x++) {
>> + if (x + y != 0) {
>
>> + double d = pow(sqrt(x * x * aspect * aspect + y * y),
>> exponent);
>
> pow(sqrt(x * x * aspect2 + y * y), exponent / 2);
>
>> + uglarmtable[x + y * (logo_w - 1)] = 1.0 / d;
>> + } else {
>> + uglarmtable[x + y * (logo_w - 1)] = 1.0;
>> + }
>> + }
>> +
>> + /* uglarmweightsum will contain the sum of all weights which is used
>> when
>> + * an inner pixel of the logo at position x,y is calculated out of
>> the
>> + * border pixels. The aggregated value has to be divided by that. The
>> value
>> + * to use for the inner 1-based logo position x,y is stored at
>> + * (x - 1) + (y - 1) * (logo_w - 2). */
>> + for (y = 1; y < logo_h - 1; y++)
>> + for (x = 1; x < logo_w - 1; x++) {
>> + double weightsum = 0;
>> +
>
>> + for (int bx = 0; bx < logo_w; bx++) {
>
> Our coding style forbids declaring variables on the fly like that. Same
> below.
This is simply not more true.
Look at developer guidelines for this project.
>
>> + /* top border */
>> + weightsum += uglarmtable[abs(bx - x) + y * (logo_w -
>> 1)];
>> + /* bottom border */
>> + weightsum += uglarmtable[abs(bx - x) + (logo_h - y - 1) *
>> (logo_w - 1)];
>> + }
>> +
>> + for (int by = 1; by < logo_h - 1; by++) {
>> + /* left border */
>> + weightsum += uglarmtable[x + abs(by - y) * (logo_w -
>> 1)];
>> + /* right border */
>> + weightsum += uglarmtable[(logo_w - x - 1) + abs(by - y) *
>> (logo_w - 1)];
>> + }
>> +
>> + uglarmweightsum[(x - 1) + (y - 1) * (logo_w - 2)] =
>> weightsum;
>> + }
>> }
>>
>> +enum mode {
>> + MODE_XY,
>> + MODE_UGLARM,
>> + MODE_NB
>> +};
>> +
>
>> +#define MAX_PLANES 10
>
> You could use FF_ARRAY_ELEMS(AVPixFmtDescriptor.comp), and the
> consistency would be guaranteed. Well, that syntax is not valid, but it
> can be expressed:
>
> #define MAX_PLANES FF_ARRAY_ELEMS(((AVPixFmtDescriptor *)0)->comp)
>
> But I have a better suggestion:
>
> #define MAX_SUB 2
>
> double uglarmtable[MAX_SUB + 1][MAX_SUB + 1]
>
> and use uglarmtable[hsub][vsub]. That way, for YUVA420P for example, the
> table will be computed only once for Y and A and once for U and V.
>
> The assert is still needed with that solution, though.
>
>> +
>> typedef struct DelogoContext {
>> const AVClass *class;
>> - int x, y, w, h, band, show;
>> -} DelogoContext;
>> + int x, y, w, h, band, mode, show;
>> + float exponent;
>> + double *uglarmtable[MAX_PLANES], *uglarmweightsum[MAX_PLANES];
>> +} DelogoContext;
>>
>> #define OFFSET(x) offsetof(DelogoContext, x)
>> #define FLAGS AV_OPT_FLAG_FILTERING_PARAM|AV_OPT_FLAG_VIDEO_PARAM
>> @@ -171,6 +295,10 @@ static const AVOption delogo_options[]= {
>> { "band", "set delogo area band size", OFFSET(band), AV_OPT_TYPE_INT,
>> { .i64 = 0 }, 0, INT_MAX, FLAGS },
>> { "t", "set delogo area band size", OFFSET(band), AV_OPT_TYPE_INT,
>> { .i64 = 0 }, 0, INT_MAX, FLAGS },
>> #endif
>> + { "mode", "set the interpolation mode",OFFSET(mode), AV_OPT_TYPE_INT,
>> { .i64 = MODE_XY }, MODE_XY, MODE_NB-1, FLAGS, "mode" },
>> + { "xy", "use pixels in straight x and y direction",
>> OFFSET(mode), AV_OPT_TYPE_CONST, { .i64 = MODE_XY }, 0, 0, FLAGS,
>> "mode" },
>> + { "uglarm", "UGLARM mode, use full border",
>> OFFSET(mode), AV_OPT_TYPE_CONST, { .i64 = MODE_UGLARM }, 0, 0, FLAGS,
>> "mode" },
>> + { "exponent","exponent used for UGLARM interpolation",
>> OFFSET(exponent), AV_OPT_TYPE_FLOAT, { .dbl = 3.0 }, 0, 6, FLAGS },
>> { "show", "show delogo area", OFFSET(show),
>> AV_OPT_TYPE_BOOL,{ .i64 = 0 }, 0, 1, FLAGS },
>> { NULL }
>> };
>> @@ -215,8 +343,8 @@ static av_cold int init(AVFilterContext *ctx)
>> #else
>> s->band = 1;
>> #endif
>> - av_log(ctx, AV_LOG_VERBOSE, "x:%d y:%d, w:%d h:%d band:%d
>> show:%d\n",
>> - s->x, s->y, s->w, s->h, s->band, s->show);
>> + av_log(ctx, AV_LOG_VERBOSE, "x:%d y:%d, w:%d h:%d band:%d mode:%d
>> exponent:%f show:%d\n",
>> + s->x, s->y, s->w, s->h, s->band, s->mode, s->exponent,
>> s->show);
>>
>> s->w += s->band*2;
>> s->h += s->band*2;
>> @@ -226,6 +354,18 @@ static av_cold int init(AVFilterContext *ctx)
>> return 0;
>> }
>>
>> +static av_cold void uninit(AVFilterContext *ctx)
>> +{
>> + DelogoContext *s = ctx->priv;
>> +
>
>> + if (s->mode == MODE_UGLARM) {
>
> No need to test, we can free anyway.
>
>> + for (int plane = 0; plane < MAX_PLANES; plane++) {
>
> Forbidden variable declaration.
>
>> + av_freep(&s->uglarmtable[plane]);
>> + av_freep(&s->uglarmweightsum[plane]);
>> + }
>> + }
>> +}
>> +
>> static int config_input(AVFilterLink *inlink)
>> {
>> DelogoContext *s = inlink->dst->priv;
>> @@ -270,20 +410,50 @@ static int filter_frame(AVFilterLink *inlink,
>> AVFrame *in)
>> if (!sar.num)
>> sar.num = sar.den = 1;
>>
>> + if (s->mode == MODE_UGLARM)
>> + av_assert0(desc->nb_components <= MAX_PLANES);
>> +
>> for (plane = 0; plane < desc->nb_components; plane++) {
>> int hsub = plane == 1 || plane == 2 ? hsub0 : 0;
>> int vsub = plane == 1 || plane == 2 ? vsub0 : 0;
>>
>
>> + /* Up and left borders were rounded down, inject lost bits
>> + * into width and height to avoid error accumulation */
>
> Not: the alignment is off.
>
>> + int logo_w = AV_CEIL_RSHIFT(s->w + (s->x & ((1<<hsub)-1)),
>> hsub);
>> + int logo_h = AV_CEIL_RSHIFT(s->h + (s->y & ((1<<vsub)-1)),
>> vsub);
>> +
>> + /* Init lookup tables once */
>> + if (s->mode == MODE_UGLARM) {
>> + if (!s->uglarmtable[plane]) {
>> + s->uglarmtable[plane] =
>
>> + (double*)av_malloc((logo_w - 1) * (logo_h - 1) *
>> sizeof(double));
>
> No casting of malloc returns in C, this is a c++ thing. Also, use
> av_malloc_array().
>
>> +
>> + if (!s->uglarmtable[plane]) {
>> + return AVERROR(ENOMEM);
>> + }
>
> The braces could be dropped.
>
>> +
>> + s->uglarmweightsum[plane] =
>> + (double*)av_malloc((logo_w - 2) * (logo_h - 2) *
>> sizeof(double));
>> +
>> + if (!s->uglarmweightsum[plane]) {
>> + return AVERROR(ENOMEM);
>> + }
>
> Same as above, of course.
>
>> +
>> + calc_uglarm_tables(s->uglarmtable[plane],
>> + s->uglarmweightsum[plane],
>> + sar, logo_w, logo_h, s->exponent);
>> + }
>> + }
>> +
>
> I think this whole block could and should be moved into config_input().
>
>> apply_delogo(out->data[plane], out->linesize[plane],
>> in ->data[plane], in ->linesize[plane],
>> AV_CEIL_RSHIFT(inlink->w, hsub),
>> AV_CEIL_RSHIFT(inlink->h, vsub),
>> sar, s->x>>hsub, s->y>>vsub,
>> - /* Up and left borders were rounded down, inject
>> lost bits
>> - * into width and height to avoid error accumulation
>> */
>> - AV_CEIL_RSHIFT(s->w + (s->x & ((1<<hsub)-1)),
>> hsub),
>> - AV_CEIL_RSHIFT(s->h + (s->y & ((1<<vsub)-1)),
>> vsub),
>> + logo_w, logo_h,
>> s->band>>FFMIN(hsub, vsub),
>> + s->uglarmtable[plane],
>> + s->uglarmweightsum[plane],
>> s->show, direct);
>> }
>>
>> @@ -317,6 +487,7 @@ AVFilter ff_vf_delogo = {
>> .priv_size = sizeof(DelogoContext),
>> .priv_class = &delogo_class,
>> .init = init,
>> + .uninit = uninit,
>> .query_formats = query_formats,
>> .inputs = avfilter_vf_delogo_inputs,
>> .outputs = avfilter_vf_delogo_outputs,
>
> Regards,
>
> --
> Nicolas George
>
More information about the ffmpeg-devel
mailing list