[FFmpeg-devel] [PATCH] libavfilter: created a new filter that obtains the average peak signal-to-noise ratio (PSNR) of two input video files in YUV format.

Stefano Sabatini stefano.sabatini-lala at poste.it
Tue Jun 21 16:01:36 CEST 2011


On date Tuesday 2011-06-21 14:44:50 +0200, Roger Pau Monné encoded:
> Hello,
> 
> Sorry for the delay, I'm really busy these weeks. I would like to
> implement the option to store the results in a qpsnr style, but right
> now I don't have time; let's see if I can do it in a week or two. I've
> applied the suggested changes, thanks Stefano.
> 
> Regards, Roger.
> 
> 2011/6/10 Mark Himsley <mark at mdsh.com>:
> > On 10/06/11 00:18, Stefano Sabatini wrote:
> >>
> >> On date Tuesday 2011-06-07 14:03:36 +0200, Roger Pau Monné encoded:
> >
> >>> +    switch (inlink->format) {
> >>> +    case PIX_FMT_YUV410P:
> >>> +    case PIX_FMT_YUV411P:
> >>> +    case PIX_FMT_YUV420P:
> >>> +    case PIX_FMT_YUV422P:
> >>> +    case PIX_FMT_YUV440P:
> >>> +    case PIX_FMT_YUV444P:
> >>> +    case PIX_FMT_YUVA420P:
> >>
> >>> +        psnr->max[0] = 235;
> >>
> >> psnr->max[3] = 255;
> >>
> >> at least I suppose this is the max alpha value in
> >> psnr->YUVA420P, yes I forgot this in the lut filter
> >
> > It is not. Alpha is always 0 - 255 (headroom is not required in an alpha
> > signal, you can't have more than completely keyed on, or less than not keyed
> > on at all)
> >
> >>> +        psnr->max[1] = psnr->max[2] = 240;
> >>
> >>
> >>> +        break;
> >>> +    default:
> >>> +        psnr->max[0] = psnr->max[1] = psnr->max[2] = psnr->max[3] = 255;
> >>> +    }
> >>> +
> >
> > _______________________________________________
> > ffmpeg-devel mailing list
> > ffmpeg-devel at ffmpeg.org
> > http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> >

> From e6725018076620de4abfc0b0179eafdb32b01249 Mon Sep 17 00:00:00 2001
> From: Roger Pau Monné <roger.pau at entel.upc.edu>
> Date: Tue, 21 Jun 2011 14:40:50 +0200
> Subject: [PATCH] libavfilter: created a new filter that obtains the average peak signal-to-noise ratio (PSNR) of two input video files.
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
> 
> 
> Signed-off-by: Roger Pau Monné <roger.pau at entel.upc.edu>
> ---
>  doc/filters.texi         |   34 +++++
>  libavfilter/Makefile     |    1 +
>  libavfilter/allfilters.c |    1 +
>  libavfilter/vf_psnr.c    |  339 ++++++++++++++++++++++++++++++++++++++++++++++
>  4 files changed, 375 insertions(+), 0 deletions(-)
>  create mode 100644 libavfilter/vf_psnr.c
> 
> diff --git a/doc/filters.texi b/doc/filters.texi
> index 719d94f..7aefe9b 100644
> --- a/doc/filters.texi
> +++ b/doc/filters.texi
> @@ -1088,6 +1088,40 @@ format=monow, pixdesctest
>  
>  can be used to test the monowhite pixel format descriptor definition.
>  
> + at section psnr
> +
> +Obtain the average, maximum and minimum PSNR between two input videos.

The explanation of the PSNR acronym may be enclosed in parentheses.

> +Both video files must have the same resolution and pixel format for
> +this filter to work correctly. The obtained average PSNR is printed
> +through the logging system.
> +
> +The filter stores the accumulated MSE (mean squared error) of each
> +frame; at the end of processing, the MSE is averaged across all frames
> +and the following formula is applied to obtain the PSNR:
> +
> + at example
> +PSNR = 10*log10(MAX^2/MSE)
> + at end example
> +
> +Where MAX is the average of the maximum values of each component of the
> +image.
> +

> +This filter accepts the following parameters:

This can be removed

> +
> +This filter accepts as input the filename used to save the PSNR of
> +each individual frame. If not specified, the filter will not print
> +the PSNR of each individual frame.
> +
> +For example:
> + at example
> +movie=ref_movie.mpg, setpts=PTS-STARTPTS [ref]; [in] setpts=PTS-STARTPTS,
> +[ref] psnr=stats.log [out]
> + at end example
> +

Note, I wonder if we should rather make this an optional option of the
kind:

psnr=frames_stats=stats.log

so that it's easier to extend the syntax later, e.g.
psnr="frames_stats=stats.log : format=qpsnr"

> +In this example the input file being processed is compared with the
> +reference file ref_movie.mpg. The PSNR of each individual frame is stored

@file{ref_movie.mpg}

> +in stats.log. The setpts filters are used to synchronize both streams.

@file again

> +
>  @section scale
>  
>  Scale the input video to @var{width}:@var{height} and/or convert the image format.
> diff --git a/libavfilter/Makefile b/libavfilter/Makefile
> index 2324fb9..2c66f7e 100644
> --- a/libavfilter/Makefile
> +++ b/libavfilter/Makefile
> @@ -45,6 +45,7 @@ OBJS-$(CONFIG_OCV_FILTER)                    += vf_libopencv.o
>  OBJS-$(CONFIG_OVERLAY_FILTER)                += vf_overlay.o
>  OBJS-$(CONFIG_PAD_FILTER)                    += vf_pad.o
>  OBJS-$(CONFIG_PIXDESCTEST_FILTER)            += vf_pixdesctest.o
> +OBJS-$(CONFIG_PSNR_FILTER)                   += vf_psnr.o
>  OBJS-$(CONFIG_SCALE_FILTER)                  += vf_scale.o
>  OBJS-$(CONFIG_SELECT_FILTER)                 += vf_select.o
>  OBJS-$(CONFIG_SETDAR_FILTER)                 += vf_aspect.o
> diff --git a/libavfilter/allfilters.c b/libavfilter/allfilters.c
> index 5f1065f..69a52e1 100644
> --- a/libavfilter/allfilters.c
> +++ b/libavfilter/allfilters.c
> @@ -61,6 +61,7 @@ void avfilter_register_all(void)
>      REGISTER_FILTER (OVERLAY,     overlay,     vf);
>      REGISTER_FILTER (PAD,         pad,         vf);
>      REGISTER_FILTER (PIXDESCTEST, pixdesctest, vf);
> +    REGISTER_FILTER (PSNR,        psnr,        vf);
>      REGISTER_FILTER (SCALE,       scale,       vf);
>      REGISTER_FILTER (SELECT,      select,      vf);
>      REGISTER_FILTER (SETDAR,      setdar,      vf);
> diff --git a/libavfilter/vf_psnr.c b/libavfilter/vf_psnr.c
> new file mode 100644
> index 0000000..82aa54d
> --- /dev/null
> +++ b/libavfilter/vf_psnr.c
> @@ -0,0 +1,339 @@
> +/*
> + * Copyright (c) 2011 Roger Pau Monné <roger.pau at entel.upc.edu>
> + *
> + * This file is part of FFmpeg.
> + *
> + * FFmpeg is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2.1 of the License, or (at your option) any later version.
> + *
> + * FFmpeg is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with FFmpeg; if not, write to the Free Software
> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
> + */
> +
> +/**
> + * @file
> + * Calculate the PSNR between two input videos.
> + * Based on the overlay filter.
> + */
> +
> +#include "libavutil/pixdesc.h"
> +#include "avfilter.h"
> +
> +#undef fprintf
> +
> +#define YUV_FORMATS                                         \
> +    PIX_FMT_YUV444P,  PIX_FMT_YUV422P,  PIX_FMT_YUV420P,    \
> +    PIX_FMT_YUV411P,  PIX_FMT_YUV410P,  PIX_FMT_YUV440P,    \
> +    PIX_FMT_YUVA420P,                                       \
> +    PIX_FMT_YUVJ444P, PIX_FMT_YUVJ422P, PIX_FMT_YUVJ420P,   \
> +    PIX_FMT_YUVJ440P
> + 
> +#define RGB_FORMATS                             \
> +    PIX_FMT_ARGB,         PIX_FMT_RGBA,         \
> +    PIX_FMT_ABGR,         PIX_FMT_BGRA,         \
> +    PIX_FMT_RGB24,        PIX_FMT_BGR24
> +
> +static enum PixelFormat yuv_pix_fmts[] = { YUV_FORMATS, PIX_FMT_NONE };
> +static enum PixelFormat rgb_pix_fmts[] = { RGB_FORMATS, PIX_FMT_NONE };
> +static enum PixelFormat all_pix_fmts[] = { RGB_FORMATS, YUV_FORMATS, PIX_FMT_NONE };
> +
> +typedef struct {
> +    AVFilterBufferRef *picref;
> +    double mse, min_mse, max_mse;
> +    int nb_frames;
> +    FILE *vstats_file;
> +    uint16_t *line1, *line2;
> +    int max[4], average_max;
> +    int is_yuv, is_rgb;
> +    int rgba_map[4];
> +} PSNRContext;
> +

> +#define R 0
> +#define G 1
> +#define B 2
> +#define A 3

These can be moved near the place where they are actually used.

> +
> +static int pix_fmt_is_in(enum PixelFormat pix_fmt, enum PixelFormat *pix_fmts)
> +{
> +    enum PixelFormat *p;
> +    for (p = pix_fmts; *p != PIX_FMT_NONE; p++) {
> +        if (pix_fmt == *p)
> +            return 1;
> +    }
> +    return 0;
> +}
> +
> +static inline int pow2(int base)
> +{
> +    return base*base;
> +}
> +
> +static inline double get_psnr(double mse, int nb_frames, int max)
> +{
> +    return 10.0*log((pow2(max))/(mse/nb_frames))/log(10.0);
> +}
> +
> +static inline
> +void compute_images_mse(const uint8_t *ref_data[4],
> +                        const uint8_t *data[4], const int linesizes[4],
> +                        int w, int h, const AVPixFmtDescriptor *desc,
> +                        double mse[4], uint16_t *line1, uint16_t *line2)
> +{
> +    int i, c, j = w;
> +
> +    memset(mse, 0, sizeof(*mse)*4);
> +
> +    for (c = 0; c < desc->nb_components; c++) {
> +        int w1 = c == 1 || c == 2 ? w>>desc->log2_chroma_w : w;
> +        int h1 = c == 1 || c == 2 ? h>>desc->log2_chroma_h : h;
> +
> +        for (i = 0; i < h1; i++) {
> +            av_read_image_line(line1,
> +                               ref_data,
> +                               linesizes,
> +                               desc,
> +                               0, i, c, w1, 0);
> +            av_read_image_line(line2,
> +                               data,
> +                               linesizes,
> +                               desc,
> +                               0, i, c, w1, 0);
> +            for (j = 0; j < w1; j++)
> +                mse[c] += pow2(line1[j] - line2[j]);
> +        }
> +        mse[c] /= w1*h1;
> +    }
> +}
> +
> +
> +static av_cold int init(AVFilterContext *ctx, const char *args, void *opaque)
> +{
> +    PSNRContext *psnr = ctx->priv;
> +
> +    psnr->mse = psnr->nb_frames = 0;
> +    psnr->min_mse = psnr->max_mse = -1.0;
> +    psnr->picref = NULL;
> +    psnr->line1 = psnr->line2 = NULL;
> +
> +    if (args != NULL && strlen(args) > 0) {
> +        psnr->vstats_file = fopen(args, "w");
> +        if (!psnr->vstats_file) {
> +            av_log(ctx, AV_LOG_ERROR,
> +                   "Could not open stats file %s: %s\n", args, strerror(errno));
> +            return AVERROR(EINVAL);
> +        }
> +    }
> +
> +    psnr->is_yuv = psnr->is_rgb = 0;
> +
> +    return 0;
> +}
> +
> +static av_cold void uninit(AVFilterContext *ctx)
> +{
> +    PSNRContext *psnr = ctx->priv;
> +
> +    av_log(ctx, AV_LOG_INFO, "PSNR average:%0.2fdB min:%0.2fdB max:%0.2fdB\n",
> +           get_psnr(psnr->mse, psnr->nb_frames, psnr->average_max),
> +           get_psnr(psnr->max_mse, 1, psnr->average_max),
> +           get_psnr(psnr->min_mse, 1, psnr->average_max));
> +
> +    if (psnr->picref) {
> +        avfilter_unref_buffer(psnr->picref);
> +        psnr->picref = NULL;
> +    }
> +
> +    av_freep(&psnr->line1);
> +    av_freep(&psnr->line2);
> +
> +    if (psnr->vstats_file)
> +        fclose(psnr->vstats_file);
> +}
> +
> +static int config_input_ref(AVFilterLink *inlink)
> +{
> +    AVFilterContext *ctx  = inlink->dst;
> +    PSNRContext *psnr = ctx->priv;
> +
> +    if (ctx->inputs[0]->w != ctx->inputs[1]->w ||
> +        ctx->inputs[0]->h != ctx->inputs[1]->h) {
> +        av_log(ctx, AV_LOG_ERROR,
> +               "Width and/or height of input videos are different, could not calculate PSNR\n");
> +        return AVERROR(EINVAL);
> +    }
> +    if (ctx->inputs[0]->format != ctx->inputs[1]->format) {
> +        av_log(ctx, AV_LOG_ERROR,
> +               "Input filters have different pixel formats, could not calculate PSNR\n");
> +        return AVERROR(EINVAL);
> +    }
> +
> +    if (!(psnr->line1 = av_malloc(sizeof(*psnr->line1) * inlink->w)) ||
> +        !(psnr->line2 = av_malloc(sizeof(*psnr->line2) * inlink->w)))
> +        return AVERROR(ENOMEM);
> +
> +    switch (inlink->format) {
> +    case PIX_FMT_YUV410P:
> +    case PIX_FMT_YUV411P:
> +    case PIX_FMT_YUV420P:
> +    case PIX_FMT_YUV422P:
> +    case PIX_FMT_YUV440P:
> +    case PIX_FMT_YUV444P:
> +    case PIX_FMT_YUVA420P:
> +        psnr->max[0] = 235;
> +        psnr->max[3] = 255;
> +        psnr->max[1] = psnr->max[2] = 240;
> +        break;
> +    default:
> +        psnr->max[0] = psnr->max[1] = psnr->max[2] = psnr->max[3] = 255;
> +    }
> +
> +    if      (pix_fmt_is_in(inlink->format, yuv_pix_fmts)) psnr->is_yuv = 1;
> +    else if (pix_fmt_is_in(inlink->format, rgb_pix_fmts)) psnr->is_rgb = 1;
> +
> +    if (psnr->is_rgb) {
> +            switch (inlink->format) {
> +            case PIX_FMT_ARGB:  psnr->rgba_map[A] = 0; psnr->rgba_map[R] = 1; psnr->rgba_map[G] = 2; psnr->rgba_map[B] = 3; break;
> +            case PIX_FMT_ABGR:  psnr->rgba_map[A] = 0; psnr->rgba_map[B] = 1; psnr->rgba_map[G] = 2; psnr->rgba_map[R] = 3; break;
> +            case PIX_FMT_RGBA:
> +            case PIX_FMT_RGB24: psnr->rgba_map[R] = 0; psnr->rgba_map[G] = 1; psnr->rgba_map[B] = 2; psnr->rgba_map[A] = 3; break;
> +            case PIX_FMT_BGRA:
> +            case PIX_FMT_BGR24: psnr->rgba_map[B] = 0; psnr->rgba_map[G] = 1; psnr->rgba_map[R] = 2; psnr->rgba_map[A] = 3; break;
> +            }
> +        }
> +
> +    for (int j = 0; j < av_pix_fmt_descriptors[inlink->format].nb_components; j++)
> +        psnr->average_max += psnr->max[j];
> +    psnr->average_max /= av_pix_fmt_descriptors[inlink->format].nb_components;
> +
> +    return 0;
> +}
> +
> +static int query_formats(AVFilterContext *ctx)
> +{
> +    avfilter_set_common_formats(ctx, avfilter_make_format_list(all_pix_fmts));
> +    return 0;
> +}
> +
> +static void start_frame(AVFilterLink *inlink, AVFilterBufferRef *inpicref)
> +{
> +    AVFilterBufferRef *outpicref = avfilter_ref_buffer(inpicref, ~0);
> +    AVFilterContext *ctx = inlink->dst;
> +    PSNRContext *psnr = ctx->priv;
> +
> +    inlink->dst->outputs[0]->out_buf = outpicref;
> +    outpicref->pts = av_rescale_q(inpicref->pts,
> +                                  ctx->inputs [0]->time_base,
> +                                  ctx->outputs[0]->time_base);
> +
> +    if (psnr->picref) {
> +        avfilter_unref_buffer(psnr->picref);
> +        psnr->picref = NULL;
> +    }
> +    avfilter_request_frame(ctx->inputs[1]);
> +
> +    avfilter_start_frame(inlink->dst->outputs[0], outpicref);
> +}
> +
> +static void start_frame_ref(AVFilterLink *inlink, AVFilterBufferRef *inpicref)
> +{
> +    AVFilterContext *ctx = inlink->dst;
> +    PSNRContext *psnr = ctx->priv;
> +
> +    psnr->picref = inpicref;
> +    psnr->picref->pts = av_rescale_q(inpicref->pts,
> +                                     ctx->inputs [1]->time_base,
> +                                     ctx->outputs[0]->time_base);
> +}
> +
> +static void end_frame(AVFilterLink *inlink)
> +{
> +    AVFilterContext *ctx = inlink->dst;
> +    PSNRContext *psnr = ctx->priv;
> +    AVFilterLink *outlink = ctx->outputs[0];
> +    AVFilterBufferRef *outpic = outlink->out_buf;
> +    AVFilterBufferRef *ref = psnr->picref;
> +    double mse[4];
> +    const char *comps[4];
> +    double mse_t = 0;
> +    int j,c;
> +
> +    if (psnr->picref) {
> +        compute_images_mse((const uint8_t **)outpic->data, (const uint8_t **)ref->data,
> +                           outpic->linesize, outpic->video->w, outpic->video->h,
> +                           &av_pix_fmt_descriptors[inlink->format], mse,
> +                           psnr->line1, psnr->line2);
> +
> +        for (j = 0; j < av_pix_fmt_descriptors[inlink->format].nb_components; j++)
> +            mse_t += mse[j];
> +        mse_t /= av_pix_fmt_descriptors[inlink->format].nb_components;
> +
> +        if (psnr->min_mse == -1) {
> +            psnr->min_mse = mse_t;
> +            psnr->max_mse = mse_t;
> +        }
> +        if (psnr->min_mse > mse_t)
> +            psnr->min_mse = mse_t;
> +        if (psnr->max_mse < mse_t)
> +            psnr->max_mse = mse_t;
> +
> +        psnr->mse += mse_t;
> +        psnr->nb_frames++;
> +
> +        if (psnr->vstats_file) {
> +            comps[0] = psnr->is_yuv ? "Y"  : "R" ;
> +            comps[1] = psnr->is_yuv ? "Cb" : "G" ;
> +            comps[2] = psnr->is_yuv ? "Cr" : "B" ;
> +            comps[3] = "A";
> +            fprintf(psnr->vstats_file, "Frame:%d ", psnr->nb_frames);
> +            for (j = 0; j < av_pix_fmt_descriptors[inlink->format].nb_components; j++) {

> +               c = psnr->is_rgb ? psnr->rgba_map[j] : j;
> +               fprintf(psnr->vstats_file, "%s:%0.2fdB ", comps[c], get_psnr(mse[c], 1, psnr->max[c]));

weird indent

> +            }
> +            fprintf(psnr->vstats_file, "\n");
> +        }
> +    }
> +
> +    avfilter_end_frame(outlink);

> +    avfilter_unref_buffer(inlink->cur_buf);

inlink->cur_buf = NULL => safer

(or we could change avfilter_unref_buffer to accept a ** and set the
pointer reference to NULL).

Looks fine otherwise. I'll fix the nits myself if you don't have time
or don't care; as for the syntax thing I'd like to think a bit
about it and get advice.
-- 
FFmpeg = Fantastic & Fast Mortal Peaceful Elaborated Gadget

