[FFmpeg-devel] [PATCH] movie video source

Alexander Strange astrange
Sat Jan 29 23:25:41 CET 2011


On Jan 29, 2011, at 7:41 AM, Stefano Sabatini wrote:

> On date Friday 2011-01-28 20:49:55 +0100, Michael Niedermayer encoded:
>> On Fri, Jan 28, 2011 at 12:27:45AM +0100, Stefano Sabatini wrote:
>>> On date Sunday 2011-01-16 19:35:33 +0100, Stefano Sabatini encoded:
>>>> On date Thursday 2011-01-06 22:55:18 +0100, Michael Niedermayer encoded:
>>>>> On Thu, Jan 06, 2011 at 09:43:00PM +0100, Stefano Sabatini wrote:
>>>>>> On date Tuesday 2011-01-04 23:48:24 +0100, Stefano Sabatini encoded:
>>>>>>> On date Tuesday 2011-01-04 18:48:48 +0100, Nicolas George encoded:
>>>>>>>> On quintidi 15 nivôse, year CCXIX, Stefano Sabatini wrote:
>>>>>>>>> What do you suggest to do, to make the cmdutils.c PtsCorrectionContext
>>>>>>>>> API public and use it (av_pts_correction_init(),
>>>>>>>>> av_pts_correction_guess()) or to duplicate the code?
>>>>>>>> 
>>>>>>>> Something like that was evoked three weeks ago in the thread about the API
>>>>>>>> example for lavfi. Since then, I thought I might try to implement it.
>>>>>>>> 
>>>>>>>> My idea was to add the PtsCorrectionContext structure to the end of
>>>>>>>> AVCodecContext and add some sort of avcodec_set_stream_time_base function:
>>>>>>>> this way, avcodec_decode_video2 would be able to use the pts and dts fields
>>>>>>>> in AVPacket directly without depending on libavformat.
>>>>>>> 
>>>>>>> My idea was to implement a dumb transposition of the correct_pts API
>>>>>>> in cmdutils.c:
>>>>>>> 
>>>>>>> AVPTSCorrectionContext
>>>>>>> void av_pts_correction_init(AVPTSCorrectionContext *ctx);
>>>>>>> int64_t av_pts_correction_guess(AVPTSCorrectionContext *ctx, int64_t pts, int64_t dts);
>>>>>>> 
>>>>>>> this would be defined in libavcodec/avcodec.h (as I don't see any
>>>>>>> dependency on libavformat).
>>>>>> 
>>>>>> Michael? (I'm not going to even start coding if the design will be
>>>>>> rejected).
>>>>> 
>>>>> What is missing primarily is that pts from the demuxer must be routed through
>>>>> the decoders reordered_opaque. (that is current API)
>>>>> That is needed or things plain and simple are wrong.
>>>>> 
>>>>> after that we are in API change land and i don't think this is a prerequisite to
>>>>> this patch, just a nice to have.
>>>>> 
>>>>> The idea of putting a opaque PTSCorrectionContext in AVCodecContext and using
>>>>> the AVPacket.pts/dts for it and exporting the resulting single
>>>>> (stream) timestamp in AVFrame sounds very nice.
>>>>> Why a avcodec_set_stream_time_base() would be needed is not clear to me though
>>>>> 
>>>>> About making the current cmdutils code public, i think Nicolas' idea is much
>>>>> nicer and simpler from a user applications point of view, of course it is more
>>>>> work to implement though ..
>>>> 
>>>> Updated work in progress, I managed the PTS correction stuff the ugly
>>>> way, simply copying all the logic from ffplay.c/cmdutils.c.
>>>> 
>>>> Also I'm not sure that the way movie->picref is created with
>>>> avfilter_get_video_buffer_ref_from_arrays() is safe.
>>> 
>>> Updated with some fixes, depends on the rawdec.c patch I posted today.
>>> 
>>> Also I don't like at all the decoder_reorder_pts option as it sounds
>>> overly complicated, and I would rather prefer an API solution (e.g. to
>>> push the PtsCorrectionContext thing down into the library).
>>> 
>>> Nicolas, Michael, everyone... suggestions?
>>> -- 
>>> FFmpeg = Fierce and Foolish Most Portable Evangelical Guru
>> 
>>> doc/filters.texi         |   54 +++++++
>>> libavfilter/Makefile     |    2 
>>> libavfilter/allfilters.c |    1 
>>> libavfilter/vsrc_movie.c |  350 +++++++++++++++++++++++++++++++++++++++++++++++
>>> 4 files changed, 407 insertions(+)
>>> 9de393d4e8b879af118d878e73860c052715f6f2  0007-Add-movie-video-source.patch
>>> From a018b4bbd785d6fa7ccc8efa9a68198e4ab20ac8 Mon Sep 17 00:00:00 2001
>>> From: Stefano Sabatini <stefano.sabatini-lala at poste.it>
>>> Date: Sun, 31 Oct 2010 17:37:10 +0100
>>> Subject: [PATCH] Add movie video source.
>>> 
>>> ---
>>> doc/filters.texi         |   54 +++++++
>>> libavfilter/Makefile     |    2 +
>>> libavfilter/allfilters.c |    1 +
>>> libavfilter/vsrc_movie.c |  350 ++++++++++++++++++++++++++++++++++++++++++++++
>>> 4 files changed, 407 insertions(+), 0 deletions(-)
>>> create mode 100644 libavfilter/vsrc_movie.c
>>> 
>>> diff --git a/doc/filters.texi b/doc/filters.texi
>>> index 3842886..c46a82b 100644
>>> --- a/doc/filters.texi
>>> +++ b/doc/filters.texi
>>> @@ -1088,6 +1088,60 @@ to the pad with identifier "in".
>>> "color=red@@0.2:qcif:10 [color]; [in][color] overlay [out]"
>>> @end example
>>> 
>>> +@section movie
>>> +
>>> +Read a video stream from a movie container.
>>> +
>>> +It accepts the syntax: @var{movie_name}[@var{options}] where
>>> +@var{movie_name} is the name of the resource to read (not necessarily a
>>> +file but also a device or a stream accessed through some protocol),
>>> +
>>> +and @var{options} is a sequence of @var{key}=@var{value} pairs,
>>> +separated by ":".
>>> +
>>> +The description of the accepted options follows.
>>> +
>>> +@table @option
>>> +
>>> +@item format_name, f
>>> +Specifies the format assumed for the movie to read, and can be either
>>> +the name of a container or an input device. If not specified the
>>> +format is guessed from @var{movie_name} or by probing.
>>> +
>>> +@item seek_point, sp
>>> +Specifies the seek point in seconds; the frames will be output
>>> +starting from this seek point. The parameter is evaluated with
>>> +@code{av_strtod}, so the numerical value may be suffixed by an SI
>>> +postfix.
>>> +
>>> +@item stream_index, si
>>> +Specifies the index of the video stream to read. If the value is -1,
>>> +the best suited video stream will be automatically selected. Default
>>> +value is "-1".
>>> +
>>> +@end table
>>> +
>>> +This filter allows overlaying a second movie on top of the main input of
>>> +a filtergraph as shown in this graph:
>>> +@example
>>> +input -----------> deltapts0 --> overlay --> output
>>> +                                    ^
>>> +                                    |
>>> +movie --> scale--> deltapts1 -------+
>>> +@end example
>>> +
>>> +Some examples follow:
>>> +@example
>>> +# skip 3.2 seconds from the start of the avi file in.avi, and overlay it
>>> +# on top of the input labelled as "in".
>>> +"movie=in.avi:seek_point=3.2, scale=180:-1, setpts=PTS-STARTPTS [movie]; [in] setpts=PTS-STARTPTS, [movie] overlay=16:16 [out]"
>>> +
>>> +# read from a video4linux2 device, and overlay it on top of the input
>>> +# labelled as "in"
>>> +"movie=/dev/video0:f=video4linux2, scale=180:-1, setpts=PTS-STARTPTS [movie]; [in] setpts=PTS-STARTPTS, [movie] overlay=16:16 [out]"
>>> +
>>> +@end example
>>> +
>>> @section nullsrc
>>> 
>>> Null video source, never return images. It is mainly useful as a
>>> diff --git a/libavfilter/Makefile b/libavfilter/Makefile
>>> index 9304c19..1ac9b0c 100644
>>> --- a/libavfilter/Makefile
>>> +++ b/libavfilter/Makefile
>>> @@ -2,6 +2,7 @@ include $(SUBDIR)../config.mak
>>> 
>>> NAME = avfilter
>>> FFLIBS = avcore avutil
>>> +FFLIBS-$(CONFIG_MOVIE_FILTER) += avformat avcodec
>>> FFLIBS-$(CONFIG_SCALE_FILTER) += swscale
>>> 
>>> HEADERS = avfilter.h avfiltergraph.h
>>> @@ -51,6 +52,7 @@ OBJS-$(CONFIG_YADIF_FILTER)                  += vf_yadif.o
>>> OBJS-$(CONFIG_BUFFER_FILTER)                 += vsrc_buffer.o
>>> OBJS-$(CONFIG_COLOR_FILTER)                  += vf_pad.o
>>> OBJS-$(CONFIG_FREI0R_SRC_FILTER)             += vf_frei0r.o
>>> +OBJS-$(CONFIG_MOVIE_FILTER)                  += vsrc_movie.o
>>> OBJS-$(CONFIG_NULLSRC_FILTER)                += vsrc_nullsrc.o
>>> 
>>> OBJS-$(CONFIG_NULLSINK_FILTER)               += vsink_nullsink.o
>>> diff --git a/libavfilter/allfilters.c b/libavfilter/allfilters.c
>>> index b3e6fc5..29a9d68 100644
>>> --- a/libavfilter/allfilters.c
>>> +++ b/libavfilter/allfilters.c
>>> @@ -72,6 +72,7 @@ void avfilter_register_all(void)
>>>     REGISTER_FILTER (BUFFER,      buffer,      vsrc);
>>>     REGISTER_FILTER (COLOR,       color,       vsrc);
>>>     REGISTER_FILTER (FREI0R,      frei0r_src,  vsrc);
>>> +    REGISTER_FILTER (MOVIE,       movie,       vsrc);
>>>     REGISTER_FILTER (NULLSRC,     nullsrc,     vsrc);
>>> 
>>>     REGISTER_FILTER (NULLSINK,    nullsink,    vsink);
>>> diff --git a/libavfilter/vsrc_movie.c b/libavfilter/vsrc_movie.c
>>> new file mode 100644
>>> index 0000000..c55e65d
>>> --- /dev/null
>>> +++ b/libavfilter/vsrc_movie.c
>>> @@ -0,0 +1,350 @@
>>> +/*
>>> + * Copyright (c) 2010 Stefano Sabatini
>>> + * Copyright (c) 2008 Victor Paesa
>>> + *
>>> + * This file is part of FFmpeg.
>>> + *
>>> + * FFmpeg is free software; you can redistribute it and/or
>>> + * modify it under the terms of the GNU Lesser General Public
>>> + * License as published by the Free Software Foundation; either
>>> + * version 2.1 of the License, or (at your option) any later version.
>>> + *
>>> + * FFmpeg is distributed in the hope that it will be useful,
>>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>>> + * Lesser General Public License for more details.
>>> + *
>>> + * You should have received a copy of the GNU Lesser General Public
>>> + * License along with FFmpeg; if not, write to the Free Software
>>> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
>>> + */
>>> +
>>> +/**
>>> + * @file
>>> + * movie video source
>>> + *
>>> + * @todo use direct rendering (no allocation of a new frame)
>>> + * @todo support more than one output stream
>>> + */
>>> +
>>> +#include <float.h>
>>> +#include "libavutil/avstring.h"
>>> +#include "libavutil/opt.h"
>>> +#include "libavcore/imgutils.h"
>>> +#include "libavformat/avformat.h"
>>> +#include "avfilter.h"
>>> +
>>> +/* #define DEBUG */
>>> +
>> 
>>> +typedef struct {
>>> +    int64_t num_faulty_pts;  /// number of incorrect PTS values so far
>>> +    int64_t num_faulty_dts;  /// number of incorrect DTS values so far
>>> +    int64_t last_pts;        /// PTS of the last frame
>>> +    int64_t last_dts;        /// DTS of the last frame
>>> +} PtsCorrectionContext;
>>> +
>>> +static void init_pts_correction(PtsCorrectionContext *ctx)
>>> +{
>>> +    ctx->num_faulty_pts = ctx->num_faulty_dts = 0;
>>> +    ctx->last_pts = ctx->last_dts = INT64_MIN;
>>> +}
>>> +
>>> +static int64_t guess_correct_pts(PtsCorrectionContext *ctx,
>>> +                                 int64_t reordered_pts, int64_t dts)
>>> +{
>>> +    int64_t pts = AV_NOPTS_VALUE;
>>> +
>>> +    if (dts != AV_NOPTS_VALUE) {
>>> +        ctx->num_faulty_dts += dts <= ctx->last_dts;
>>> +        ctx->last_dts = dts;
>>> +    }
>>> +    if (reordered_pts != AV_NOPTS_VALUE) {
>>> +        ctx->num_faulty_pts += reordered_pts <= ctx->last_pts;
>>> +        ctx->last_pts = reordered_pts;
>>> +    }
>>> +    if ((ctx->num_faulty_pts<=ctx->num_faulty_dts || dts == AV_NOPTS_VALUE)
>>> +       && reordered_pts != AV_NOPTS_VALUE)
>>> +        pts = reordered_pts;
>>> +    else
>>> +        pts = dts;
>>> +
>>> +    return pts;
>>> +}
>> 
>> can't we wait for nicolas implementing this cleanly instead of duplicating
>> it here?
>> 
>> You can just throw this out and use pts if available and if not dts
>> until the API is cleanly implemented
>> 
>> 
>> [...]
>>> +
>>> +static int movie_init(AVFilterContext *ctx)
>>> +{
>>> +    MovieContext *movie = ctx->priv;
>>> +    AVInputFormat *iformat = NULL;
>>> +    AVCodec *codec;
>>> +    int ret;
>>> +    int64_t timestamp;
>>> +
>>> +    av_register_all();
>>> +
>>> +    // Try to find the movie format (container)
>>> +    iformat = movie->format_name ? av_find_input_format(movie->format_name) : NULL;
>>> +
>>> +    movie->format_ctx = NULL;
>>> +    if ((ret = av_open_input_file(&movie->format_ctx, movie->file_name, iformat, 0, NULL)) < 0) {
>>> +        av_log(ctx, AV_LOG_ERROR,
>>> +               "Failed to av_open_input_file '%s'\n", movie->file_name);
>>> +        return ret;
>>> +    }
>> 
>>> +    if ((ret = av_find_stream_info(movie->format_ctx)) < 0) {
>>> +        av_log(ctx, AV_LOG_ERROR, "Failed to find stream info\n");
>>> +        return ret;
>>> +    }
>> 
>> this isn't fatal in rare cases
>> (for example in case of url_interrupt_cb())
>> 
>> 
>>> +
>>> +    // if seeking requested, we execute it
>>> +    if (movie->seek_point > 0) {
>>> +        timestamp = movie->seek_point;
>>> +        // add the stream start time, should it exist
>>> +        if (movie->format_ctx->start_time != AV_NOPTS_VALUE)
>>> +            timestamp += movie->format_ctx->start_time;
>>> +        if (av_seek_frame(movie->format_ctx, -1, timestamp, AVSEEK_FLAG_BACKWARD) < 0) {
>>> +            av_log(ctx, AV_LOG_ERROR, "%s: could not seek to position %"PRId64"\n",
>>> +                   movie->file_name, timestamp);
>>> +        }
>>> +    }
>>> +
>>> +    /* select the video stream */
>>> +    if ((ret = av_find_best_stream(movie->format_ctx, AVMEDIA_TYPE_VIDEO,
>>> +                                   movie->stream_index, -1, NULL, 0)) < 0) {
>>> +        av_log(ctx, AV_LOG_ERROR, "No video stream with index '%d' found\n",
>>> +               movie->stream_index);
>>> +        return ret;
>>> +    }
>>> +    movie->stream_index = ret;
>>> +    movie->codec_ctx = movie->format_ctx->streams[movie->stream_index]->codec;
>>> +
>>> +    /*
>>> +     * So now we've got a pointer to the so-called codec context for our video
>>> +     * stream, but we still have to find the actual codec and open it.
>>> +     */
>>> +    codec = avcodec_find_decoder(movie->codec_ctx->codec_id);
>>> +    if (!codec) {
>>> +        av_log(ctx, AV_LOG_ERROR, "Failed to find any codec\n");
>>> +        return AVERROR(EINVAL);
>>> +    }
>>> +
>>> +    if ((ret = avcodec_open(movie->codec_ctx, codec)) < 0) {
>>> +        av_log(ctx, AV_LOG_ERROR, "Failed to open codec\n");
>>> +        return ret;
>>> +    }
>>> +
>>> +    if (!(movie->frame = avcodec_alloc_frame()) ) {
>>> +        av_log(ctx, AV_LOG_ERROR, "Failed to alloc frame\n");
>>> +        return AVERROR(ENOMEM);
>>> +    }
>>> +
>>> +    movie->w = movie->codec_ctx->width;
>>> +    movie->h = movie->codec_ctx->height;
>>> +
>>> +    av_log(ctx, AV_LOG_INFO, "seek_point:%"PRId64" format_name:%s file_name:%s stream_index:%d\n",
>>> +           movie->seek_point, movie->format_name, movie->file_name,
>>> +           movie->stream_index);
>>> +
>>> +    return 0;
>>> +}
>>> +
>>> +static av_cold int init(AVFilterContext *ctx, const char *args, void *opaque)
>>> +{
>>> +    MovieContext *movie = ctx->priv;
>>> +    int ret;
>>> +    movie->class = &movie_class;
>>> +    av_opt_set_defaults2(movie, 0, 0);
>>> +
>>> +    if (args) {
>>> +        movie->file_name = av_get_token(&args, ":");
>>> +        if (*args++ == ':' && (ret = (av_set_options_string(movie, args, "=", ":"))) < 0) {
>>                                        ^
>> useless ()
>> 
>> 
>>> +            av_log(ctx, AV_LOG_ERROR, "Error parsing options string: '%s'\n", args);
>>> +            return ret;
>>> +        }
>>> +    }
>>> +    if (!movie->file_name) {
>>> +        av_log(ctx, AV_LOG_ERROR, "No filename provided!\n");
>>> +        return AVERROR(EINVAL);
>>> +    }
>>> +
>>> +    if (movie->seek_point_d > (INT64_MAX-0.5) / 1000000) {
>>> +        av_log(ctx, AV_LOG_ERROR, "Value for seek point is too big\n");
>>> +        return AVERROR(EINVAL);
>>> +    }
>>> +    movie->seek_point = (int64_t)(movie->seek_point_d * 1000000 + 0.5);
>>> +
>>> +    init_pts_correction(&movie->pts_correction_ctx);
>>> +
>>> +    return movie_init(ctx);
>>> +}
>>> +
>> [...]
>>> +static int movie_get_frame(AVFilterLink *outlink)
>>> +{
>>> +    MovieContext *movie = outlink->src->priv;
>>> +    AVPacket pkt;
>>> +    int ret, frame_decoded;
>>> +
>>> +    if (movie->is_done == 1)
>>> +        return 0;
>>> +
>>> +    while ((ret = av_read_frame(movie->format_ctx, &pkt)) >= 0) {
>>> +        // Is this a packet from the video stream?
>>> +        if (pkt.stream_index == movie->stream_index) {
>>> +            avcodec_decode_video2(movie->codec_ctx, movie->frame, &frame_decoded, &pkt);
>>> +
>>> +            if (frame_decoded) {
>>> +                /* FIXME: avoid the memcpy */
>>> +                movie->picref = avfilter_get_video_buffer(outlink, AV_PERM_WRITE | AV_PERM_PRESERVE |
>>> +                                                          AV_PERM_REUSE2, outlink->w, outlink->h);
>>> +                av_image_copy(movie->picref->data, movie->picref->linesize,
>>> +                              movie->frame->data,  movie->frame->linesize,
>>> +                              movie->picref->format, outlink->w, outlink->h);
>>> +
>>> +                if        (movie->decoder_reorder_pts == 1) {
>>> +                    movie->picref->pts = movie->frame->pkt_pts;
>>> +                } else if (movie->decoder_reorder_pts == 0) {
>> 
>>> +                    movie->picref->pts = pkt.dts;
>> 
>> movie->frame->pkt_dts
>> 
> 
> /* FIXME: use a PTS correction mechanism as that in 
>   ffplay.c when an API will be available */
> /* use pkt.dts if pts is not available */
> movie->picref->pts = movie->frame->pkt_pts == AV_NOPTS_VALUE ?
>    pkt.dts : movie->frame->pkt_pts;

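Like he said, it should be movie->frame->pkt_dts now, not pkt.dts. There's no difference yet, but pkt.dts won't be reliable once multithreaded decoding (mt) is in, since the decoder may then return a frame that came from an earlier packet than the one just passed in.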
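
A minimal sketch of that fallback, for illustration only. It assumes the decoded frame carries both the reordered pts and the dts of the packet it came from (as movie->frame->pkt_pts and movie->frame->pkt_dts do above). pick_frame_pts() and SKETCH_NOPTS_VALUE are hypothetical names; the latter is a local stand-in for AV_NOPTS_VALUE so the snippet builds without the FFmpeg headers:

```c
#include <stdint.h>

/* Local stand-in for FFmpeg's AV_NOPTS_VALUE (assumption: same bit pattern). */
#define SKETCH_NOPTS_VALUE INT64_MIN

/*
 * Hypothetical helper, not part of the patch: use the pts that travelled
 * through the decoder with the frame, and fall back to the dts stored in
 * the same frame when no pts is available. Both values come from the
 * frame, never from the packet that was just fed to the decoder.
 */
int64_t pick_frame_pts(int64_t frame_pkt_pts, int64_t frame_pkt_dts)
{
    return frame_pkt_pts != SKETCH_NOPTS_VALUE ? frame_pkt_pts : frame_pkt_dts;
}
```

In the filter this would read roughly as picref->pts = pick_frame_pts(frame->pkt_pts, frame->pkt_dts).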

BTW, the missing part of guess_correct_pts(), and the reason I didn't move it out of cmdutils, is this bit of ffmpeg.c:
    ist->next_pts = ist->pts = guess_correct_pts(&ist->pts_ctx, picture.reordered_opaque, ist->pts);
    if (ist->st->codec->time_base.num != 0) {
        int ticks= ist->st->parser ? ist->st->parser->repeat_pict+1 : ist->st->codec->ticks_per_frame;
        ist->next_pts += ((int64_t)AV_TIME_BASE *
                          ist->st->codec->time_base.num * ticks) /
            ist->st->codec->time_base.den;
    }

This is what handles input whose pts is all AV_NOPTS_VALUE: it advances the last returned time by one frame duration derived from the codec time base and returns that. ffplay is missing this step, so if frames get all the way through with AV_NOPTS_VALUE, it ignores the stream's frame rate and just displays them at some default duration. Note that it would probably work better with r_frame_rate than with time_base.
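
To make the ffmpeg.c excerpt above concrete, here is a rough standalone sketch of the same bookkeeping. Everything prefixed with sketch_/SKETCH_ is a hypothetical name: SKETCH_NOPTS_VALUE and SKETCH_TIME_BASE stand in for AV_NOPTS_VALUE and AV_TIME_BASE, and guessed_pts is assumed to be whatever a guess_correct_pts()-like step returned, already converted to microseconds:

```c
#include <stdint.h>

#define SKETCH_NOPTS_VALUE INT64_MIN   /* stand-in for AV_NOPTS_VALUE */
#define SKETCH_TIME_BASE   1000000     /* stand-in for AV_TIME_BASE (microseconds) */

typedef struct {
    int64_t pts;       /* time of the current frame, in microseconds */
    int64_t next_pts;  /* predicted time of the following frame */
} SketchClock;

/* Duration of one frame in microseconds; 0 if the time base is unknown. */
int64_t sketch_frame_duration(int tb_num, int tb_den, int ticks)
{
    if (tb_num == 0 || tb_den == 0)
        return 0;
    return ((int64_t)SKETCH_TIME_BASE * tb_num * ticks) / tb_den;
}

/*
 * Use the guessed pts when there is one; otherwise reuse the prediction
 * made after the previous frame. Then predict the next frame's time by
 * adding one frame duration.
 */
int64_t sketch_advance(SketchClock *c, int64_t guessed_pts,
                       int tb_num, int tb_den, int ticks)
{
    c->pts      = guessed_pts != SKETCH_NOPTS_VALUE ? guessed_pts : c->next_pts;
    c->next_pts = c->pts + sketch_frame_duration(tb_num, tb_den, ticks);
    return c->pts;
}
```

With a 1/25 time base and one tick per frame, a stream with no timestamps at all comes out at 0, 40000, 80000, ... microseconds instead of at some default duration. Swapping the time base for the inverse of r_frame_rate would only change the arguments passed in.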

I've forgotten a few details, which is maybe why I didn't finish the patch series.


