[FFmpeg-devel] [PATCH][RFC] Add example seeking_while_remuxing.c

Stefano Sabatini stefasab at gmail.com
Tue Jan 28 01:19:03 CET 2014


On date Monday 2014-01-27 04:58:00 +0200, Andrey Utkin encoded:
> I was asked to put together a demo of the ffmpeg API showing how to seek
> while serving a "client connection". It turned out not to be so easy. I
> started recalling the hell of issues I ran into while working on input
> stream fallback. Those memories resulted in large comments.
> 
> I'm still not perfectly sure about all my statements in this snippet,
> although on the whole it feels close to the truth. To make it correct and
> beneficial to all, I decided to propose it for review and inclusion in the
> official examples.
> 
> I would be glad to hear opinions about the approaches used and described
> here, and about all the statements in the comments.
> 
> In this example, I have completely filtered out all packets with dts or pts
> == AV_NOPTS_VALUE. As stated in the comments, they pass through muxing fine
> if they are at the very beginning of the stream, but they cause an error
> when they are fed to the muxer in the middle of the stream (e.g. right after
> the seek, which is exactly what happens with Matroska input). I don't know
> what else to do with them, but dropping them obviously results in data
> loss - apparently they can contain video keyframes.
> 
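
Rather than dropping such packets, one alternative (a rough, untested
sketch, not something the patch does) is to extrapolate the missing
timestamps from the previous packet of the same stream. The
guess_missing_ts() helper and the prev_dts/prev_dur arrays below are
invented for illustration and are not part of the API; also note that
copying dts into pts is only safe for codecs without frame reordering
(no B-frames):

#include <libavformat/avformat.h>

/* Hypothetical helper: instead of discarding a packet without timestamps,
 * extrapolate them from the previous packet of the same stream.
 * prev_dts[] and prev_dur[] are per-stream arrays kept by the caller,
 * initialized to AV_NOPTS_VALUE and 0 respectively. */
static void guess_missing_ts(AVPacket *pkt, int64_t *prev_dts, int64_t *prev_dur)
{
    int i = pkt->stream_index;

    if (pkt->dts == AV_NOPTS_VALUE && prev_dts[i] != AV_NOPTS_VALUE)
        pkt->dts = prev_dts[i] + (prev_dur[i] > 0 ? prev_dur[i] : 1);
    if (pkt->pts == AV_NOPTS_VALUE && pkt->dts != AV_NOPTS_VALUE)
        pkt->pts = pkt->dts; /* only valid without B-frame reordering */

    /* remember the last seen values for the next extrapolation */
    if (pkt->dts != AV_NOPTS_VALUE) {
        prev_dts[i] = pkt->dts;
        prev_dur[i] = pkt->duration;
    }
}

This does not recover the real presentation times, but it should keep
av_interleaved_write_frame() happy after the seek without throwing the
data away.
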
> I am quite worried about finding stable, general-purpose approaches to
> resolve the timestamp discontinuity caused by seeking, for all grades of
> accuracy. The current patch proposes the approach with the worst possible
> accuracy, but one that applies to the general case and requires no decoding
> or reencoding.
> 
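
(To spell out the shift arithmetic with numbers: with seekfrom=20000000
and seekto=5000000 microseconds, shift becomes 15000000, so a post-seek
input packet with dts 5.04s ends up written with dts 20.04s after the
rescale to the output stream time base, and the output timeline stays
continuous.)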

> OFFTOP: Would anybody be interested if I similarly prepared a showcase of
> accurately applying ffmpeg filters in the middle of a video stream, without
> reencoding the whole stream?

Ideally we should support auto-reconfiguration, and avoid forcing users
to re-implement it again and again at the application level.

The problem is that the current API does not seem designed to allow that
easily (if at all), since *format* configuration is supposed to be done
once during filter initialization, and can't be easily reconfigured
midstream (supporting resizing should be doable OTOH, although it
requires a major effort at the design level).
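
In the meantime, the usual application-level workaround is to tear down
and rebuild the graph whenever the input frame properties change. A rough,
untested sketch of that pattern follows; the FilterState struct and the
rebuild_graph()/filter_frame() helpers are invented for illustration and
are not part of libavfilter, and the time_base/filters_descr values are
assumed to be supplied by the caller:

#include <libavfilter/avfilter.h>
#include <libavfilter/buffersrc.h>
#include <libavfilter/buffersink.h>
#include <libavutil/frame.h>
#include <libavutil/mem.h>

/* Zero-initialize this before the first frame; the caller is assumed to
 * have called avfilter_register_all() already. */
typedef struct FilterState {
    AVFilterGraph   *graph;
    AVFilterContext *src;                    // "buffer" source
    AVFilterContext *sink;                   // "buffersink"
    int              width, height, pix_fmt; // properties the graph was built for
} FilterState;

// (Re)build the graph for the given frame properties and filter description.
static int rebuild_graph(FilterState *fs, const AVFrame *frame,
                         AVRational time_base, const char *filters_descr)
{
    char args[512];
    AVFilterInOut *outputs = avfilter_inout_alloc();
    AVFilterInOut *inputs  = avfilter_inout_alloc();
    int ret;

    avfilter_graph_free(&fs->graph);         // drop the old graph, if any
    fs->graph = avfilter_graph_alloc();
    if (!outputs || !inputs || !fs->graph) {
        ret = AVERROR(ENOMEM);
        goto end;
    }

    snprintf(args, sizeof(args),
             "video_size=%dx%d:pix_fmt=%d:time_base=%d/%d:pixel_aspect=%d/%d",
             frame->width, frame->height, frame->format,
             time_base.num, time_base.den,
             frame->sample_aspect_ratio.num,
             FFMAX(frame->sample_aspect_ratio.den, 1));

    if ((ret = avfilter_graph_create_filter(&fs->src, avfilter_get_by_name("buffer"),
                                            "in", args, NULL, fs->graph)) < 0)
        goto end;
    if ((ret = avfilter_graph_create_filter(&fs->sink, avfilter_get_by_name("buffersink"),
                                            "out", NULL, NULL, fs->graph)) < 0)
        goto end;

    outputs->name       = av_strdup("in");
    outputs->filter_ctx = fs->src;
    outputs->pad_idx    = 0;
    outputs->next       = NULL;
    inputs->name        = av_strdup("out");
    inputs->filter_ctx  = fs->sink;
    inputs->pad_idx     = 0;
    inputs->next        = NULL;

    if ((ret = avfilter_graph_parse_ptr(fs->graph, filters_descr,
                                        &inputs, &outputs, NULL)) < 0)
        goto end;
    if ((ret = avfilter_graph_config(fs->graph, NULL)) < 0)
        goto end;

    fs->width   = frame->width;
    fs->height  = frame->height;
    fs->pix_fmt = frame->format;
end:
    avfilter_inout_free(&inputs);
    avfilter_inout_free(&outputs);
    if (ret < 0)
        avfilter_graph_free(&fs->graph);     // leave a clean slate on failure
    return ret;
}

// For each decoded frame: rebuild the graph if the properties changed,
// then feed the frame to the source filter as usual.
static int filter_frame(FilterState *fs, AVFrame *frame,
                        AVRational time_base, const char *filters_descr)
{
    if (!fs->graph || frame->width  != fs->width  ||
                      frame->height != fs->height ||
                      frame->format != fs->pix_fmt) {
        int ret = rebuild_graph(fs, frame, time_base, filters_descr);
        if (ret < 0)
            return ret;
    }
    return av_buffersrc_add_frame(fs->src, frame);
}

Of course this loses the internal state of stateful filters on every
reconfiguration, which is exactly why native support in libavfilter
would be preferable.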


> 
> ---8<---
> ---
>  doc/examples/seeking_while_remuxing.c | 308 ++++++++++++++++++++++++++++++++++
>  1 file changed, 308 insertions(+)
>  create mode 100644 doc/examples/seeking_while_remuxing.c
> 
> diff --git a/doc/examples/seeking_while_remuxing.c b/doc/examples/seeking_while_remuxing.c
> new file mode 100644
> index 0000000..735cba7
> --- /dev/null
> +++ b/doc/examples/seeking_while_remuxing.c
> @@ -0,0 +1,308 @@
> +/*
> + * Copyright (c) 2014 Andrey Utkin
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a copy
> + * of this software and associated documentation files (the "Software"), to deal
> + * in the Software without restriction, including without limitation the rights
> + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
> + * copies of the Software, and to permit persons to whom the Software is
> + * furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice shall be included in
> + * all copies or substantial portions of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
> + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
> + * THE SOFTWARE.
> + */
> +
> +/**
> + * @file
> + * libavformat/libavcodec demuxing, muxing and seeking API example.
> + *
> + * Remuxes the input file to the output file up to the 'seekfrom' time
> + * position, then seeks to the 'seekto' position and continues remuxing.
> + * The seek is performed only once (it won't loop).
> + * @example doc/examples/seeking_while_remuxing.c
> + */
> +
> +#include <libavutil/timestamp.h>
> +#include <libavformat/avformat.h>
> +
> +#define YOU_WANT_NO_ERRORS_ABOUT_NON_MONOTONIC_TIMESTAMPS
> +
> +static void log_packet(const AVFormatContext *fmt_ctx, const AVPacket *pkt, const char *tag)
> +{
> +    AVRational *time_base = &fmt_ctx->streams[pkt->stream_index]->time_base;
> +
> +    fprintf(stderr, "%s: pts:%s pts_time:%s dts:%s dts_time:%s duration:%s duration_time:%s stream_index:%d\n",
> +            tag,
> +            av_ts2str(pkt->pts), av_ts2timestr(pkt->pts, time_base),
> +            av_ts2str(pkt->dts), av_ts2timestr(pkt->dts, time_base),
> +            av_ts2str(pkt->duration), av_ts2timestr(pkt->duration, time_base),
> +            pkt->stream_index);
> +}
> +
> +int main(int argc, char **argv)
> +{
> +    AVFormatContext *ifmt_ctx = NULL, *ofmt_ctx = NULL;
> +
> +    int64_t shift = 0; // Output timestamp shift caused by the seek,
> +    // in microseconds (AV_TIME_BASE_Q units, i.e. 10^-6 of a second)
> +
> +    int seek_done = 0;
> +    const char *in_filename, *out_filename, *out_format_name;
> +    int64_t seekfrom, seekto;
> +    int ret;
> +    unsigned int i;
> +
> +    if (argc != 6) {
> +        fprintf(stderr, "Usage: %s <input file> <output file> "
> +                "<output format, or empty for default> "
> +                "<seekfrom: time offset to activate seek, microseconds> "
> +                "<seekto: time offset to seek to, microseconds>\n", argv[0]);
> +        fprintf(stderr, "Remuxes input file to output file up to 'seekfrom' "
> +                "time position, then seeks to 'seekto' position and continues "
> +                "remuxing. Seek is performed only once (won't loop).\n");
> +        return 1;
> +    }
> +
> +    in_filename = argv[1];
> +    out_filename = argv[2];
> +    out_format_name = argv[3];
> +
> +    ret = sscanf(argv[4], "%"PRId64, &seekfrom);
> +    if (ret != 1) {
> +        fprintf(stderr, "Invalid seekfrom %s\n", argv[4]);
> +        return 1;
> +    }
> +
> +    ret = sscanf(argv[5], "%"PRId64, &seekto);
> +    if (ret != 1) {
> +        fprintf(stderr, "Invalid seekto %s\n", argv[5]);
> +        return 1;
> +    }
> +
> +    // Initialize libavformat
> +    av_register_all();
> +    avformat_network_init();
> +
> +    // Open the file, init the input format context, read the media container header.
> +    // Some information about the file and its elementary streams is available after this
> +    if ((ret = avformat_open_input(&ifmt_ctx, in_filename, 0, 0)) < 0) {
> +        fprintf(stderr, "Could not open input file '%s'\n", in_filename);
> +        goto end;
> +    }
> +
> +    // Read some amount of the file contents to gather information about all elementary streams.
> +    // This may not be necessary in some cases, but in the general case it is a needed step.
> +    if ((ret = avformat_find_stream_info(ifmt_ctx, 0)) < 0) {
> +        fprintf(stderr, "Failed to retrieve input stream information");
> +        goto end;
> +    }
> +
> +    // Dump input file and its elementary streams properties to stderr
> +    av_dump_format(ifmt_ctx, 0, in_filename, 0);
> +
> +    // Open the output context, with the specified media container format if given
> +    ret = avformat_alloc_output_context2(&ofmt_ctx, NULL,
> +            out_format_name[0] ? out_format_name : NULL, out_filename);
> +    if (ret < 0) {
> +        fprintf(stderr, "Failed to allocate output context for '%s'\n", out_filename);
> +        goto end;
> +    }
> +
> +    // Create in the output file the same elementary streams as in the input file
> +    for (i = 0; i < ifmt_ctx->nb_streams; i++) {
> +        AVStream *in_stream = ifmt_ctx->streams[i];
> +        AVStream *out_stream = avformat_new_stream(ofmt_ctx, in_stream->codec->codec);
> +        if (!out_stream) {
> +            fprintf(stderr, "Failed allocating elementary output stream\n");
> +            ret = AVERROR_UNKNOWN;
> +            goto end;
> +        }
> +
> +        ret = avcodec_copy_context(out_stream->codec, in_stream->codec);
> +        if (ret < 0) {
> +            fprintf(stderr, "Failed to copy elementary stream properties\n");
> +            goto end;
> +        }
> +        if (ofmt_ctx->oformat->flags & AVFMT_GLOBALHEADER)
> +            out_stream->codec->flags |= CODEC_FLAG_GLOBAL_HEADER;
> +    }
> +
> +    av_dump_format(ofmt_ctx, 0, out_filename, 1);
> +
> +    // Initialize the actual output I/O context at the protocol, device or file level
> +    ret = avio_open(&ofmt_ctx->pb, out_filename, AVIO_FLAG_WRITE);
> +    if (ret < 0) {
> +        fprintf(stderr, "Could not open output file '%s'\n", out_filename);
> +        goto end;
> +    }
> +
> +    // Last step of output initialization: the media container format "driver" is
> +    // initialized. This generally leads to writing header data to the output file.
> +    ret = avformat_write_header(ofmt_ctx, NULL);
> +    if (ret < 0) {
> +        fprintf(stderr, "Error occurred when opening output file\n");
> +        goto end;
> +    }
> +
> +    // Copy the input elementary streams to the output at the packet
> +    // (compressed frame) level. This process is known as remuxing
> +    // (remultiplexing): it consists of demultiplexing (demuxing) streams
> +    // from the input and multiplexing (muxing) them into the output.
> +    // No image/sound decoding takes place in this case.
> +    while (1) {
> +        AVPacket pkt;
> +        AVStream *in_stream, *out_stream;
> +        int64_t current_dts_mcs;
> +
> +        memset(&pkt, 0, sizeof(pkt));
> +        ret = av_read_frame(ifmt_ctx, &pkt);
> +        if (ret < 0)
> +            break;
> +
> +        log_packet(ifmt_ctx, &pkt, "in");
> +
> +        if (pkt.dts == AV_NOPTS_VALUE || pkt.pts == AV_NOPTS_VALUE) {
> +            // TODO Decode to figure out the timestamps? Anyway, decoding is
> +            // out of the scope of this example currently.
> +            //
> +            // Such packets happen to be keyframes in Matroska,
> +            // so dropping them means losing data.
> +            // When they're remuxed at the beginning of the stream, it's OK,
> +            // but av_interleaved_write_frame() raises a non-monotonicity error
> +            // when they're pushed after a seek (i.e. when correctly-timestamped
> +            // packets came before them).
> +            printf("Discarding packet without timestamps\n");
> +            av_free_packet(&pkt);
> +            continue;
> +        }
> +
> +        in_stream  = ifmt_ctx->streams[pkt.stream_index];
> +        out_stream = ofmt_ctx->streams[pkt.stream_index];
> +
> +        current_dts_mcs = av_rescale_q(pkt.dts, in_stream->time_base, AV_TIME_BASE_Q);
> +
> +        // Check if it's time to seek
> +        if (!seek_done
> +            && current_dts_mcs >= seekfrom) {
> +            av_free_packet(&pkt);
> +            printf("Seeking. Last read packet is discarded\n");
> +            ret = av_seek_frame(ifmt_ctx, -1, seekto, 0);
> +            if (ret < 0) {
> +                fprintf(stderr, "Seeking failed\n");
> +                break;
> +            }
> +            seek_done = 1;
> +            shift = seekfrom - seekto;
> +            continue;
> +        }
> +
> +#ifdef YOU_WANT_NO_ERRORS_ABOUT_NON_MONOTONIC_TIMESTAMPS
> +        if (seek_done && current_dts_mcs < seekto) {
> +            printf("Discarding packet with a timestamp lower than needed\n");
> +            av_free_packet(&pkt);
> +            continue;
> +            // Citing the official ffmpeg docs:
> +            // "Note that in most formats it is not possible to seek exactly,
> +            // so ffmpeg will seek to the closest seek point before (the given)
> +            // position."
> +            //
> +            // To seek exactly (accurately), without possibly losing keyframes
> +            // or introducing desync, and still be safe against the timestamp
> +            // monotonicity problem, you must reencode part of the video after
> +            // the seek point, so that there is a keyframe exactly where you
> +            // want playback to resume after seeking. You may also want to fill
> +            // possible time gaps with silence (for audio) or duplicated frames
> +            // (for video) to support technically poor playback clients (e.g.
> +            // the Flash plugin), which is also achievable with reencoding.
> +            // This is simpler if you are already transcoding rather than just
> +            // remuxing.
> +            //
> +            // Note: if you need to fill audio gaps (e.g. for the Flash player)
> +            // and avoid even the smallest desync, and the audio output
> +            // encoding does not allow variable frame lengths, in certain
> +            // situations you may have to keep reencoding until the end of the
> +            // stream, because the timestamp shift may not be a multiple of the
> +            // audio frame duration.
> +            //
> +            // Note 2: audio packet dts and pts do not always accurately
> +            // represent reality. Ultimately accurate accounting of audio
> +            // duration and time offset can be achieved by counting the
> +            // number of audio samples transmitted.
> +            //
> +            // The most important and practical part:
> +            //
> +            // In this example, for simplicity, we accept the possibility of
> +            // losing a keyframe (which can in some cases lead to a corrupted
> +            // image for some period after seeking). No desync is introduced,
> +            // because we shift the timestamps of all elementary streams by the
> +            // same offset, although see Note 2.
> +            //
> +            // Another, technically similar, approach is just to push packets
> +            // into the muxer carelessly after seeking (with any rough shift
> +            // calculation), ignoring the AVERROR(EINVAL) return values from it.
> +            // You'd better ignore such errors anyway, because the input stream
> +            // may already contain non-monotonic DTS; this does indeed happen
> +            // with some files. Alternatively, you may track timestamps
> +            // yourself to filter out unordered packets or maybe even reorder
> +            // them.
> +            //
> +            // The chosen approach is generally bad, because failing to
> +            // transmit a video keyframe correctly breaks playback for up to
> +            // several seconds of video. But it is simple and does not require
> +            // anything except basic remuxing.
> +        }
> +#endif
> +
> +        // We rescale the timestamps because the time units used by the input
> +        // and output file formats may differ.
> +        // E.g. for MPEG-TS the time unit is 1/90000 s, for FLV it is 1/1000 s, etc.
> +        pkt.pts = av_rescale_q(pkt.pts, in_stream->time_base, out_stream->time_base)
> +            + av_rescale_q(shift, AV_TIME_BASE_Q, out_stream->time_base);
> +        pkt.dts = av_rescale_q(pkt.dts, in_stream->time_base, out_stream->time_base)
> +            + av_rescale_q(shift, AV_TIME_BASE_Q, out_stream->time_base);
> +
> +        pkt.duration = av_rescale_q(pkt.duration, in_stream->time_base, out_stream->time_base);
> +        pkt.pos = -1;
> +        log_packet(ofmt_ctx, &pkt, "out");
> +
> +        ret = av_interleaved_write_frame(ofmt_ctx, &pkt);
> +        if (ret < 0) {
> +            if (ret == AVERROR(EINVAL)) {
> +                printf("Muxing error, presumably due to non-monotonic DTS, can be ignored\n");
> +            } else {
> +                fprintf(stderr, "Error muxing packet\n");
> +                break;
> +            }
> +        }
> +        av_free_packet(&pkt);
> +    }
> +
> +    // Deinitialize the format driver; this finalizes the output file/stream appropriately.
> +    av_write_trailer(ofmt_ctx);
> +
> +end:
> +    // Close the input format context and release the related memory
> +    avformat_close_input(&ifmt_ctx);
> +
> +    // Close output file/connection context
> +    if (ofmt_ctx)
> +        avio_close(ofmt_ctx->pb);
> +
> +    // Free the output format context
> +    avformat_free_context(ofmt_ctx);
> +
> +    // Check if we got here because of an error; if so, decode its meaning and report it
> +    if (ret < 0 && ret != AVERROR_EOF) {
> +        fprintf(stderr, "Error occurred: %s\n", av_err2str(ret));
> +        return 1;
> +    }
> +    return 0;
> +}

This seems to share a lot of code with remuxing.c. What about applying
your changes on top of that? Having more files to maintain means more
work for maintainers.
-- 
FFmpeg = Funny Fascinating Majestic Peaceless Entertaining Gnome

