[FFmpeg-devel] [PATCH] First shot at AVCHD seeking via new seeking API

Ivan Schreter schreter
Sun Aug 30 10:39:41 CEST 2009


Michael Niedermayer wrote:
> On Sun, Aug 09, 2009 at 04:50:33PM +0200, Ivan Schreter wrote:
>   
>> Hi Baptiste,
>>
>> Ivan Schreter wrote:
>>     
>>> [...]
>>> Ok, so I'll implement your proposal regarding active flag and the 
>>> optimization for ts == max_ts. Per-stream term position is in my opinion 
>>> still necessary. Will it be OK to commit then?
>>>       
>> Attached is the newest version of the patch.
>>
>> I've addressed these points:
>>
>>    * Removed "active" flag and replaced it with st->discard checking.
>>    * Also removed "found_hi" and "found_lo" flags, as the position can
>>      be used for that.
>>    * Added optimization for ts < ts_max by initializing terminating
>>      condition to ts_max for first iteration. Also, instead of
>>      positions, timestamps are used per-stream as terminating condition
>>      (makes the code for TS and byte seeking exactly the same).
>>    * Removed "fpos" approximation of frame position and instead keeping
>>      last-known position for multi-frame packets.
>>    * Per your request, I added saving/restoring of parser state to
>>      seek.c and calls to it in mpegts.c, so seeking to invalid position
>>      won't invalidate current stream state (or in other words, it will
>>      restore the state just before seeking, so next frame will be read
>>      as if no seek was attempted). This can be used generically, so I
>>      exposed it in seek.h.
>>
>> I hope it's now OK to commit.
>>
>> I also have code handling MPEG-PS ready (using the same method and same 
>> routines in seek.c and basically the same implementation of read_seek2 as 
>> in mpegts.c), but there is the issue of partial frame read, which needs to 
>> be addressed first (another thread on ML).
>>
>> Regards,
>>
>> Ivan
>>
>>     
>
>   
>>  Makefile |    2 
>>  seek.c   |  500 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>  seek.h   |   97 ++++++++++++
>>  utils.c  |    2 
>>  4 files changed, 599 insertions(+), 2 deletions(-)
>> 306fe878e49fc2dab68d43d62a7f105c87359bcd  seek_cooked_lavf.patch
>>     
>
>   
>> Index: libavformat/utils.c
>> ===================================================================
>> --- libavformat/utils.c	(revision 19507)
>> +++ libavformat/utils.c	(working copy)
>> @@ -1140,7 +1140,7 @@
>>  /**
>>   * Flush the frame reader.
>>   */
>> -static void av_read_frame_flush(AVFormatContext *s)
>> +void av_read_frame_flush(AVFormatContext *s)
>>  {
>>      AVStream *st;
>>      int i;
>>     
>
> seperate patch
>   
I intended to move the function completely to seek.c in a separate patch.

>
>   
>> Index: libavformat/Makefile
>> ===================================================================
>> --- libavformat/Makefile	(revision 19507)
>> +++ libavformat/Makefile	(working copy)
>> @@ -5,7 +5,7 @@
>>  
>>  HEADERS = avformat.h avio.h
>>  
>> -OBJS = allformats.o cutils.o metadata.o metadata_compat.o options.o os_support.o sdp.o utils.o
>> +OBJS = allformats.o cutils.o metadata.o metadata_compat.o options.o os_support.o sdp.o utils.o seek.o
>>  
>>  # muxers/demuxers
>>  OBJS-$(CONFIG_AAC_DEMUXER)               += raw.o id3v1.o id3v2.o
>> Index: libavformat/seek.c
>> ===================================================================
>> --- libavformat/seek.c	(revision 0)
>> +++ libavformat/seek.c	(revision 0)
>> @@ -0,0 +1,500 @@
>>     
>
> seek.c should be created by forking utils.c via svn cp and later new code
> should be added (svn needs that to keep track of line history)
>   
Mhm... Good idea. But then, the patch will be huge and contain 
effectively several unrelated changes. So I don't know...

>
>   
>> +/*
>> + * Utility functions for seeking for use within FFmpeg format handlers.
>> + *
>> + * Copyright (c) 2009 Ivan Schreter
>> + *
>> + * This file is part of FFmpeg.
>> + *
>> + * FFmpeg is free software; you can redistribute it and/or
>> + * modify it under the terms of the GNU Lesser General Public
>> + * License as published by the Free Software Foundation; either
>> + * version 2.1 of the License, or (at your option) any later version.
>> + *
>> + * FFmpeg is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>> + * Lesser General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU Lesser General Public
>> + * License along with FFmpeg; if not, write to the Free Software
>> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
>> + */
>> +
>> +#include "seek.h"
>> +#include "libavutil/mem.h"
>> +
>> +// NOTE: implementation should be moved here in another patch, to keep patches
>> +// separated.
>> +extern void av_read_frame_flush(AVFormatContext *s);
>> +
>> +/**
>> + * Helper structure to store parser state of AVStream.
>> + */
>> +typedef struct AVStreamState {
>> +    // Saved members of AVStream
>> +    AVCodecParserContext   *parser;
>> +    AVPacket                cur_pkt;
>> +    int64_t                 last_IP_pts;
>> +    int64_t                 cur_dts;
>> +    int64_t                 reference_dts;
>> +    const uint8_t          *cur_ptr;
>> +    int                     cur_len;
>> +    int                     probe_packets;
>> +} AVStreamState;
>>     
>
> storing and restoring parser state also belongs in a seperate patch and
> it requires discussion in a seperate thread and the code belongs, well
> into some parser related file more than a seek related one.
>   
That's true, it is parser-related. I've only added storing/restoring 
state per request from Baptiste.

However, I'm not sure whether we need it at all. When seeking routine 
returns -1, there is no definition in which state the stream is (i.e., 
it's undefined). In the meantime, I suppose, documenting that seeking 
error brings stream into an undefined state would be better than hacks 
with saving/restoring state. We can add saving/restoring state later, 
should someone request it.

What do you think?

> but above all, your code really is ugly, and would add a quite
> substantial burden in terms of maintaince. More precissely every
> field added would have to be added to these structs and copy and
> restore routines would have to be updated.
> This design IS poor. And i doubt it works at all
>   
Yes and no. You are right, currently it _is_ ugly and hard to maintain.

But I was thinking about moving av_read_frame_flush() to live alongside 
with state saving/restoring code in one file. Then, you just need to add 
fields to save/restore if this flush routine changes. Other fields are 
obviously irrelevant. So in that case, it would be relatively easy to 
maintain.

> also, the first obvious way to restore state is a simple seek back
> and demux + parse until the last packet while discardng packets
>   
Possible.

But how much to seek back? Probably again something with backoff would 
be necessary.

Other problem that I see, we actually have to re-locate to the proper 
position in two pass mode:

    * First read what packet follows. Note this down.
    * Then, re-position the file back, read until we find the same
      packet, keeping previous packet information somewhere.
    * Then, re-position again and read until we find the previous packet.

IF we want to implement state saving/restoring, would this be in your 
opinion a viable alternative? I take yes, since it's basically what you 
wrote above, except the 2-pass thing.

>
> [...]
>   
>> +/**
>> + * Partial search for keyframes in multiple streams.
>> + *
>> + * This routine searches for the next lower and next higher timestamp to
>> + * given target timestamp in each stream, starting at current file position
>> + * and ending at position, where all streams have already been examined
>> + * (or when all higher key frames found in first iteration).
>> + *
>> + * This routine is called iteratively with exponential backoff to find lower
>> + * timestamp.
>> + *
>> + * @param s                 format context.
>> + * @param timestamp         target timestamp (or position, if AVSEEK_FLAG_BYTE).
>> + * @param flags             seeking flags.
>> + * @param sync              array with information per stream.
>> + * @param keyframes_to_find count of keyframes to find in total.
>> + * @param found_lo          pointer to count of already found low timestamp keyframes.
>> + * @param found_hi          pointer to count of already found high timestamp keyframes.
>> + * @param first_iter        flag for first iteration.
>> + */
>> +static void search_hi_lo_keyframes(AVFormatContext *s,
>> +                                   int64_t timestamp,
>> +                                   int flags,
>> +                                   AVSyncPoint *sync,
>> +                                   int keyframes_to_find,
>> +                                   int *found_lo,
>> +                                   int *found_hi,
>> +                                   int first_iter)
>> +{
>> +    AVPacket pkt;
>> +    AVSyncPoint *sp;
>> +    AVStream *st;
>> +    int idx;
>> +    int flg;
>> +    int terminated_count = 0;
>> +    int64_t pos;
>> +    int64_t pts, dts;   // PTS/DTS from stream
>> +    int64_t ts;         // PTS in common AV_TIME_BASE or position for byte seeking
>> +
>> +    for (;;) {
>> +        if (av_read_frame(s, &pkt) < 0) {
>> +            // EOF or error, make sure high flags are set
>> +            for (idx = 0; idx < s->nb_streams; ++idx) {
>> +                if (s->streams[idx]->discard < AVDISCARD_ALL) {
>> +                    sp = &sync[idx];
>> +                    if (sp->pos_hi == INT64_MAX) {
>> +                        // No high frame exists for this stream
>> +                        (*found_hi)++;
>> +                        sp->pts_hi = INT64_MAX;
>> +                        sp->pos_hi = INT64_MAX - 1;
>> +                    }
>> +                }
>> +            }
>> +            break;
>> +        }
>> +
>> +        idx = pkt.stream_index;
>> +        st = s->streams[idx];
>> +        if (st->discard >= AVDISCARD_ALL) {
>> +            // This stream is not active, skip packet.
>> +            continue;
>> +        }
>> +        sp = &sync[idx];
>> +
>> +        flg = pkt.flags;
>> +        pos = pkt.pos;
>> +        pts = pkt.pts;
>> +        dts = pkt.dts;
>> +        if (pts == AV_NOPTS_VALUE) {
>> +            // Some formats don't provide PTS, only DTS.
>> +            pts = dts;
>> +        }
>> +        av_free_packet(&pkt);
>> +
>>     
>
>   
>> +        // Multi-frame packets only return position for the very first frame.
>> +        // Other frames are read with position == -1. Therefore, we note down
>>     
>
> isnt that a bug?
>   
No, it's a feature, so it seems.

>
>   
>> +        // last known position of a frame and use it if a frame without
>> +        // position arrives. In this way, it's possible to seek to proper
>> +        // position. Additionally, for parsers not providing position at all,
>> +        // an approximation will be used (starting position of this iteration).
>> +        if (pos < 0) {
>> +            pos = sp->last_pos;
>> +        } else {
>> +            sp->last_pos = pos;
>> +        }
>> +
>> +        // Evaluate key frames with known TS (or any frames, if AVSEEK_FLAG_ANY set).
>> +        if (pts != AV_NOPTS_VALUE && ((flg & PKT_FLAG_KEY) || (flags & AVSEEK_FLAG_ANY))) {
>> +            if (flags & AVSEEK_FLAG_BYTE) {
>> +                // For byte seeking, use position as timestamp.
>> +                ts = pos;
>> +            } else {
>>     
>
>   
>> +                // Rescale stream timestamp to AV_TIME_BASE.
>> +                ts = av_rescale(pts, AV_TIME_BASE * (int64_t)st->time_base.num, st->time_base.den);
>>     
>
> why? all comparissions are inaccurate after this, didnt you say you
> fixed that?
>   
Um... I did. Seems like you looked in an older version? In committed 
code, those rescales are gone.

The only rescaling is done in mpegts.c in order to compute adjusted 
timestamp for av_gen_search. But this is not used in actual timestamp 
comparison code.

>
>   
>> +            }
>> +
>> +            if (sp->first_ts == AV_NOPTS_VALUE) {
>> +                // Note down termination timestamp for the next iteration - when
>> +                // we encounter a packet with the same timestamp, we will ignore
>> +                // any further packets for this stream in next iteration (as they
>> +                // are already evaluated).
>> +                sp->first_ts = ts;
>> +            }
>> +
>> +            if (sp->term_ts != AV_NOPTS_VALUE && ts > sp->term_ts) {
>> +                // We are past the end position from last iteration, ignore packet.
>> +                if (!sp->terminated) {
>> +                    sp->terminated = 1;
>> +                    ++terminated_count;
>> +                    if (sp->pos_hi == INT64_MAX) {
>> +                        // No high frame exists for this stream
>> +                        (*found_hi)++;
>> +                        sp->pts_hi = INT64_MAX;
>> +                        sp->pos_hi = INT64_MAX - 1;
>> +                    }
>> +                    if (terminated_count == keyframes_to_find)
>> +                        break;  // all terminated, iteration done
>> +                }
>> +                continue;
>> +            }
>> +
>> +            if (ts <= timestamp) {
>> +                // Keyframe found before target timestamp.
>>     
>
>   
>> +                if (sp->pos_lo == INT64_MAX) {
>> +                    // Found first keyframe lower than target timestamp.
>> +                    (*found_lo)++;
>> +                    sp->pts_lo = ts;
>> +                    sp->pos_lo = pos;
>> +                } else if (sp->pts_lo < ts) {
>> +                    // Found a better match (closer to target timestamp).
>> +                    sp->pts_lo = ts;
>> +                    sp->pos_lo = pos;
>> +                }
>>     
>
> duplicate
>   
Not really, note (*found_lo)++. I commented this in Baptiste's mail already.

>
>   
>> +            }
>> +            if (ts >= timestamp) {
>> +                // Keyframe found after target timestamp.
>> +                if (sp->pos_hi == INT64_MAX) {
>> +                    // Found first keyframe higher than target timestamp.
>> +                    (*found_hi)++;
>> +                    sp->pts_hi = ts;
>> +                    sp->pos_hi = pos;
>> +                    if (*found_hi >= keyframes_to_find && first_iter) {
>> +                        // We found high frame for all. They may get updated
>> +                        // to TS closer to target TS in later iterations (which
>> +                        // will stop at start position of previous iteration).
>> +                        break;
>> +                    }
>> +                } else if (sp->pts_hi > ts) {
>> +                    // Found a better match (actually, shouldn't happen).
>> +                    sp->pts_hi = ts;
>> +                    sp->pos_hi = pos;
>> +                }
>>     
>
> same issue
>
>
>   
>> +            }
>> +        }
>> +    }
>> +
>> +    // Clean up the parser.
>> +    av_read_frame_flush(s);
>> +}
>> +
>> +int64_t ff_gen_syncpoint_search(AVFormatContext *s,
>> +                                int stream_index,
>> +                                int64_t pos,
>> +                                int64_t ts_min,
>> +                                int64_t ts,
>> +                                int64_t ts_max,
>> +                                int flags)
>> +{
>> +    AVSyncPoint *sync, *sp;
>> +    AVStream *st;
>> +    int i;
>> +    int keyframes_to_find = 0;
>> +    int64_t curpos;
>> +    int64_t step;
>> +    int found_lo = 0, found_hi = 0;
>> +    int64_t min_distance;
>> +    int64_t min_pos = 0;
>> +    int first_iter = 1;
>> +
>>     
>
>   
>> +    if (stream_index >= 0 && !(flags & AVSEEK_FLAG_BYTE)) {
>> +        // Rescale timestamps to AV_TIME_BASE, if timestamp of a reference stream given.
>> +        st = s->streams[stream_index];
>> +        if (ts != AV_NOPTS_VALUE)
>> +            ts = av_rescale(ts, AV_TIME_BASE * (int64_t)st->time_base.num, st->time_base.den);
>> +        if (ts_min != INT64_MIN)
>> +            ts_min = av_rescale(ts_min, AV_TIME_BASE * (int64_t)st->time_base.num, st->time_base.den);
>> +        if (ts_max != INT64_MAX)
>> +            ts_max = av_rescale(ts_max, AV_TIME_BASE * (int64_t)st->time_base.num, st->time_base.den);
>> +    }
>>     
>
> more rescaling that probably shouldnt be ...
>   
This is gone.

>
>   
>> +
>> +    // Initialize syncpoint structures for each stream.
>> +    sync = (AVSyncPoint*) av_malloc(s->nb_streams * sizeof(AVSyncPoint));
>>     
>
> I think you did not implement a syncpoint cache and you completely ignore
> the existing AVIndexes
>   
Yes and yes.

There is no need for syncpoint cache, since it simply searches for a 
frame from the position found by rough repositioning routine (in MPEG-TS 
case, it's just finding a closest MPEG-TS packet having a timestamp with 
timestamp as close as possible to the specified one by bisection in 
O(log n) time). Then, the proper frame is picked where to start decoding 
in order to synchronize at specified sync point.

There is also no need to handle AVIndex, since this is done by the rough 
repositioning routine, if needed. But this code was actually meant 
primarily for formats NOT using AVIndex. Formats having AVIndex can take 
advantage of it.

My code is basically an extension of av_gen_search() (which is actually 
also inaccurate for old seeking API, if read_timestamp doesn't find a 
key frame). For formats having AVIndex, an extension to 
av_seek_frame_binary() is needed.

Regards,

Ivan




More information about the ffmpeg-devel mailing list