[FFmpeg-devel] [PATCH 1/3] Add av_build_index

Michael Chinen mchinen
Thu Aug 19 10:33:51 CEST 2010


On Tue, Aug 17, 2010 at 1:55 PM, Michael Niedermayer <michaelni at gmx.at> wrote:
> On Sun, Aug 15, 2010 at 08:43:16AM +0200, Michael Chinen wrote:
>> Hi,
>>
>> On Sun, Aug 15, 2010 at 2:32 AM, Michael Chinen <mchinen at gmail.com> wrote:
>> [...]
>> >> With these 3 plus FLAC parser patches, some formats should provide
>> >> sample-accurate seeking, right? Which are those? What would have to be
>> >> done to, say, the (ASF demuxer +) WMA decoder (or any other audio
>> >> format) to make it sample-accurate?
>> >
>> > ogg, FLAC, mp3 are the audio formats provide sample accurate seeking.
>> > asf doesn't work well yet, and I'm looking into it.
>>
>> Okay, this updated patch plus this 0004 patch for asfdec.c which
>> depends on 0001 (but is independent of 0002 and 0003) will give
>> support for asfdec.c and allow formats without parsers to work by
>> introducing a new virtual function into AVInputFormat for flushing.
>>
>> There are also some bugfixes detailed in the patch log.
>>
>> I wasn't sure how to send 0004 since it would change the email subject
>> and break the thread.
>> Please let me know if there is a more preferred way.
>>
>> Michael
>
>> ?avformat.h | ? 34 +++++
>> ?utils.c ? ?| ?407 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++---
>> ?2 files changed, 424 insertions(+), 17 deletions(-)
>> 85e730e8c6800c0bf1996310942d5b8405242666 ?0001-Add-av_build_index.patch
>> From 91192c8e856a7baa64cae45d725d3e5d8f8b02d2 Mon Sep 17 00:00:00 2001
>> From: Michael Chinen <mchinen at gmail.com>
>> Date: Fri, 13 Aug 2010 05:43:02 +0200
>> Subject: [PATCH 1/4] Add av_build_index
>> ?Also adds static av_seek_frame_table function and integration into existing seek api
>>
>> Second patch submission touchups:
>> make sure first timestamp is set
>> build index correctly for formats that don't have parsers or AVFMT_GENERIC_INDEX
>> flush and update timestamp for non-parallel building.
>> Add read_flush to AVInputFormat
>> Provides av_build_index support for AVInputFormats without parsers that hold state
>> ---
>> ?libavformat/avformat.h | ? 34 ++++
>> ?libavformat/utils.c ? ?| ?407 ++++++++++++++++++++++++++++++++++++++++++++++--
>> ?2 files changed, 424 insertions(+), 17 deletions(-)
>>
>> diff --git a/libavformat/avformat.h b/libavformat/avformat.h
>> index 452aea6..13184c9 100644
>> --- a/libavformat/avformat.h
>> +++ b/libavformat/avformat.h
>> @@ -408,6 +408,11 @@ typedef struct AVInputFormat {
>>
>> ? ? ?const AVMetadataConv *metadata_conv;
>>
>> + ? ?/**
>> + ? ? * Reset the format's state to one that does not rely on previous input.
>> + ? ? */
>> + ? ?int (*read_flush)(struct AVFormatContext *s);
>
> iam not sure about this one, can we postpone this hunk please to later

Ok, moved to 0006.2 patch

>
>
>> +
>> ? ? ?/* private fields */
>> ? ? ?struct AVInputFormat *next;
>> ?} AVInputFormat;
>> @@ -586,6 +591,19 @@ typedef struct AVStream {
>> ? ? ? * Number of frames that have been demuxed during av_find_stream_info()
>> ? ? ? */
>> ? ? ?int codec_info_nb_frames;
>> +
>> + ? ?/* new av_seek_frame_table() support */
>> +#define AV_SEEKTABLE_BUILDING ? 0x0001 /**< if set the index is being built ? */
>
>> +#define AV_SEEKTABLE_CBR ? ? ? ?0x0002 /**< if set the file is constant bit
>> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?rate and the index is implicit ? ?*/
>
> the doxy is a bit terse

Improved it.

>
>
> [...]
>> @@ -1337,8 +1338,28 @@ void ff_reduce_index(AVFormatContext *s, int stream_index)
>>
>> ? ? ?if((unsigned)st->nb_index_entries >= max_entries){
>> ? ? ? ? ?int i;
>> - ? ? ? ?for(i=0; 2*i<st->nb_index_entries; i++)
>> - ? ? ? ? ? ?st->index_entries[i]= st->index_entries[2*i];
>> + ? ? ? ?if(st->seek_table_flags & AV_SEEKTABLE_BUILDING) {
>> + ? ? ? ? ? ?int j ? ? ? ? ? ?= 0;
>> + ? ? ? ? ? ?int nb_reduced ? = max_entries / 2;
>> + ? ? ? ? ? ?int64_t ts_inc ? = (st->index_entries[st->nb_index_entries - 1].timestamp
>> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?- st->index_entries[0].timestamp) / nb_reduced;
>> + ? ? ? ? ? ?int64_t start_ts = st->index_entries[0].timestamp;
>> + ? ? ? ? ? ?/* reinsert for even distribution of timestamps */
>> + ? ? ? ? ? ?for (i = 0; i < nb_reduced; i++) {
>> + ? ? ? ? ? ? ? ?if (j >= st->nb_index_entries)
>> + ? ? ? ? ? ? ? ? ? ?break;
>> + ? ? ? ? ? ? ? ?st->index_entries[i] = st->index_entries[j++];
>> + ? ? ? ? ? ? ? ?while (j < st->nb_index_entries &&
>> + ? ? ? ? ? ? ? ? ? ? ? (st->index_entries[j].timestamp <
>> + ? ? ? ? ? ? ? ? ? ? ? ?start_ts + (i + 1) * ts_inc)) {
>> + ? ? ? ? ? ? ? ? ? ?j++;
>> + ? ? ? ? ? ? ? ?}
>> + ? ? ? ? ? ?}
>> + ? ? ? ? ? ?av_log(s,AV_LOG_DEBUG,"reduced index to size %d (max = %d)\n", i, max_entries);
>> + ? ? ? ?} else {
>> + ? ? ? ? ? ?for (i = 0; 2 * i < st->nb_index_entries; i++)
>> + ? ? ? ? ? ? ? ?st->index_entries[i] = st->index_entries[2*i];
>> + ? ? ? ?}
>
> this is reimplementing existing code in a different way.
> you can use the existing code or improve the existing code and use it
> but you cant duplicate it.

I removed the original method and replaced it with mine.  I don't
think there are cases where we need an uneven index.  I tested this on
5 hour mp3s/ogg and it handles okay (seek resolution is about 0.5s)

>
>
>> ? ? ? ? ?st->nb_index_entries= i;
>> ? ? ?}
>> ?}
>> @@ -1349,6 +1370,10 @@ int av_add_index_entry(AVStream *st,
>> ? ? ?AVIndexEntry *entries, *ie;
>> ? ? ?int index;
>>
>> + ? ?/* Don't add index if we already have a complete one */
>> + ? ?if (st->seek_table_flags & AV_SEEKTABLE_FINISHED)
>> + ? ? ? ?return av_index_search_timestamp(st, timestamp, AVSEEK_FLAG_ANY);
>> +
>> ? ? ?if((unsigned)st->nb_index_entries + 1 >= UINT_MAX / sizeof(AVIndexEntry))
>> ? ? ? ? ?return -1;
>>
>
> why is this hunk needed?

I thought we were supposed to return valid index (as the function does
at end,) but I grepped and no one seems to use the return value.  I am
or'ing the check with the next if instead.

>
>
>> @@ -1361,21 +1386,31 @@ int av_add_index_entry(AVStream *st,
>>
>> ? ? ?st->index_entries= entries;
>>
>> - ? ?index= av_index_search_timestamp(st, timestamp, AVSEEK_FLAG_ANY);
>> + ? ?/* If we are building the table, the indicies added in order, so we don't
>> + ? ? ? have to do expensive searching. ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?*/
>> + ? ?if (st->seek_table_flags & AV_SEEKTABLE_BUILDING) {
>> + ? ? ? ?index = st->nb_index_entries++;
>> + ? ? ? ?ie = &st->index_entries[index];
>> + ? ?} else {
>> + ? ? ? ?index = av_index_search_timestamp(st, timestamp, AVSEEK_FLAG_ANY);
>
> improve av_index_search_timestamp() if there is a problem

 Modifying av_index_search_timestamp would require introduction of a
new flag (since at least the avi decoder uses
av_index_search_timestamp in their read_packet).  So I went with an
alternate solution does not modify av_index_search_timestamp but
eliminates code duplication.  It simply sets index to -1 if the table
is building.


>
>
>>
>> - ? ?if(index<0){
>> - ? ? ? ?index= st->nb_index_entries++;
>> - ? ? ? ?ie= &entries[index];
>> - ? ? ? ?assert(index==0 || ie[-1].timestamp < timestamp);
>> - ? ?}else{
>> - ? ? ? ?ie= &entries[index];
>> - ? ? ? ?if(ie->timestamp != timestamp){
>> - ? ? ? ? ? ?if(ie->timestamp <= timestamp)
>> - ? ? ? ? ? ? ? ?return -1;
>> - ? ? ? ? ? ?memmove(entries + index + 1, entries + index, sizeof(AVIndexEntry)*(st->nb_index_entries - index));
>> - ? ? ? ? ? ?st->nb_index_entries++;
>> - ? ? ? ?}else if(ie->pos == pos && distance < ie->min_distance) //do not reduce the distance
>> - ? ? ? ? ? ?distance= ie->min_distance;
>> + ? ? ? ?if (index < 0){
>> + ? ? ? ? ? ?index = st->nb_index_entries++;
>> + ? ? ? ? ? ?ie = &entries[index];
>> + ? ? ? ? ? ?assert(index == 0 || ie[-1].timestamp < timestamp);
>> + ? ? ? ?} else {
>> + ? ? ? ? ? ?ie= &entries[index];
>> + ? ? ? ? ? ?if (ie->timestamp != timestamp){
>> + ? ? ? ? ? ? ? ?if (ie->timestamp <= timestamp)
>
> reindention

These changes are gone due to the above hunk change.

>
>
> [...]
>
>> + ? ? ? ?av_log(s, AV_LOG_DEBUG,
>> + ? ? ? ? ? ? ? "SEEK TABLE: copying over parallel frames\n");
>> + ? ? ? ?for(i = 0; i < s->nb_streams; i++) {
>> + ? ? ? ? ? ?st = s->streams[i];
>> + ? ? ? ? ? ?av_free(st->index_entries);
>> + ? ? ? ? ? ?st->index_entries ? ? ? ? ? ? ? ?= st->parallel_index_entries;
>> + ? ? ? ? ? ?st->nb_index_entries ? ? ? ? ? ? = st->parallel_nb_index_entries;
>> + ? ? ? ? ? ?st->index_entries_allocated_size = st->parallel_index_entries_allocated_size;
>> + ? ? ? ? ? ?st->parallel_index_entries ? ? ? = NULL;
>> + ? ? ? ? ? ?st->seek_table_flags |= AV_SEEKTABLE_FINISHED;
>> + ? ? ? ?}
>> + ? ?}
>> +
>
>> + ? ?st = s->streams[stream_index];
>> + ? ?if (st->seek_table_flags & AV_SEEKTABLE_FINISHED)
>> + ? ? ? ?return av_seek_frame_table(s, stream_index, timestamp, flags);
>
> why do we need this?
> once the index is complete the existing seek functions should use the index

The table_seek also does CBR, but I moved this to av_seek_frame_cbr
and merged the leftover into av_seek_frame_generic.

>
>
>> +
>> ? ? ?/* first, we try the format specific seek */
>> ? ? ?if (s->iformat->read_seek)
>> ? ? ? ? ?ret = s->iformat->read_seek(s, stream_index, timestamp, flags);
>> @@ -3686,3 +3813,249 @@ int ff_write_chained(AVFormatContext *dst, int dst_stream, AVPacket *pkt,
>> ? ? ?return av_write_frame(dst, &local_pkt);
>> ?}
>>
>> +#define FF_DETECT_CBR_FPS_GUESS 10 /**< fps to assume if unknown */
>
> this doesnt look very nice

Expanded the comment; maybe this helps.

>
>
>> +static int av_detect_cbr(AVFormatContext *s, AVStream *st) {
>> + ? ?//check the stream's codec to see if a number of frames are listed,
>> + ? ?//but we can't assume that it is available.
>> + ? ?if (s->data_offset > 0 && s->file_size > 0 &&
>> + ? ? ? ?s->data_offset != s->file_size && st->codec && st->duration > 0) {
>> + ? ? ? ?double sec_duration = st->duration * av_q2d(st->time_base);
>
> floats should be avoided to have exact and reproduceable across platform
> behavior.

done.

>
>
> [...]
>
>> + ? ? ? ?/* Delete old index entries to get a clean table. */
>> + ? ? ? ?for(i = 0; i < build_ic->nb_streams; i++) {
>> + ? ? ? ? ? ?int64_t start_ts;
>> + ? ? ? ? ? ?build_ic->streams[i]->seek_table_flags |= AV_SEEKTABLE_BUILDING;
>> + ? ? ? ? ? ?build_ic->streams[i]->nb_index_entries = 0;
>> + ? ? ? ? ? ?AVStream* seek_st;
>
> mixes declarations and statements
> also please try tools/patcheck

Fixed.

>
>
>> + ? ? ? ? ? ?seek_st = build_ic->streams[i];
>> + ? ? ? ? ? ?start_ts = seek_st->start_time != AV_NOPTS_VALUE ?
>> + ? ? ? ? ? ? ? ? ? ? ? seek_st->start_time :
>> + ? ? ? ? ? ? ? ? ? ? ?(seek_st->first_dts ?!= AV_NOPTS_VALUE ?
>> + ? ? ? ? ? ? ? ? ? ? ? seek_st->first_dts ?: 0);
>> +
>> + ? ? ? ? ? ?av_update_cur_dts(build_ic, build_ic->streams[i],
>> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?start_ts);
>> + ? ? ? ?}
>> +
>> + ? ? ? ?orig_flags = build_ic->flags;
>> + ? ? ? ?build_ic->flags |= AVFMT_FLAG_GENPTS;
>> + ? ? ? ?for (;;) {
>> + ? ? ? ? ? ?do {
>> + ? ? ? ? ? ? ? ?ret = av_read_frame(build_ic, &pkt);
>> + ? ? ? ? ? ?} while (ret == AVERROR(EAGAIN));
>> + ? ? ? ? ? ?if (ret < 0)
>> + ? ? ? ? ? ? ? ?break;
>> + ? ? ? ? ? ?av_free_packet(&pkt);
>> + ? ? ? ?}
>
> duplicates code from av_seek_frame_generic()

The contents of the for loop are different and it involves a break, so
I don't see how to refactor this besides moving the three lines of
code that make the do-while loop up.  Is this what you mean?

>
>
>> + ? ? ? ?build_ic->flags = orig_flags;
>> + ? ? ? ?for(i = 0; i < build_ic->nb_streams; i++)
>> + ? ? ? ? ? ?build_ic->streams[i]->seek_table_flags ^= AV_SEEKTABLE_BUILDING;
>
> i think |= and &=~ are more clear in terms of readability

Done.

>
>
>> +
>> + ? ? ? ?ret = 0;
>> + ? ?cleanup:
>> + ? ? ? ?/* If built on a parallel context, mark the original context. */
>> + ? ? ? ?if (flags & AV_BUILD_INDEX_PARALLEL) {
>> + ? ? ? ? ? ?if (build_ic) {
>> + ? ? ? ? ? ? ? ?for (i = 0; i < build_ic->nb_streams; i++) {
>> + ? ? ? ? ? ? ? ? ? ?if (ret >= 0) {
>> + ? ? ? ? ? ? ? ? ? ? ? ?av_log(s, AV_LOG_DEBUG,
>> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? "SEEK TABLE: marking %i frames from"
>> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? " parallel stream ready for copy\n",
>> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? build_ic->streams[i]->nb_index_entries);
>> + ? ? ? ? ? ? ? ? ? ? ? ?s->streams[i]->parallel_index_entries =
>> + ? ? ? ? ? ? ? ? ? ? ? ? ? ?build_ic->streams[i]->index_entries;
>> + ? ? ? ? ? ? ? ? ? ? ? ?s->streams[i]->parallel_nb_index_entries =
>> + ? ? ? ? ? ? ? ? ? ? ? ? ? ?build_ic->streams[i]->nb_index_entries;
>> + ? ? ? ? ? ? ? ? ? ? ? ?s->streams[i]->parallel_index_entries_allocated_size =
>> + ? ? ? ? ? ? ? ? ? ? ? ? ? ?build_ic->streams[i]->index_entries_allocated_size;
>
> i think this should ignore the 80 char limit as it would be more readable then

Done.

>
>
>> + ? ? ? ? ? ? ? ? ? ? ? ?build_ic->streams[i]->index_entries = NULL;
>> + ? ? ? ? ? ? ? ? ? ?}
>> + ? ? ? ? ? ? ? ? ? ?avcodec_close(build_ic->streams[i]->codec);
>> + ? ? ? ? ? ? ? ?}
>
>> + ? ? ? ? ? ? ? ?s->flags |= AVFMT_FLAG_TABLE_READY;
>
> this is possibly not thread safe or at least asking for future bugs
> a variable should not be written to by more than 1 thread without
> locking. |= is not guranteed to be an atomic operation,
> bits are not independent variables

Yes, sorry about this.  Now using independent variable (added to
AVFormatContext) set from building thread and do not unset it in main
thread.  Main thread now sets AVFormatContext flag
AVFMT_FLAG_INDEX_BUILT instead of unsetting.

>
>
>> + ? ? ? ? ? ? ? ?av_close_input_file(build_ic);
>> + ? ? ? ? ? ?}
>> + ? ? ? ?}
>> + ? ? ? ?if (ret < 0)
>> + ? ? ? ? ? ?return ret;
>> + ? ?}
>> +
>
>> + ? ?/* Since we probably have moved the read cursor to the end,
>> + ? ? ? return seek to start of stream for non-parallel clients.
>> + ? ? ? Not sure if this the desired behavior, but it seems convinient. */
>> + ? ?if (!(flags & AV_BUILD_INDEX_PARALLEL) &&
>> + ? ? ? ?!(st->seek_table_flags & AV_SEEKTABLE_COPIED)) {
>> + ? ? ? ?int64_t start_ts;
>> +
>> + ? ? ? ?ff_read_frame_flush(s);
>> + ? ? ? ?for(i = 0; i < s->nb_streams; i++)
>> + ? ? ? ? ? ?if (s->streams[i]->nb_index_entries)
>> + ? ? ? ? ? ? ? ?s->streams[i]->seek_table_flags |= AV_SEEKTABLE_FINISHED;
>> +
>> + ? ? ? ?start_ts = st->start_time != AV_NOPTS_VALUE ? st->start_time :
>> + ? ? ? ? ? ? ? ? ?(st->first_dts ?!= AV_NOPTS_VALUE ? st->first_dts ?: 0);
>> + ? ? ? ?if ((ret = av_seek_frame(s, stream_index,
>> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? start_ts, AVSEEK_FLAG_ANY)) < 0) {
>> + ? ? ? ? ? ?int data_offset = (s->data_offset < 0) ||
>> + ? ? ? ? ? ? ? ?(s->data_offset >= s->file_size) ? 0 : s->data_offset;
>> +
>> + ? ? ? ? ? ?av_log(s, AV_LOG_DEBUG,
>> + ? ? ? ? ? ? ? ? ? "SEEK TABLE: error seeking using new index: %i,"
>> + ? ? ? ? ? ? ? ? ? " trying url_fseek\n",
>> + ? ? ? ? ? ? ? ? ? ret);
>> +
>> + ? ? ? ? ? ?/* last ditch effort to seek using the file pointer. */
>> + ? ? ? ? ? ?if ((ret = url_fseek(s->pb, data_offset, SEEK_SET)) < 0) {
>> + ? ? ? ? ? ? ? ?av_log(s,AV_LOG_DEBUG,
>> + ? ? ? ? ? ? ? ? ? ? ? "SEEK TABLE: error seeking with url_fseek: %i\n",
>> + ? ? ? ? ? ? ? ? ? ? ? ret);
>> + ? ? ? ? ? ? ? ?return ret;
>> + ? ? ? ? ? ?}
>> + ? ? ? ?}
>> + ? ?}
>
> is all this needed?
> or does a subset of this work as well?

This was debug junk from when it didn't work as well.  Reduced to a subset.

Thanks for the review.

Michael

[...]
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-Add-av_build_index.patch
Type: application/octet-stream
Size: 21067 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20100819/bc679f7f/attachment.obj>



More information about the ffmpeg-devel mailing list