[FFmpeg-devel] [PATCH] Incorrect Ogg Theora duration

Måns Rullgård mans at mansr.com
Sat Apr 26 13:52:27 CEST 2008


Henrik Gulbrandsen <henrik at gulbra.net> writes:

> On Tue, 2008-04-22 at 19:56 +0100, Måns Rullgård wrote:
>> Henrik Gulbrandsen <henrik at gulbra.net> writes:
>> 
>> >> > On Sun, 2008-04-20 at 13:28 +0100, Måns Rullgård wrote:
>> >> Henrik Gulbrandsen <henrik at gulbra.net> writes:
>> >> 
>> >> > On Fri, 2008-04-18 at 00:45 -0700, Baptiste Coudurier wrote: 
>> >> >> I think the "start" value is incorrect. According to specs, ogg
>> >> >> granule stores the pts plus the duration of the ogg packet, so last
>> >> >> packet granule is the stream duration.
>> >> >
>> >> > As mentioned previously, Ogg Theora bitstreams prior to version 3.2.1
>> >> > don't follow this rule. Since libtheora uses 3.2.0 for versions before
>> >> > 1.0beta1, I think there is a good case for being backwards compatible.
>> >> 
>> >> How many such files exist in the wild?  Given that I've never seen a
>> >> wild Theora file, other than Xiph promotional stuff, I reckon there
>> >> can't be that many.
>> >
>> > Well, more to the point: My test system happens to have libtheora alpha7
>> 
>> That's a very bad argument.
>
> It's a statement of the facts. What is there to argue about? Surely,
> you're not suggesting that the bug should be left unfixed just because
> it's minor and only occurs under special circumstances?
>
> [...]
>
>> >> I still don't understand.  Why do you seek back in the file after
>> >> reading the headers?  It doesn't make any sense to me.
>> >
>> > Have a look at ogg_get_headers()! We're looping over an unknown number
>> > of header packets for each stream and must hand them over to the codec
>> > to figure out if they are actually headers or not. This means that we
>> > are not only reading the headers, but also the first data packet, which
>> > is currently discarded. I just want it back in the stream for later reading.
>> 
>> Read the code again.  The packet stays in the buffer and is returned
>> by the next call to ogg_packet().
>
> I must suffer from brain atrophy! Thanks! In that case, the core issue
> is that the 5814 check-in didn't take this into account. It assumes that
> the next page read will belong to the first data packet. Or at least, I
> think that's the assumption. It looks like whoever wrote this was happy
> with a quick approximation anyway. I was hoping for a minimal update to
> the code, but... I hope the attached version is a reasonable approach.
>
> /Henrik
>
>
> Index: libavformat/oggparsetheora.c
> ===================================================================
> --- libavformat/oggparsetheora.c	(revision 12931)
> +++ libavformat/oggparsetheora.c	(working copy)
> @@ -29,6 +29,7 @@
>  #include "oggdec.h"
>  
>  typedef struct theora_params {
> +    unsigned version;
>      int gpshift;
>      int gpmask;
>  } theora_params_t;
> @@ -95,6 +96,7 @@ theora_header (AVFormatContext * s, int 
>          if (version >= 0x304000)
>              skip_bits(&gb, 2);
>  
> +        thp->version = version;
>          thp->gpshift = get_bits(&gb, 5);
>          thp->gpmask = (1 << thp->gpshift) - 1;
>  
> @@ -124,6 +126,11 @@ theora_gptopts(AVFormatContext *ctx, int
>      uint64_t iframe = gp >> thp->gpshift;
>      uint64_t pframe = gp & thp->gpmask;
>  
> +    // We want the PTS of the next display interval
> +    // rather than the beginning of this frame...
> +    if (thp->version < 0x030201)
> +        iframe++;
> +
>      if(!pframe)
>          os->pflags |= PKT_FLAG_KEY;
>  

Mostly OK.
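
For the archives, here is roughly what the mapping looks like with
that change applied (an illustrative sketch, not the code from
oggparsetheora.c; the function name and arguments are made up for
this mail):

  #include <stdint.h>

  /* Sketch only: split a Theora granulepos into its keyframe and
   * interframe counts and return the absolute frame index. */
  static uint64_t theora_gp_to_frame(uint64_t gp, int gpshift,
                                     unsigned version)
  {
      uint64_t iframe = gp >> gpshift;                       /* frames up to the last keyframe */
      uint64_t pframe = gp & (((uint64_t)1 << gpshift) - 1); /* frames since that keyframe */

      /* Bitstreams older than 3.2.1 stamp the frame about to be
       * displayed rather than the end of the packet, so bump the
       * count by one to match the newer convention, the same
       * adjustment as the hunk above. */
      if (version < 0x030201)
          iframe++;

      return iframe + pframe; /* multiply by the frame duration for a pts */
  }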

> Index: libavformat/oggdec.c
> ===================================================================
> --- libavformat/oggdec.c	(revision 12969)
> +++ libavformat/oggdec.c	(working copy)
> @@ -468,15 +468,34 @@ ogg_get_length (AVFormatContext * s)
>  
>      ogg->size = size;
>      ogg_restore (s, 0);
> +    if (idx == -1)
> +        return 0;
> +
>      ogg_save (s);
> +
> +    // Read the first stream packet if not already done
> +    if (idx != ogg->curidx) {
> +        while (!ogg_read_page (s, &i)) {
> +            if (i == idx && ogg->streams[i].granule != -1)
> +                break;
> +        }
> +    }
> +
> +    // Read the second stream packet
>      while (!ogg_read_page (s, &i)) {
>          if (i == idx && ogg->streams[i].granule != -1 && ogg->streams[i].granule != 0)
>              break;
>      }
> +
>      if (i == idx) {
> -        s->streams[idx]->start_time = ogg_gptopts (s, idx, ogg->streams[idx].granule);
> +        // Calculate start time assuming the same duration for the two packets

That's one hell of an assumption.

> +        ogg_stream_t *os = ogg->streams + idx;
> +        int64_t pkt2_pts = ogg_gptopts(s, idx, os->lastgp);
> +        int64_t duration = ogg_gptopts(s, idx, os->granule) - pkt2_pts;
> +        s->streams[idx]->start_time = FFMAX(0, pkt2_pts - duration);
>          s->streams[idx]->duration -= s->streams[idx]->start_time;
>      }
> +
>      ogg_restore (s, 0);
>  
>      return 0;
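
To spell out the approximation: if the first two granule positions
map to pts values of, say, 40 ms and 80 ms, the hunk extrapolates a
start time of 40 - (80 - 40) = 0 ms, clamped at zero by the FFMAX().
Any jitter in packet duration at the head of the stream skews the
result accordingly.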

The Ogg spec actually mandates a start time of zero for all streams:

  The position specified is the total samples encoded after including
  all packets finished on this page (packets begun on this page but
  continuing on to the next page do not count). The rationale here is
  that the position specified in the frame header of the last page
  tells how long the data coded by the bitstream is. A truncated
  stream will still return the proper number of samples that can be
  decoded fully.
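
Taken at face value, that reduces the whole exercise to something
like this (a sketch of the consequence, not code from the tree;
last_granule stands for whatever granulepos the final page carried):

  /* Per the spec, a stream starts at zero and the pts of the final
   * page's granulepos is the total duration. */
  s->streams[idx]->start_time = 0;
  s->streams[idx]->duration   = ogg_gptopts(s, idx, last_granule);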

-- 
Måns Rullgård
mans at mansr.com
