[Ffmpeg-devel] Problems with output picture reordering in H.264 decoder

Santa Cruz, Diego GE Indust, Security diego.santacruz
Thu Aug 10 12:20:52 CEST 2006

> -----Original Message-----
> From: ffmpeg-devel-bounces at mplayerhq.hu 
> [mailto:ffmpeg-devel-bounces at mplayerhq.hu] On Behalf Of Loren Merritt
> Sent: Thursday, August 10, 2006 11:33
> To: FFMpeg development discussions and patches
> Subject: RE: [Ffmpeg-devel] Problems with output picture reordering
> in H.264 decoder
> On Thu, 10 Aug 2006, Santa Cruz, Diego (GE Indust, Security) wrote:
> > I guess the best solution would be to limit the amount of reordering
> > by using the num_reorder_frames if present and use the
> > dpb_output_delay in the SEI picture timing to know when to output
> > each picture and the POC to actually reorder the pictures. I think
> > that is the way it is supposed to work in H.264. Without the
> > dpb_output_delay a decoder can only guarantee output order
> > conformance but not *timing*, so it would not be possible to
> > guarantee a "no jerk" operation if dpb_output_delay is not used.
> No SEIs are necessary. Picture timing SEIs are useful only if you're
> storing them in an elementary stream or other container that doesn't
> keep adequate timestamps; in any real container they're just
> redundant.
> (Well, picture timing SEI can also be used for soft-telecine, but
> that's not really relevant to this discussion.)

Well, the dpb_output_delay helps when num_reorder_frames is missing
or has a value larger than the minimum required for correct decoding,
but if no encoder outputs it I guess it is not of much use :-(

> All that is needed to guarantee perfect "no jerk" operation is to
> have a correct value of num_reorder_frames, and disable ffh264's
> autodetection thereof.
> If you have that, then the correct algorithm is:
>    Keep a buffer of size=(num_reorder_frames+1).
>    Put each newly decoded frame into the buffer.
>    Whenever the buffer is full, remove the frame with the lowest
>    {idr_pic_id,poc}
>    (where idr_pic_id is a monotonically increasing value, not
>     literally the variable idr_pic_id from the standard).
>    This gives you the pictures in display order, but with no
>    timestamps.
>    The demuxer should provide a PTS of each frame; use those instead
>    of any timing SEI from the h264 stream.

Agree with that, as long as the encoder correctly sets
num_reorder_frames (which any reasonable encoder should do BTW).
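
As an illustration only, the buffer algorithm quoted above could be
sketched in Python along these lines. This is a sketch under stated
assumptions, not the code in h264.c: the function name reorder, the
(idr_count, poc, label) tuple layout and the use of a heap are all
hypothetical choices made for the example, and idr_count stands in for
the "monotonically increasing value" mentioned in the quote.

```python
import heapq
from typing import Iterable, Iterator, List, Tuple

# Hypothetical frame record: (idr_count, poc, label).
# idr_count is assumed to increase by one at every IDR picture, so that
# tuple comparison orders frames first by IDR period, then by POC.
Frame = Tuple[int, int, str]


def reorder(decoded: Iterable[Frame], num_reorder_frames: int) -> Iterator[Frame]:
    """Yield frames in display order using a buffer of num_reorder_frames + 1."""
    size = num_reorder_frames + 1
    buffer: List[Frame] = []
    for frame in decoded:
        # Put each newly decoded frame into the buffer.
        heapq.heappush(buffer, frame)
        # Whenever the buffer is full, remove the lowest (idr_count, poc).
        if len(buffer) == size:
            yield heapq.heappop(buffer)
    # At end of stream, flush what is left in display order.
    while buffer:
        yield heapq.heappop(buffer)


# Decode order IDR, P, B, B with POCs 0, 6, 2, 4 and one reorder frame.
stream = [(0, 0, "IDR"), (0, 6, "P"), (0, 2, "B"), (0, 4, "B")]
display_pocs = [poc for _, poc, _ in reorder(stream, num_reorder_frames=1)]
print(display_pocs)  # [0, 2, 4, 6]
```

With num_reorder_frames=1 the first frame is held back once, and after
that every decoded frame pushes exactly one frame out, which is the
constant-delay behaviour described in the quote. Timestamps are left
out on purpose, since per the quote they should come from the demuxer.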

> > BTW, if I understood well the reordering algorithm currently in
> > h264.c an IDR, P, B, B stream ... (with POCs 0, 6, 2, 4) that is
> > decoded frame by frame will output the IDR right away and then
> > output no picture (since it detects the POC gap from 0 to 6) and
> > then output the pictures normally. This already creates a jerk
> > (although small), but correct me if I misunderstood.
> Yes, that's what it does now if the stream does not specify
> num_reorder_frames.
> If it does so with num_reorder_frames too, that would be a bug in the
> implementation (failing to disable autodetection), not a limitation
> of the algorithm.

I will test that as soon as I can and report if I find any problems.

> > Would relying on the dpb_output_delay be reasonable? Does x264
> > output them?
> I don't remember ever seeing an h264 stream that had picture timing
> SEIs.
> Not that I would have noticed if I were just playing the video; lavc
> silently drops SEIs it doesn't understand. But I haven't found any in
> the streams I have analysed for debugging either.
> IMHO, it is reasonable to rely on num_reorder_frames if the codec
> wants to do any funny reordering, i.e. anything other than
> conventional B-frames.

I am not sure I correctly understand what you are saying. My
interpretation would be the following. If num_reorder_frames is set in
the SPS, the amount of delay between input and output should be driven
by that (without any autodetection) and the actual reordering driven by
the POC. In cases where num_reorder_frames is not present, the
autodetection should be whatever is most suitable for the streams
currently out there. I'm afraid I do not know enough about other
encoders to say what is best here.

I will try to check the code to see if this is what is done and come
back if not (but I will be off for a week or so before that...)


Diego Santa Cruz, PhD
GE Security
Engineered Systems EMEA
Research Scientist

T +41 21 695 0019
F +41 21 695 0001
E diego.santacruz at ge.com

Victoria House
Route de la Pierre, 22
1024 Ecublens
VisioWave sarl 
