[FFmpeg-devel] [PATCH] AVCHD/H.264 parser: determination of frame type, question about timestamps

Ivan Schreter schreter
Sun Feb 1 01:17:24 CET 2009

Michael Niedermayer wrote:
> On Mon, Jan 26, 2009 at 08:42:17AM +0100, Ivan Schreter wrote:
> [...]
>> We have a stream with pictures containing (T1/B1/T2==T1), (B2/T3/B3==B2) 
>> fields. That's two H.264 pictures, but 3 frames. Each av_read_frame() 
>> should return a packet containing exactly a single frame. But we have just 
>> 2 packets, which need to be returned in 3 calls to av_read_frame(), 
>> according to API. Further, the DTS must be set correctly as well for the 
>> three AVPackets in order to get the timing correct. How do you want to 
>> handle this?
> i don't see where you get 3 calls of av_read_frame(),
> there are 2 or 4 access units not 3 unless one is coded as 2 fields
> and 1 is a frame
No, we don't have 3 calls. I meant two pictures which generate 3 frames 
on display. But the caller calls av_read_frame() only twice, so he 
doesn't get all the required timestamps. The first decoded frame will 
have timestamp 1, the second decoded frame timestamp 2 or 3, depending 
on how it's handled in the H.264 decoder. One frame has to be added in 
between, with fields from both decoded frames. This is currently not 
possible to express.

I suppose there is a need for something like repeat_pict at the field 
level, but this means an API change.
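
To illustrate the field pairing described above, here is a minimal sketch in Python. It is not FFmpeg code: the function name and the (parity, source) tuples are hypothetical and only model which coded picture each displayed field comes from.

```python
# Hypothetical sketch: names and structure are illustrative, not FFmpeg API.

def display_frames(pictures):
    """Pair consecutive fields, in output order, into displayed frames.

    Each picture is a list of (parity, source) fields, where parity is
    "T" or "B" and source names the coded picture the field came from.
    """
    fields = [field for picture in pictures for field in picture]
    # Two consecutive fields make up one displayed frame.
    return [tuple(fields[i:i + 2]) for i in range(0, len(fields) - 1, 2)]


# (T1/B1/T2==T1): the third field repeats the top field of picture 1.
pic1 = [("T", "pic1"), ("B", "pic1"), ("T", "pic1")]
# (B2/T3/B3==B2): the third field repeats the bottom field of picture 2.
pic2 = [("B", "pic2"), ("T", "pic2"), ("B", "pic2")]

frames = display_frames([pic1, pic2])
for number, frame in enumerate(frames, start=1):
    print(number, frame)
```

Two pictures go in and three frames come out. The middle frame takes its top field from the first picture and its bottom field from the second, which is the case that two av_read_frame() calls cannot describe.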

>> And as already mentioned, the case with (T1), (B1), (T2), (B2), we are 
>> returning 4 packets via av_read_frame() for 2 frames, which is against 
>> API. How to handle this? My idea was delaying return from h264_parse, 
>> until second field also parsed
> well, just consider the example that timestamps are always associated with
> the second field instead of the first. You couldn't associate them with the
> AVPackets
I don't believe anyone would produce such streams. Anyway, the standard 
_requires_ DTS/PTS coding for all frames having DTS != PTS, so even in 
this case, I- and P-slices would have to have timestamps. The timestamps 
of B-slices in between can be computed.

In this case my camcorder probably produces a non-conformant stream. 
(T1) is an I-slice with DTS/PTS, (B1) is a P-slice referring to (T1) 
without DTS/PTS (though to be correct it should have them, since they 
are definitely not equal), (T2), (B2) are B-slices for the second frame 
without DTS/PTS (OK), and the next pair of P-slices again has DTS/PTS 
for the top field, but not for the bottom one (which is IMHO 
non-conformant).
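
As a rough sketch of how the missing timestamps could be computed for a stream laid out like this, assuming each field is its own packet and every packet follows the previous one in decode order by one constant field duration (my assumption, not something the stream guarantees). The function name and the numbers are hypothetical.

```python
def fill_missing_dts(dts_values, duration):
    """Fill None entries by adding `duration` to the previous DTS.

    dts_values is in decode order; None marks a packet with no coded DTS.
    """
    filled = []
    for dts in dts_values:
        if dts is None:
            if not filled:
                raise ValueError("first access unit needs a coded DTS")
            # Assumption: constant spacing between consecutive packets.
            dts = filled[-1] + duration
        filled.append(dts)
    return filled


# Assumed example: 90 kHz clock, 50 fields per second -> 1800 ticks per field.
# Layout: (T1) coded, (B1), (T2), (B2) missing, next top coded, bottom missing.
coded = [3600, None, None, None, 10800, None]
print(fill_missing_dts(coded, 1800))
```

This only covers DTS. PTS for the reordered B-slices would additionally need the picture order, which the sketch does not model.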


