[FFmpeg-devel] [PATCH] H.264/AVCHD interlaced fixes

Ivan Schreter schreter
Tue Feb 10 15:29:51 CET 2009

Michael Niedermayer wrote:
> On Mon, Feb 09, 2009 at 09:39:27PM +0100, Ivan Schreter wrote:
>> Michael Niedermayer wrote:
>>> On Mon, Feb 09, 2009 at 01:01:53AM +0100, Ivan Schreter wrote:
>>>> [...]
>>>> In that case, however, each second field frame must get the same 
>>>> timestamp as first field frame it is paired with. I.e., lavf's utils.c 
>>>> must somehow know it is handling the second field of the same frame and 
>>>> assign same DTS/PTS as first field (or offset by 1/2 frame duration in 
>>>> case of interlaced video).
>>>> [...]
>>> we should support 2 fields in one buffer either way
>>> what i want is correct and full support of h264 in h222 timestamp
>>> interpolation
>> In that case, is the above the solution (at least for the common case)? 
>> If yes, I'll send you a patch. I suppose it should be enough - DTS/PTS 
>> computation for full frames does work, and even in a field-picture 
>> stream the current formulas still hold for the first field of a frame - 
>> problematic is just the second field. Right?
> as I've said already, you have guaranteed timestamps once every 0.7 seconds
> this can be always on the first field or always on the second or randomly
> the current code cannot interpolate timestamps for h264, building code on
> the assumption that the current code would work for the first field seems
> strange. But you can point me to the section of H222/264 that supports
> your theory about consecutive field timestamp relation.
I didn't find any section about PTS/DTS, or for that matter about 
timestamps in general (except the clock timestamp in SEI picture timing), 
in the H.264 standard. Regarding field pictures in the H.222.0 standard, 
a PTS is assigned to a "presentation unit", which may be a frame or a 
field picture. A note on field pictures says, though: "... for field 
pictures the presentation time refers to the first field picture of the 
coded frame". I infer that the timestamp for the second field picture is 
thus equal to that of the first field picture in the same frame.

Did you find anything in the standard which describes how PTS/DTS for 
field pictures is to be computed? I didn't, and I searched quite hard.

Further, the timestamps for a frame/field without an explicit PTS/DTS 
must be inferable from previous timestamps without knowing the future, 
right? So even if a timestamp were present only on the second field 
(which, according to the single note regarding field pictures, doesn't 
seem to be the case), we'd have to be able to infer the timestamp of the 
first field preceding it in the stream without knowing the second 
field's timestamp (or have a crystal ball, were it not the case). 
Therefore I'm saying that, disregarding second fields completely, we 
could compute all timestamps for first fields even if we only had a 
timestamp on the very first (I-)field. Whether the current code can do 
even this correctly is another question.

Timestamps in the stream once every 0.7 seconds are, as far as I 
understand it, there for resync and error resiliency, as well as for 
streaming, seeking, etc.

As for inferring the timestamp of the second field when the timestamp of 
the first field is known, there are only two possibilities: either the 
PTS/DTS matches the first field (at least for progressive content; this 
also coincides with the note on field pictures in the H.222.0 standard), 
or it is offset by 1/2 frame duration (interlaced; but not found in any 
standard).

I'm even starting to believe that av_read_frame should return a DTS/PTS 
only after both field pictures of a frame have been read, and no 
timestamps otherwise.

But again, I'm thinking practically: a patch which addresses the problem 
95% (and surely doesn't break anything) is better than no solution at 
all. It can later be extended/corrected to handle it 100%, but I haven't 
yet seen a stream where there was a DTS/PTS on the second field of a 
frame. If there is one at all, it's always on the first field. So yes, 
there is a hypothetical case where it won't work, and this can be 
addressed when someone wants to decode such a stream => he can make a 
patch to add the functionality. That's the evolutionary approach, which 
I'd prefer over spending hundreds of hours thinking about all 
possibilities (especially since my time is now extremely limited).

BTW, I also noticed that the resulting picture returned from h264 
decode_frame doesn't have pts set (it is 0). Other formats seem to 
return invalid values there as well (e.g., avi always increments for 
each frame, even after a seek). I take it this is completely broken, 
right? So the application doesn't really have a chance to associate a 
decoded picture with a PTS.
