[FFmpeg-devel] Remaining problems in H.264 handling
Sat Mar 28 23:22:06 CET 2009
Michael Niedermayer wrote:
> On Fri, Mar 27, 2009 at 07:56:59PM +0100, Ivan Schreter wrote:
>> IMHO, frame rate for H.264 (and probably also MPEG video) should be always
>> set to 1/2 tbc (and set to reliable, if timing info is provided), since
>> this maps the best to all possible combinations of picture structures. This
>> would also most probably make the first sample work without problem. But I
>> didn't do enough convinction work to convince Michael yet ;-).
> convince me that the frame rate is 1/2 tbc?
> if tbc is 1/90000 you want 45000 as framerate ?
> if tbc is 1/60 on a telecined video you want it to be 1/30?
> this is not about convincing, it's about starting out with ambiguous terms
> and ending with nonsense
Yes, the terms were ambiguous, sorry for that. I just wanted to write
something quickly, not propose any solution yet, just state the fact
that I have a different opinion.
> 1. time base and frame rate are 2 separate things.
> 2. there is no frame rate field, and i repeat like i did many times in the
> past that people CANNOT hijack a timebase or the r_frame_rate field
> and set them to the frame rate.
> if you want a frame rate field that has to be added as a new field.
I don't have sources on this computer and I don't remember exactly, so
I'm not going to write proper member names here.
However, AFAIK we have the following: the time base of the stream (90kHz
for MPEG-TS), which is rather uninteresting (it's just a scaling constant),
the timestamp rate (50Hz) and the actual video full frame rate (25fps).
The problem: the video full frame rate is currently determined by the
"unreliable" handling, which considers timestamps of *packets* instead of
*frames*, with packet timestamps expressed in 50Hz units. Thus, video
consisting of field pictures (2 field pictures per video frame) gets a
frame rate of 50fps instead of the correct 25fps. Video consisting of
frame pictures gets the correct frame rate of 25fps. Video consisting of
"frame doubling" pictures gets a frame rate of 12.5fps.
However, in all the aforementioned cases, the correct video full frame rate
is actually 25fps (i.e., 1/2 of the timestamp rate). Therefore I believe the
frame rate of H.264 video (and most probably also MPEG video) should be
computed as 1/2 of the timestamp rate, if known, instead of using the
"unreliable" packet-based handling described above.
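To make the three cases concrete, here is a minimal sketch in Python. It is not FFmpeg code: the structure labels and function names are made up for illustration, and 50Hz is simply the timestamp rate from the example above. It compares the rate you get by looking at packet spacing with the rate you get from the 1/2 rule.

```python
from fractions import Fraction

TIMESTAMP_RATE = 50  # Hz, the example value used in this mail

# How many timestamp ticks one coded picture (one packet) lasts.
# The labels are illustrative only, not FFmpeg identifiers.
TICKS_PER_PICTURE = {
    "field": 1,           # a single field, i.e. half a frame
    "frame": 2,           # a full frame (two fields)
    "frame_doubling": 4,  # a full frame shown twice
}


def packet_based_rate(structure):
    """Rate obtained by treating every packet as one frame."""
    return Fraction(TIMESTAMP_RATE, TICKS_PER_PICTURE[structure])


def half_timestamp_rate():
    """Rate obtained from the proposed rule: 1/2 of the timestamp rate."""
    return Fraction(TIMESTAMP_RATE, 2)


if __name__ == "__main__":
    for structure in TICKS_PER_PICTURE:
        print(structure, float(packet_based_rate(structure)),
              float(half_timestamp_rate()))
```

The packet-based value changes with the picture structure, while the 1/2 rule gives the same value for all three.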
You might object that this is incorrect, since with frame doubling, we
actually have 12.5fps. Yes, we do. But imagine a video starting with 200
frame doubling pictures, then the rest is normal frames. Well, the rest
is 25fps... So the whole stream should be treated as 25fps and not
12.5fps, with frames doubled as requested by the picture structure (we have
repeat_pict for that). Such switching of picture structure is completely
normal in TV broadcast.
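The mixed stream from this example can be sketched as follows (Python, illustrative only). I am assuming here that repeat_pict counts extra half-frame periods, so that frame doubling corresponds to repeat_pict = 2; that is my understanding of the field, but it should be checked against the sources. The helper names are hypothetical.

```python
from fractions import Fraction

TIMESTAMP_RATE = 50                         # Hz, example value from this mail
FRAME_RATE = Fraction(TIMESTAMP_RATE, 2)    # fixed rate from the 1/2 rule


def duration_ticks(repeat_pict):
    """Display duration of one frame picture in timestamp ticks.

    Assumption: a plain frame lasts 2 ticks and repeat_pict adds
    that many extra half-frame periods (ticks).
    """
    return 2 + repeat_pict


def presentation_times(repeat_picts):
    """Return (start time of each picture in seconds, total duration)."""
    ticks = 0
    starts = []
    for repeat_pict in repeat_picts:
        starts.append(Fraction(ticks, TIMESTAMP_RATE))
        ticks += duration_ticks(repeat_pict)
    return starts, Fraction(ticks, TIMESTAMP_RATE)


# 200 frame doubling pictures followed by 100 normal frames.
stream = [2] * 200 + [0] * 100
starts, total = presentation_times(stream)

if __name__ == "__main__":
    print("total seconds:", total)
    print("display slots at fixed rate:", total * FRAME_RATE)
```

The stream keeps one fixed rate throughout; only the per-picture duration changes where the picture structure switches.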
Similarly, picture structure top-bottom-top and bottom-top-bottom
actually code three 25fps frames in two pictures, repeating half frames
only (and constructing the frame "in the middle" from the other two
frames). We cannot communicate this to the application yet, so it would
be handled like doubling every second frame.
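As a rough illustration of the "three frames in two pictures" case, the following Python sketch expands pictures into fields and pairs the fields into display frames. The structure labels are modelled on the picture structure names but are only labels here, and display_frames is a hypothetical helper, not something from FFmpeg.

```python
# Field order emitted by each picture structure (T = top, B = bottom).
FIELD_ORDER = {
    "top_bottom": "TB",
    "bottom_top": "BT",
    "top_bottom_top": "TBT",
    "bottom_top_bottom": "BTB",
}


def display_frames(pictures):
    """Pair consecutive fields into display frames.

    pictures is a list of (name, structure) tuples. Each returned frame
    is a pair of (picture name, field parity) entries. A trailing
    unpaired field, if any, is dropped.
    """
    fields = [(name, parity)
              for name, structure in pictures
              for parity in FIELD_ORDER[structure]]
    return [tuple(fields[i:i + 2]) for i in range(0, len(fields) - 1, 2)]


frames = display_frames([("A", "top_bottom_top"),
                         ("B", "bottom_top_bottom")])

if __name__ == "__main__":
    for frame in frames:
        print(frame)
```

The middle frame takes its top field from picture A and its bottom field from picture B, which is the frame "in the middle" mentioned above.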
The whole idea of all those adventurous picture structure types was
invented to represent other frame rates within the fixed frame rate of a
television display (telecine). So in my opinion, we should handle it in
the same way - we have a fixed frame rate of 1/2 the timestamp rate, and
the picture structure just describes how to distribute frames/fields onto
full frames of this fixed frame rate.
I hope it makes some more sense now...