[FFmpeg-devel] [PATCH] H.264/AVCHD interlaced fixes

Sat Jan 31 16:30:01 CET 2009

Hi *,

here is a patch that corrects some problems when working with interlaced
H.264/AVCHD files. It addresses two primary problems:
  - lack of key frame support in the H.264 decoder (this is also
    relevant for progressive streams)
  - proper handling of field pictures in interlaced mode (two fields
    coded as separate pictures instead of one picture per frame)

Please review it and, if possible, apply it as-is. Note that I won't have
much time to fix anything in the near future, as my first child is going
to be born in the next couple of days. So it would be good to get this
fix through before that happens :-)

How does it work?

To support key frames, the SEI recovery point message is decoded and the
recovery frame count is stored in the context. This is then used to set
the key frame flag in the decoding process. In the parser, it is used to
communicate a (synthetic) picture type to libavformat's av_read_frame()
routine, so that it correctly sets the key flag and computes missing
PTS/DTS timestamps. The proper fix would be to extend the appropriate
context structure with a key frame flag and recovery frame count, but I
didn't want to do this, since a) it would potentially break binary
compatibility and b) one would have to fix it in all parsers.
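
As an illustration only (this is not the code from the attached patch),
reading the recovery frame count from a recovery point SEI payload could
look roughly like the sketch below. BitReader, SeiState and all function
names are hypothetical and not FFmpeg's API. The payload is assumed to be
already stripped of emulation prevention bytes, and treating "IDR or
recovery point seen" as the key frame condition is my assumption.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical minimal bit reader (not FFmpeg's GetBitContext). */
typedef struct {
    const uint8_t *data;
    size_t size_bits;   /* total number of readable bits */
    size_t pos;         /* current bit position */
} BitReader;

/* Read one bit, MSB first. Returns -1 when the data is exhausted. */
static int read_bit(BitReader *br)
{
    if (br->pos >= br->size_bits)
        return -1;
    int bit = (br->data[br->pos >> 3] >> (7 - (br->pos & 7))) & 1;
    br->pos++;
    return bit;
}

/* Read an unsigned Exp-Golomb code, ue(v). Returns -1 on error.
 * Codes with more than 16 leading zeros are rejected, which is enough
 * for recovery_frame_cnt and keeps the value inside an int. */
static int read_ue(BitReader *br)
{
    int zeros = 0, bit;

    while ((bit = read_bit(br)) == 0) {
        if (++zeros > 16)
            return -1;
    }
    if (bit < 0)
        return -1;

    int value = 1;
    for (int i = 0; i < zeros; i++) {
        bit = read_bit(br);
        if (bit < 0)
            return -1;
        value = (value << 1) | bit;
    }
    return value - 1;
}

/* Hypothetical per-stream state; -1 means "no recovery point seen". */
typedef struct {
    int recovery_frame_cnt;
} SeiState;

/* Parse the start of a recovery point SEI payload (payloadType 6).
 * Only the first syntax element, recovery_frame_cnt, is read here. */
static int parse_recovery_point(SeiState *s, const uint8_t *payload,
                                size_t size)
{
    BitReader br = { payload, size * 8, 0 };
    int cnt = read_ue(&br);

    if (cnt < 0)
        return -1;
    s->recovery_frame_cnt = cnt;
    return 0;
}

/* Assumed rule: a picture is a key frame if it is an IDR picture or
 * if a recovery point SEI has been seen for it. */
static int is_key_frame(const SeiState *s, int is_idr)
{
    return is_idr || s->recovery_frame_cnt >= 0;
}
```

A caller would reset recovery_frame_cnt to -1 at the start of each access
unit, so that a recovery point only marks the picture it belongs to.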

To support interlaced mode, it was necessary to collate the two field
pictures into one buffer to fulfill the av_read_frame() contract, i.e.,
returning one whole frame per call. This is done in h264_parser.c by
testing the picture structure and, if a field picture is detected,
waiting until the rest of the frame (the second field) comes in before
returning a buffer.
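
The collation idea can be sketched as a tiny state machine. This is a
simplified stand-alone illustration, not the code in h264_parser.c:
FieldCollator, collate_push() and the fixed-size buffer are hypothetical,
and the sketch does not check field parity at all, it only waits for a
second field.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum PicStructure { PIC_FRAME, PIC_TOP_FIELD, PIC_BOTTOM_FIELD };

#define MAX_FRAME_BYTES 64   /* tiny on purpose; illustration only */

/* Hypothetical collator state, one per stream. */
typedef struct {
    uint8_t buf[MAX_FRAME_BYTES];
    size_t  len;         /* bytes collected for the current frame */
    int     have_first;  /* 1 while a first field waits for its partner */
} FieldCollator;

/* Feed one coded picture.
 * Returns the number of bytes of a complete frame now in c->buf,
 * 0 if the collator is still waiting for the second field,
 * or -1 if the data does not fit into the buffer. */
static int collate_push(FieldCollator *c, enum PicStructure ps,
                        const uint8_t *data, size_t size)
{
    /* A frame picture or a first field starts a fresh output buffer. */
    if (ps == PIC_FRAME || !c->have_first)
        c->len = 0;

    if (c->len + size > MAX_FRAME_BYTES) {
        c->len = 0;
        c->have_first = 0;
        return -1;
    }
    memcpy(c->buf + c->len, data, size);
    c->len += size;

    if (ps == PIC_FRAME) {       /* progressive: return immediately */
        c->have_first = 0;
        return (int)c->len;
    }
    if (!c->have_first) {        /* first field: keep it and wait */
        c->have_first = 1;
        return 0;
    }
    c->have_first = 0;           /* second field: frame is complete */
    return (int)c->len;
}
```

The caller hands a buffer on to the next stage only when the return value
is positive, so one returned buffer always corresponds to one whole frame.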

Additionally, in h264.c, the current decoding implementation would try
to decode such two-picture buffers in one go without looking at
delimiters. So decode_nal_units() now stops when it encounters NAL_AUD
and gives control back to the caller (decode_frame()). After decoding
the first field picture, decode_frame() checks whether there is
something more in the buffer (i.e., the second field picture). This
should always be the case, since the parser delivers both field
pictures. The decoding process is then restarted with the second field
picture.
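
To illustrate the splitting step, here is a minimal sketch that finds
where the first picture ends in an Annex B byte stream buffer by looking
for an access unit delimiter (nal_unit_type 9). It is not the actual
decode_nal_units() change: next_start_code(), first_picture_size() and
count_pictures() are hypothetical helpers, and the sketch assumes that
every picture after the first one starts with an AUD.

```c
#include <stddef.h>
#include <stdint.h>

#define NAL_AUD 9   /* nal_unit_type of the access unit delimiter */

/* Return the offset of the next 00 00 01 start code at or after
 * `from`, or `size` if there is none. */
static size_t next_start_code(const uint8_t *buf, size_t size, size_t from)
{
    for (size_t i = from; i + 3 <= size; i++)
        if (buf[i] == 0 && buf[i + 1] == 0 && buf[i + 2] == 1)
            return i;
    return size;
}

/* Return how many bytes belong to the first picture in buf: everything
 * up to the start code of the first AUD that is not the very first NAL
 * unit, or `size` if no such AUD exists. With 4-byte start codes the
 * extra leading zero byte stays with the previous picture. */
static size_t first_picture_size(const uint8_t *buf, size_t size)
{
    size_t pos = next_start_code(buf, size, 0);
    int seen_nal = 0;

    while (pos + 3 < size) {
        int type = buf[pos + 3] & 0x1F;   /* low 5 bits of NAL header */
        if (type == NAL_AUD && seen_nal)
            return pos;
        seen_nal = 1;
        pos = next_start_code(buf, size, pos + 3);
    }
    return size;
}

/* Caller-side loop: keep going while data is left in the buffer.
 * A real decoder would decode each chunk instead of counting it. */
static int count_pictures(const uint8_t *buf, size_t size)
{
    int n = 0;

    while (size > 0) {
        size_t used = first_picture_size(buf, size);
        n++;
        buf  += used;
        size -= used;
    }
    return n;
}
```

For a buffer holding two field pictures, the loop runs twice: once for
the bytes before the second AUD and once for the rest.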

There is one open point, though: although top field pictures seem to
always precede their matching bottom field pictures, the standard does
not mandate this. The current implementation relies on it. This is a
workaround for a bug in av_seek_frame(), which (at least for mpegts)
repositions the stream wildly, ignoring key frames, and so potentially
lands in between the two field pictures of one frame. This causes
serious de-sync of the decoder. As a workaround, bottom fields
encountered without a preceding matching top field are currently ignored
to re-sync the decoder.

There is a better workaround for this problem using the frame number:
both field pictures of a frame must share the same frame number. I'll
possibly implement it later, as it makes sense not only as a workaround
for the seeking bug, but also as an error resilience feature for broken
streams.
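
A rough sketch of what such frame-number-based pairing might look like
follows. It is not implemented in the patch; FieldPairer, pair_field()
and the result codes are hypothetical. It treats two fields as one frame
only if they have opposite parity and the same frame number, and reports
a pending field without a partner as an orphan. A real implementation
would likely need further checks, so take it as a heuristic.

```c
enum PicStructure { PIC_FRAME, PIC_TOP_FIELD, PIC_BOTTOM_FIELD };

enum PairResult {
    PAIR_WAIT,            /* first field stored, waiting for partner */
    PAIR_COMPLETE,        /* a whole frame is available */
    PAIR_DROPPED_ORPHAN   /* pending field had no partner and was dropped */
};

/* Hypothetical pairing state, one per stream. */
typedef struct {
    int pending;               /* 1 if a first field is waiting */
    enum PicStructure parity;  /* parity of the waiting field */
    int frame_num;             /* frame number of the waiting field */
} FieldPairer;

/* Feed the picture structure and frame number of one coded picture. */
static enum PairResult pair_field(FieldPairer *p, enum PicStructure ps,
                                  int frame_num)
{
    if (ps == PIC_FRAME) {
        /* A frame picture is complete on its own; forget any leftover. */
        p->pending = 0;
        return PAIR_COMPLETE;
    }

    if (p->pending && p->frame_num == frame_num && p->parity != ps) {
        /* Opposite parity and same frame number: the fields match. */
        p->pending = 0;
        return PAIR_COMPLETE;
    }

    /* No match: a pending field (if any) is an orphan. The new field
     * becomes the first field of the next candidate pair. */
    int had_orphan = p->pending;
    p->pending   = 1;
    p->parity    = ps;
    p->frame_num = frame_num;
    return had_orphan ? PAIR_DROPPED_ORPHAN : PAIR_WAIT;
}
```

After a bad seek that lands between two fields, the stray field is held
once and then dropped as soon as a field with a different frame number
arrives, independent of whether the stream is top or bottom field first.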

The seeking bug in mpegts also needs to be addressed (separately).
That is then the last problem to solve in order to use AVCHD camcorders
with Linux video editors like kdenlive. I'll look at it and write a
separate post.

Regards,

Ivan

-------------- next part --------------
A non-text attachment was scrubbed...
Name: h264_patch.diff
Type: text/x-patch
Size: 27337 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20090131/f3efaeb3/attachment.bin>