[FFmpeg-devel] [PATCH] H.264/AVCHD interlaced fixes

Ivan Schreter schreter
Sun Feb 8 15:20:14 CET 2009


Hi,

Michael Niedermayer wrote:
> On Sun, Feb 08, 2009 at 10:07:23AM +0100, Ivan Schreter wrote:
>   
>>> [...]  
>>>       
>>>> +
>>>> +    return 0;
>>>> +}
>>>> +
>>>>  int ff_h264_decode_sei(H264Context *h){
>>>>      MpegEncContext * const s = &h->s;
>>>>  @@ -6873,6 +6873,10 @@
>>>>              if(decode_unregistered_user_data(h, size) < 0)
>>>>                  return -1;
>>>>              break;
>>>> +        case SEI_TYPE_RECOVERY_POINT:
>>>> +            if(decode_recovery_point(h) < 0)
>>>> +                return -1;
>>>> +            break;
>>>>          default:
>>>>              skip_bits(&s->gb, 8*size);
>>>>          }
>>>> @@ -7340,6 +7344,8 @@
>>>>      int context_count = 0;
>>>>       h->max_contexts = avctx->thread_count;
>>>> +    h->sei_recovery_frame_cnt = -1;
>>>> +
>>>>  #if 0
>>>>      int i;
>>>>      for(i=0; i<50; i++){
>>>> @@ -7676,6 +7682,14 @@
>>>>          } else {
>>>>              cur->repeat_pict = 0;
>>>>
>>>> +            /*
>>>> +             * TODO: Currently, we only communicate the fact we have a 
>>>> key
>>>> +             * frame, but there is no possibility to communicate 
>>>> recovery
>>>> +             * frame count. Most probably, recovery frame count is 
>>>> anyway 0.
>>>> +             * Add this in the future.
>>>> +             */
>>>> +            cur->key_frame = (h->sei_recovery_frame_cnt >= 0);
>>>>     
>>>>         
>>> this could set key_frame == 0 for IDR, which is wrong
>>>   
>>>       
>> Correct, my mistake. This should fix it:
>>
>> @@ -7422,6 +7445,7 @@
>>                 return -1;
>>             }
>>             idr(h); //FIXME ensure we don't loose some frames if there is 
>> reordering
>> +            h->sei_recovery_frame_cnt = 0;
>>         case NAL_SLICE:
>>             init_get_bits(&hx->s.gb, ptr, bit_length);
>>             hx->intra_gb_ptr=
>>
>> OK now?
>>     
>
> now its a hack
> i mean idr != sei_recovery_frame_cnt
>
>   
True, but IDR == key frame, so implicitly it has recovery frame count == 
0. So this will correctly detect it. How else would you do it?

h->sei_recovery_frame_cnt >= 0 iff it's a key frame. It's a keyframe, 
iff h->sei_recovery_frame_cnt >= 0. The only thing which might 
theoretically break it is a SEI recovery point associated with IDR frame 
having recovery_frame_cnt > 0. But this is not logical, since IDR frame 
is the begin of coding sequence, so it must be decodable without 
depending on previous frames. Therefore, I think it's OK so and not a hack.

>>> also it does not set the key_frame flag if there is no decode but just a
>>> AVParser
>>>   
>>>       
>> I don't quite understand what you mean. Parser is addressed by patch #6, 
>> which handles both field combining and key frame flag. It was very hard to 
>> split it in two patches, therefore I kept it in one.
>>     
>
> i didnt review #6 as it was big & scary IIRC
>   
Umm... It is a little bigger, but not _that_ big. And it's pretty 
straightforward - after finding a complete packet, this packet is parsed 
via parse_nal_units(), which is nothing else but the code I got from 
you, corrected and extended by a few things (namely reading field type 
and frame number). In parser itself, it then looks if it's an unpaired 
field picture, and if so, it will not return the packet yet, but note 
down the position in buffer and relevant picture type info, wait for 
next packet and pair it with it. As error handling, if an unpaired field 
comes, it will be discarded, since it can't be combined anyway into a 
full frame.

I have commented the code IMHO quite extensively, so it should be easy 
to understand.

> [...]
>   
>>>> Index: libavcodec/h264.c
>>>> ===================================================================
>>>> --- libavcodec/h264.c	(revision 17030)
>>>> +++ libavcodec/h264.c	(working copy)
>>>> @@ -7340,6 +7344,19 @@
>>>>      H264Context *hx; ///< thread context
>>>>      int context_count = 0;
>>>>  +    /*
>>>> +     * Interlaced H.264 stream can be encoded as field pictures (i.e., 
>>>> two
>>>> +     * pictures per frame). If the input buffer contains both, then they 
>>>> are
>>>> +     * delimited by access unit delimiter. These flags allow us to 
>>>> decode only
>>>>     
>>>>         
>>> I dont see anything in the H.264 spec that says that AU delimiters have to 
>>> be
>>> used in this case.
>>>   
>>>       
>> OK, maybe I put it a bit wrong. The decoder itself will work, but the 
>> bookkeeping for a decoded picture done in decode_frame() after 
>> decode_nal_units() (stuff like MPV_frame_end() and/or ff_er_frame_end()) 
>> doesn't get executed if decode_nal_units() doesn't return after an NAL_AUD.
>>
>> Alternatively, one could move this into decode_nal_units() and do it as 
>> response to NAL_AUD. I just wanted to minimize changes there and I prefer 
>> to leave it as-is instead of stuffing unrelated bookkeeping code into 
>> decode_nal_units() (which should IMHO decode exactly one access unit). OK?
>>     
>
> simple thing
> 1. which part of h264 says that AUD is mandatory?
> 2. if none then you cant depend on AUD
>   
True. According to 7.4.1.2.3, NAL_AUD is optional. I'll rework it to 
work also without NAL_AUD and post it later today.

Regards,

Ivan






More information about the ffmpeg-devel mailing list