[FFmpeg-devel] [PATCH/RFC] H.264 FMO+ASO decoding

Stefan Gehrer stefan.gehrer
Tue Sep 28 22:04:14 CEST 2010


On 09/26/2010 04:43 PM, Michael Niedermayer wrote:
> On Wed, Jul 21, 2010 at 10:20:36PM +0200, Stefan Gehrer wrote:
>> Hi,
>>
>> attached is a patch that implements the header stuff for FMO decoding
>> in H.264 baseline streams. It decodes the slice group map and also
>> provides means that at the start of a slice the actual x/y position
>> of the first macroblock can be determined from first_mb_in_slice.
>> If this is done, slices can be decoded out of order, i.e.
>> first_mb_in_slice does not have to increase for slices of the same
>> picture (aka ASO).
>> Let's take as example a tiny video of 4x3 macroblocks which
>> has a slice group map with two slice groups like this:
>>
>> 1  1  1  1
>> 1  0  0  1
>> 1  1  1  1
>>
>> Each slice group itself (0 or 1) is decoded in raster-scan order,
>> so that the macroblock addresses are as follows:
>>
>> 2  3  4  5
>> 6  0  1  7
>> 8  9 10 11
>>
>> So if a slice comes along with first_mb_in_slice equal to 7 we need
>> to start decoding at MB position x=3 and y=1.
>>
>> Unfortunately, the real challenge starts here. A lot of neighbor
>> context handling (i4x4 modes, non-zero counts, MVs) has to be
>> handled differently when the assumption of raster-scan order of
>> macroblocks is not true anymore. Also, deblocking can only be done
>> after the picture has been fully decoded. This is because deblocking
>> goes across slice boundaries and slice group boundaries and the
>> neighbor MB might just happen to be the last to be decoded in the
>> picture.
>> Considering the heavy optimizations that have been done in the
>> normal decoding paths I guess there would be some outcry if in
>> many places in the MB decoding a conditional like
>> if(pps->slice_group_count > 1)
>> would appear.
>> So my feeling is that if FMO is to be implemented it may be best
>> to have a new code path for the slice data decoding, a
>> baseline-FMO-specific version of decode_slice() and some of its
>> subfunctions maybe?
>> Opinions welcome.
>
> my guess is you won't finish FMO

I guess you are right.

> also we need templating for it ...

Are there any more thoughts on this, e.g. how many code paths to compile?
MBAFF and non-MBAFF, CAVLC and CABAC?

> still parts of your patch could be useful and move us a tiny step closer to
> FMO support
>
>
>>
>> Stefan
>
>>   h264.c    |   32 +++++++---
>>   h264.h    |    6 +
>>   h264_ps.c |  187 ++++++++++++++++++++++++++++++++++++++++++++++++++------------
>>   3 files changed, 182 insertions(+), 43 deletions(-)
>> 6228649db31722b1b6bbf9ff33a04c52d2baaf14  h264_fmo.diff
>> diff --git a/libavcodec/h264.c b/libavcodec/h264.c
>> index d1662fc..bfb65d6 100644
>> --- a/libavcodec/h264.c
>> +++ b/libavcodec/h264.c
>> @@ -1968,8 +1968,6 @@ static int decode_slice_header(H264Context *h, H264Context *h0){
>>           av_log(h->s.avctx, AV_LOG_ERROR, "first_mb_in_slice overflow\n");
>>           return -1;
>>       }
>> -    s->resync_mb_x = s->mb_x = first_mb_in_slice % s->mb_width;
>> -    s->resync_mb_y = s->mb_y = (first_mb_in_slice / s->mb_width) << FIELD_OR_MBAFF_PICTURE;
>>       if (s->picture_structure == PICT_BOTTOM_FIELD)
>>           s->resync_mb_y = s->mb_y = s->mb_y + 1;
>>       assert(s->mb_y < s->mb_height);
>
>> @@ -2153,11 +2151,18 @@ static int decode_slice_header(H264Context *h, H264Context *h0){
>>       }
>>       h->qp_thresh= 15 + 52 - FFMIN(h->slice_alpha_c0_offset, h->slice_beta_offset) - FFMAX3(0, h->pps.chroma_qp_index_offset[0], h->pps.chroma_qp_index_offset[1]);
>>
>> -#if 0 //FMO
>> -    if( h->pps.num_slice_groups > 1 && h->pps.mb_slice_group_map_type >= 3 && h->pps.mb_slice_group_map_type <= 5)
>> -        slice_group_change_cycle= get_bits(&s->gb, ?);
>> -#endif
>> +    if(h->pps.slice_group_count > 1){
>> +        int addr = -1;
>>
>> +        if(h->pps.mb_slice_group_map_type >= 3 && h->pps.mb_slice_group_map_type <= 5)
>> +            ff_h264_draw_slice_group(h, &h->pps, s->mb_width, s->mb_height);
>> +        h->slice_group_current = 0;
>> +        for(j=0;j<=first_mb_in_slice;j++)
>> +            addr = ff_h264_fmo_next_mb(h, addr);
>
> this is too slow with many slices and groups

A way to make things faster is to convert the "slice group map" into a
lookup table to have a MbAddr -> x,y lookup.
But for some types of slice group maps the parameters are transmitted on
every slice (and changeable for every picture) and recalculating this
lookup table every time may be more costly than stepping through the
slice group map for every macroblock.
For some slice group types, it is possible to calculate this lookup
table directly without the intermediate "slice group map" as described
in the spec. But when one needs to cover all possible cases I think the
increase in code size would not justify the speed gained.
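
To make the lookup-table idea concrete, here is a minimal sketch in C. It
assumes the macroblock numbering described in the quoted example (all
macroblocks of slice group 0 in raster order, then those of group 1, and so
on) and uses the 4x3 map from that example. The names MbPos,
build_mb_addr_lut and example_map are hypothetical and are not part of the
patch.

```c
#include <assert.h>

/* Hypothetical type: raster position of one macroblock. */
typedef struct MbPos {
    int x;
    int y;
} MbPos;

/* The 4x3 slice group map from the example, in raster order. */
static const unsigned char example_map[12] = {
    1, 1, 1, 1,
    1, 0, 0, 1,
    1, 1, 1, 1,
};

/*
 * Hypothetical helper: fill lut[] so that lut[addr] holds the x/y position
 * of the macroblock with address addr. Addresses are assigned group by
 * group, and in raster order inside each group, as assumed above.
 * The cost is one pass over the map per slice group.
 */
static void build_mb_addr_lut(const unsigned char *group_map, int mb_width,
                              int num_mbs, int num_groups, MbPos *lut)
{
    int addr = 0;

    for (int g = 0; g < num_groups; g++) {
        for (int i = 0; i < num_mbs; i++) {
            if (group_map[i] == g) {
                lut[addr].x = i % mb_width;
                lut[addr].y = i / mb_width;
                addr++;
            }
        }
    }
    /* Every macroblock must belong to exactly one of the groups. */
    assert(addr == num_mbs);
}
```

Once the table is built, finding the start position of a slice is a single
array access (lut[first_mb_in_slice]) instead of a loop over the map. The
table has to be rebuilt whenever the slice group map changes, which is the
cost discussed above.
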
Anyway, thanks for taking the time to look at it.

Stefan



