[FFmpeg-devel] [PATCH] libavcodec/videotoolbox: fix decoding of h264 streams with minor SPS changes

Aman Gupta ffmpeg at tmm1.net
Sat Nov 18 01:32:03 EET 2017


On Fri, Nov 17, 2017 at 3:14 PM Hendrik Leppkes <h.leppkes at gmail.com> wrote:

> On Fri, Nov 17, 2017 at 11:44 PM, Aman Gupta <ffmpeg at tmm1.net> wrote:
> > On Wed, Nov 15, 2017 at 1:57 PM, Hendrik Leppkes <h.leppkes at gmail.com>
> > wrote:
> >
> >> On Wed, Nov 15, 2017 at 10:15 PM, Aman Gupta <ffmpeg at tmm1.net> wrote:
> >> > From: Aman Gupta <aman at tmm1.net>
> >> >
> >> > Previously the codec kept an entire copy of the SPS, and restarted the
> >> > VT decoder session whenever it changed. This fixed decoding errors in
> >> > [1], as described in 9519983c. On further inspection, that sample
> >> > features an SPS change from High/4.0 to High/3.2 while moving from one
> >> > scene to another.
> >> >
> >> > Yesterday I received [2], which contains minor SPS changes where the
> >> > profile and level do not change. These occur frequently and are not
> >> > associated with scene changes. After 9519983c, the VT decoder session
> >> > is recreated unnecessarily when these are encountered, causing visual
> >> > glitches.
> >> >
> >> > This commit simplifies the state kept in the VTContext to include just
> >> > the first three bytes of the SPS, containing the profile and level
> >> > details. This is populated initially when the VT decoder session is
> >> > created, and used to detect changes and force a restart.
> >> >
> >> > This means minor SPS changes are fed directly into the existing
> >> > decoder, whereas profile/level changes force the decoder session to be
> >> > recreated with the new parameters.
> >> >
> >>
> >> The profile and level are not exactly the only things that can change
> >> to force a decoder to be re-created.
> >> How about the frame dimensions, within the same level?
> >>
> >
> > I compared the different SPS present in the spschange2.ts sample, and
> > here is a diff:
> >
> >  ======= SPS =======
> >   profile_idc : 100
> >   constraint_set0_flag : 0
> >   constraint_set1_flag : 0
> >   constraint_set2_flag : 0
> >   constraint_set3_flag : 0
> >   constraint_set4_flag : 0
> >   constraint_set5_flag : 0
> >   reserved_zero_2bits : 0
> >   level_idc : 40
> >   seq_parameter_set_id : 0
> >   chroma_format_idc : 1
> >   residual_colour_transform_flag : 0
> >   bit_depth_luma_minus8 : 0
> >   bit_depth_chroma_minus8 : 0
> >   qpprime_y_zero_transform_bypass_flag : 0
> >   seq_scaling_matrix_present_flag : 1
> >   log2_max_frame_num_minus4 : 0
> > - pic_order_cnt_type : 1
> > + pic_order_cnt_type : 0
> > -   log2_max_pic_order_cnt_lsb_minus4 : 0
> > +   log2_max_pic_order_cnt_lsb_minus4 : 3
> >     delta_pic_order_always_zero_flag : 0
> >     offset_for_non_ref_pic : 0
> > -   offset_for_top_to_bottom_field : 7
> > +   offset_for_top_to_bottom_field : 0
> > -   num_ref_frames_in_pic_order_cnt_cycle : 7
> > +   num_ref_frames_in_pic_order_cnt_cycle : 0
> > - num_ref_frames : 3
> > + num_ref_frames : 0
> > - gaps_in_frame_num_value_allowed_flag : 0
> > + gaps_in_frame_num_value_allowed_flag : 1
> > - pic_width_in_mbs_minus1 : 13
> > + pic_width_in_mbs_minus1 : 81
> > - pic_height_in_map_units_minus1 : 2
> > + pic_height_in_map_units_minus1 : 38
> >   frame_mbs_only_flag : 1
> >   mb_adaptive_frame_field_flag : 0
> >   direct_8x8_inference_flag : 0
> >   frame_cropping_flag : 0
> >     frame_crop_left_offset : 0
> >     frame_crop_right_offset : 0
> >     frame_crop_top_offset : 0
> >     frame_crop_bottom_offset : 0
> > - vui_parameters_present_flag : 1
> > + vui_parameters_present_flag : 0
> >
> > Interestingly, the pic_height/pic_width do in fact change already in that
> > sample. But the correct thing to do, as far as decoding with
> > VideoToolbox, is to keep the same decompression session instance and pass
> > the new SPS NALU into the decoder along with the image slices.
> >
>
> Does it actually properly output images of the new size in that case?
>
> All I'm saying is that profile and level may not exactly be the
> parameters that actually require a re-creation. Profile maybe, but
> level unlikely. And there might as well be others.


> Is this not documented?


I understand what you're saying, but there is no documentation about this
from Apple, so I can only deduce what is required empirically.

I already confirmed that level changes require a decoder restart, as seen
in the first sample (which goes from level 40 to 32). Without a restart, the
VT decoder stalls and fails to produce any new frames.
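
To make the check concrete, here is a minimal standalone sketch of the
three-byte comparison described in the commit message. It is not the code
from the patch: SpsTracker and sps_needs_restart are hypothetical names, and
it assumes the SPS NAL unit is passed without a start code, so that byte 0
is the NAL header and bytes 1..3 hold profile_idc, the constraint flags,
and level_idc.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the per-decoder state; not the real VTContext. */
typedef struct SpsTracker {
    uint8_t key[3];   /* profile_idc, constraint flags byte, level_idc */
    bool    have_key; /* false until the first SPS has been recorded */
} SpsTracker;

/*
 * nal points at an SPS NAL unit without a start code (assumption):
 * nal[0] is the NAL header, nal[1..3] are profile_idc, the
 * constraint_set flags plus reserved bits, and level_idc.
 * Returns true when the profile/level bytes differ from the recorded
 * ones, i.e. when the decoder session should be recreated.
 */
static bool sps_needs_restart(SpsTracker *t, const uint8_t *nal, size_t size)
{
    if (size < 4)
        return false; /* too short to contain the three bytes; ignore it */

    if (!t->have_key) {
        /* First SPS: record it, the session is created with these values. */
        memcpy(t->key, nal + 1, 3);
        t->have_key = true;
        return false;
    }

    if (memcmp(t->key, nal + 1, 3) != 0) {
        /* Profile, constraint flags or level changed: record and restart. */
        memcpy(t->key, nal + 1, 3);
        return true;
    }

    /* Anything past the first three bytes is treated as a minor change. */
    return false;
}
```

With this shape, an SPS that only differs in later fields (POC type,
dimensions, VUI) is passed through to the existing session, while the
High/4.0 to High/3.2 change from the first sample reports a restart.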

Aman


>
> - Hendrik

