[FFmpeg-devel] [PATCH 2/2] avcodec/videotoolbox: fix decoding of some h264 bitstreams

Mon Sep 25 04:31:03 EEST 2017

On Thu, Oct 1, 2015 at 9:54 AM, wm4 <nfxjfg at googlemail.com> wrote:

> On Thu, 1 Oct 2015 18:45:40 +0200
> Hendrik Leppkes <h.leppkes at gmail.com> wrote:
>
> > On Thu, Oct 1, 2015 at 6:39 PM, wm4 <nfxjfg at googlemail.com> wrote:
> > > On Thu, 1 Oct 2015 18:29:00 +0200
> > > Hendrik Leppkes <h.leppkes at gmail.com> wrote:
> > >
> > >> On Thu, Oct 1, 2015 at 6:13 PM, wm4 <nfxjfg at googlemail.com> wrote:
> > >> > This affects Annex B streams (such as demuxed from .ts and others).
> It
> > >> > also handles the format change in reinit-large_420_8-to-small_
> 420_8.h264
> > >> > correctly.
> > >> >
> > >> > Instead of passing through the extradata, create it on the fly it
> from
> > >> > the currently active SPS and PPS. Since reconstructing the PPS and
> SPS
> > >> > NALs would be very complicated and verbose, we use the NALs as they
> > >> > originally appeared in the bitstream.
> > >> >
> > >> > The code for writing the extradata is somewhat derived from
> > >> > libavformat/avc.c, but it's small and different enough that sharing
> it
> > >> > is not really worth it.
> > >> > ---
> > >> > Even though it requires changes in the general h264 decoder
> (previous
> > >> > patch), this solution is much cleaner and more robust than my patch
> > >> > from yesterday.
> > >> > ---
> > >> >  libavcodec/videotoolbox.c | 48 +++++++++++++++++++++++++++++-
> -----------------
> > >> >  1 file changed, 30 insertions(+), 18 deletions(-)
> > >> >
> > >> > diff --git a/libavcodec/videotoolbox.c b/libavcodec/videotoolbox.c
> > >> > index 9dec5fc..cc1e592 100644
> > >> > --- a/libavcodec/videotoolbox.c
> > >> > +++ b/libavcodec/videotoolbox.c
> > >> > @@ -77,28 +77,40 @@ int ff_videotoolbox_alloc_frame(AVCodecContext
> *avctx, AVFrame *frame)
> > >> >      return 0;
> > >> >  }
> > >> >
> > >> > +#define AV_W8(p, v) *(p) = (v)
> > >> > +
> > >> >  CFDataRef ff_videotoolbox_avcc_extradata_create(AVCodecContext
> *avctx)
> > >> >  {
> > >> > +    H264Context *h     = avctx->priv_data;
> > >> >      CFDataRef data = NULL;
> > >> > +    uint8_t *p;
> > >> > +    int vt_extradata_size = 6 + 3 + h->sps.data_size + 4 +
> h->pps.data_size;
> > >> > +    uint8_t *vt_extradata = av_malloc(vt_extradata_size);
> > >> > +    if (!vt_extradata)
> > >> > +        return NULL;
> > >> >
> > >> > -    /* Each VCL NAL in the bitstream sent to the decoder
> > >> > -     * is preceded by a 4 bytes length header.
> > >> > -     * Change the avcC atom header if needed, to signal headers of
> 4 bytes. */
> > >> > -    if (avctx->extradata_size >= 4 && (avctx->extradata[4] & 0x03)
> != 0x03) {
> > >> > -        uint8_t *rw_extradata = av_memdup(avctx->extradata,
> avctx->extradata_size);
> > >> > -
> > >> > -        if (!rw_extradata)
> > >> > -            return NULL;
> > >> > -
> > >> > -        rw_extradata[4] |= 0x03;
> > >> > -
> > >> > -        data = CFDataCreate(kCFAllocatorDefault, rw_extradata,
> avctx->extradata_size);
> > >> > -
> > >> > -        av_freep(&rw_extradata);
> > >> > -    } else {
> > >> > -        data = CFDataCreate(kCFAllocatorDefault,
> avctx->extradata, avctx->extradata_size);
> > >> > -    }
> > >> > -
> > >> > +    p = vt_extradata;
> > >> > +
> > >> > +    AV_W8(p + 0, 1); /* version */
> > >> > +    AV_W8(p + 1, h->sps.data[0]); /* profile */
> > >> > +    AV_W8(p + 2, h->sps.data[1]); /* profile compat */
> > >> > +    AV_W8(p + 3, h->sps.data[2]); /* level */
> > >> > +    AV_W8(p + 4, 0xff); /* 6 bits reserved (111111) + 2 bits nal
> size length - 3 (11) */
> > >> > +    AV_W8(p + 5, 0xe1); /* 3 bits reserved (111) + 5 bits number
> of sps (00001) */
> > >> > +    AV_WB16(p + 6, h->sps.data_size + 1);
> > >> > +    AV_W8(p + 8, NAL_SPS | (3 << 5)); // NAL unit header
> > >> > +    memcpy(p + 9, h->sps.data, h->sps.data_size);
> > >> > +    p += 9 + h->sps.data_size;
> > >> > +    AV_W8(p + 0, 1); /* number of pps */
> > >> > +    AV_WB16(p + 1, h->pps.data_size + 1);
> > >> > +    AV_W8(p + 3, NAL_PPS | (3 << 5)); // NAL unit header
> > >> > +    memcpy(p + 4, h->pps.data, h->pps.data_size);
> > >> > +
> > >> > +    p += 4 + h->pps.data_size;
> > >> > +    av_assert0(p - vt_extradata == vt_extradata_size);
> > >> > +
> > >> > +    data = CFDataCreate(kCFAllocatorDefault, vt_extradata,
> vt_extradata_size);
> > >> > +    av_free(vt_extradata);
> > >> >      return data;
> > >> >  }
> > >> >
> > >>
> > >> This will still fail spectacularly with a SPS/PPS change mid-stream. I
> > >> don't suppose it somehow accepts the SPS/PPS data in-band as well?
> > >
> > > Well, it worked for the resolution change sample. It _should_ be using
> > > SPS/PPS NALs that occur mid-stream and in-band. The h264 decoder will
> > > reinitialize itself when they change (at least in most cases), and then
> > > it will also reinitialize this hwaccel, which means the code above is
> > > actually run in this situation. The h->sps and h->pps fields will be
> > > set to whatever is actually used by the decoder at this point. So I'm
> > > hoping that this change is actually pretty correct.
> >
> > Is init re-called on resolution change alone, everything else the same?
> > I guess it may actually work. Although there may still be other
> > SPS/PPS changes that don't trigger it, so it seems a bit fragile.
>
> The code for this is in ff_h264_decode_slice_header() (look for
> needs_reinit). It actually detects whether the SPS changed at all
> (either if the ID of the referenced SPS changed, or if the referenced
> SPS was newly read). But then it tries to avoid reinitialization by
> checking whether specific fields changed. So it might not be 100%
> robust. Basically, the logic only respects what libavcodec's decoder
> needs.
>
> In theory, the hwaccel could export a flag that signals the need for
> full reinitialization on any change.

I'm looking into this further, because h264 samples with mid-stream SPS/PPS
changes currently fail to decode with the videotoolbox hwaccel.

I see the `must_reinit` and `needs_reinit` code in h264_slice.c, which
triggers reinitialization of the software decoder's internal state.
However, I don't see how this would ever cause videotoolbox to get
reinitialized (especially since AVVideotoolboxContext is usually created by
the user up-front).

I'm considering a patch to h264_videotoolbox's end_frame() which compares
the previous SPS/PPS(s) with the current to see if they've changed so
the VTDecompressionSession can be restarted. Alternatively, I could add a
new `reinit` callback to AVHWAccel and invoke that.

How do the other hwaccels handle mid-stream SPS changes?

Aman

>
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel at ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>