[FFmpeg-devel] [PATCH 2/2] avcodec/videotoolbox: fix decoding of some h264 bitstreams

wm4 nfxjfg at googlemail.com
Tue Sep 26 15:06:16 EEST 2017

On Mon, 25 Sep 2017 11:49:51 -0700
Aman Gupta <ffmpeg at tmm1.net> wrote:

> On Mon, Sep 25, 2017 at 3:06 AM, wm4 <nfxjfg at googlemail.com> wrote:
> > On Mon, 25 Sep 2017 09:02:36 +0200
> > Hendrik Leppkes <h.leppkes at gmail.com> wrote:
> >  
> > > On Mon, Sep 25, 2017 at 3:31 AM, Aman Gupta <ffmpeg at tmm1.net> wrote:  
> > > >
> > > > How do the other hwaccels handle mid-stream SPS changes?
> > > >  
> > >
> > > Real HWAccels (ie. VAAPI, VDPAU or DXVA) communicate the SPS/PPS
> > > content for every frame, they don't keep a persistent state internally
> > > - that way the only "state" is the frame size and pixel format, and
> > > when those change get_format is called and the hwaccel re-initialized.  
> >
> > Maybe it would be better if VT detected SPS/PPS changes itself, and
> > reinitialized the VT session on demand when feeding slices. This way we
> > wouldn't have to mess with the normal h264 software decoder reinit
> > logic.
> >  
> Agreed. I'm trying to figure out the most efficient way to detect when the
> SPS/PPS changes.
> Previously I tried feeding in the SPS/PPS NALs from the decoder into the VT
> hwaccel (with a new `decode_params` callback), and using that to store a
> copy of the NAL in the hwaccel. I used a memcmp() against the previously
> stored value to detect when changes occurred and restarted the
> decompression session accordingly. This approach worked well, but doesn't
> handle some streams which use multiple PPS.
> The most fool-proof way would be to construct a new avcC every time and
> only restart the session when it changes. But that seems quite expensive to
> be doing all the time.
> I'm also still not sure if the VT decoder needs to be restarted on SPS
> changes only, or PPS as well. Do new PPS usually accompany a new SPS? Is it
> also possible to have multiple SPS used at the same time? (The avcC
> construction code in videotoolbox.c assumes one SPS and one or more PPS).

I don't think it's too expensive. Keep in mind that we already copy the
actual slice data around twice, so rebuilding the avcC is probably the
smallest part of the cost, and the memcmp would barely matter.
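
To make the cost concrete, here is a rough standalone sketch of the
"rebuild and memcmp" idea. It is not the code in videotoolbox.c: the
ParamCache struct and the build_avcc()/params_changed() helpers are
made-up names for illustration, and it assumes one SPS plus one or more
PPS, passed as raw NALs without start codes.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical cache of the last avcC blob used to create the session. */
typedef struct ParamCache {
    uint8_t *avcc;
    size_t   size;
} ParamCache;

/* Build an avcC record from one SPS and pps_count PPS NALs.
 * Returns a malloc'd buffer (caller frees), or NULL on error. */
static uint8_t *build_avcc(const uint8_t *sps, size_t sps_size,
                           const uint8_t *const *pps, const size_t *pps_size,
                           int pps_count, size_t *out_size)
{
    size_t size = 6 + 2 + sps_size + 1; /* header, SPS length, SPS, PPS count */
    uint8_t *buf, *p;
    int i;

    if (sps_size < 4 || sps_size > 0xFFFF || pps_count < 1 || pps_count > 255)
        return NULL;
    for (i = 0; i < pps_count; i++) {
        if (pps_size[i] > 0xFFFF)
            return NULL;
        size += 2 + pps_size[i];
    }

    buf = p = malloc(size);
    if (!buf)
        return NULL;

    *p++ = 1;      /* configurationVersion */
    *p++ = sps[1]; /* profile_idc */
    *p++ = sps[2]; /* constraint flags / profile compatibility */
    *p++ = sps[3]; /* level_idc */
    *p++ = 0xFF;   /* reserved bits + 4-byte NAL length prefix */
    *p++ = 0xE1;   /* reserved bits + exactly one SPS */
    *p++ = (uint8_t)(sps_size >> 8);
    *p++ = (uint8_t)(sps_size & 0xFF);
    memcpy(p, sps, sps_size);
    p += sps_size;

    *p++ = (uint8_t)pps_count;
    for (i = 0; i < pps_count; i++) {
        *p++ = (uint8_t)(pps_size[i] >> 8);
        *p++ = (uint8_t)(pps_size[i] & 0xFF);
        memcpy(p, pps[i], pps_size[i]);
        p += pps_size[i];
    }

    *out_size = size;
    return buf;
}

/* Returns 1 if the parameter sets differ from the cached ones (the caller
 * would restart the session), 0 if unchanged, -1 on error. */
static int params_changed(ParamCache *c,
                          const uint8_t *sps, size_t sps_size,
                          const uint8_t *const *pps, const size_t *pps_size,
                          int pps_count)
{
    size_t size;
    uint8_t *avcc = build_avcc(sps, sps_size, pps, pps_size, pps_count, &size);

    if (!avcc)
        return -1;
    if (c->avcc && c->size == size && !memcmp(c->avcc, avcc, size)) {
        free(avcc);
        return 0;
    }
    free(c->avcc); /* replace the cached copy with the new blob */
    c->avcc = avcc;
    c->size = size;
    return 1;
}
```

The work per call is one small allocation, a few memcpys of parameter
sets that are typically tens of bytes, and one memcmp, which is little
next to copying the slice data itself.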

Also I think what matters is whether VT sees the SPS/PPS NALs at all.
Didn't you have success with feeding them as part of the sample buffer
(along with slice NALs)? I wonder if VT would be fine with the SPS and
PPS repeated for every slice.
