[FFmpeg-devel] [PATCH] flvdec: option for dropping negative CTS frames

Initial frames with negative pts can produce video/audio desynchronization when a decoder is not prepared to handle negative pts. For example: QSV transcoding from a Wowza RTMP server.

Thu Apr 6 20:28:15 EEST 2017

On Thu, 6 Apr 2017 14:18:20 -0300
Felipe Astroza <felipe at astroza.cl> wrote:

> 2017-04-06 2:00 GMT-03:00 wm4 <nfxjfg at googlemail.com>:
> 
> > On Wed, 5 Apr 2017 17:15:26 -0300
> > Felipe Astroza <felipe at astroza.cl> wrote:
> >  
> > > 2017-04-05 15:35 GMT-03:00 wm4 <nfxjfg at googlemail.com>:
> > >  
> > > > On Wed,  5 Apr 2017 14:29:30 -0300
> > > > felipe at astroza.cl wrote:
> > > >  
> > > > > From: Felipe Astroza <felipe at astroza.cl>
> > > > >
> > > > > Signed-off-by: Felipe Astroza <felipe at astroza.cl>
> > > > > ---
> > > > >  libavformat/flvdec.c | 14 +++++++++++---
> > > > >  1 file changed, 11 insertions(+), 3 deletions(-)
> > > > >
> > > > > diff --git a/libavformat/flvdec.c b/libavformat/flvdec.c
> > > > > index 3959a36..1556fe0 100644
> > > > > --- a/libavformat/flvdec.c
> > > > > +++ b/libavformat/flvdec.c
> > > > > @@ -44,6 +44,7 @@
> > > > >  typedef struct FLVContext {
> > > > >      const AVClass *class; ///< Class for private options.
> > > > >      int trust_metadata;   ///< configure streams according  
> > onMetaData  
> > > > > +    int drop_negative_cts;///< drop frames if cts is negative
> > > > >      int wrong_dts;        ///< wrong dts due to negative cts
> > > > >      uint8_t *new_extradata[FLV_STREAM_TYPE_NB];
> > > > >      int new_extradata_size[FLV_STREAM_TYPE_NB];
> > > > > @@ -1139,10 +1140,16 @@ retry_duration:
> > > > >              int32_t cts = (avio_rb24(s->pb) + 0xff800000) ^  
> > 0xff800000;  
> > > > >              pts = dts + cts;
> > > > >              if (cts < 0) { // dts might be wrong
> > > > > -                if (!flv->wrong_dts)
> > > > > +                if (flv->drop_negative_cts) {
> > > > >                      av_log(s, AV_LOG_WARNING,
> > > > > -                        "Negative cts, previous timestamps might be  
> > > > wrong.\n");  
> > > > > -                flv->wrong_dts = 1;
> > > > > +                            "Negative cts, frames will be  
> > dropped.\n");  
> > > > > +                    dts = pts = AV_NOPTS_VALUE;
> > > > > +                } else {
> > > > > +                    if (!flv->wrong_dts)
> > > > > +                        av_log(s, AV_LOG_WARNING,
> > > > > +                            "Negative cts, previous timestamps  
> > might be  
> > > > wrong.\n");  
> > > > > +                    flv->wrong_dts = 1;
> > > > > +                }
> > > > >              } else if (FFABS(dts - pts) > 1000*60*15) {
> > > > >                  av_log(s, AV_LOG_WARNING,
> > > > >                         "invalid timestamps %"PRId64" %"PRId64"\n",  
> > dts,  
> > > > pts);  
> > > > > @@ -1253,6 +1260,7 @@ static int flv_read_seek(AVFormatContext *s,  
> > int  
> > > > stream_index,  
> > > > >  #define VD AV_OPT_FLAG_VIDEO_PARAM | AV_OPT_FLAG_DECODING_PARAM
> > > > >  static const AVOption options[] = {
> > > > >      { "flv_metadata", "Allocate streams according to the onMetaData  
> > > > array", OFFSET(trust_metadata), AV_OPT_TYPE_BOOL, { .i64 = 0 }, 0, 1,  
> > VD },  
> > > > > +    { "flv_drop_negative_cts", "Drop frames with negative  
> > composition  
> > > > timestamp", OFFSET(drop_negative_cts), AV_OPT_TYPE_BOOL, { .i64 = 0 },  
> > 0,  
> > > > 1, VD },  
> > > > >      { "missing_streams", "", OFFSET(missing_streams),  
> > AV_OPT_TYPE_INT,  
> > > > { .i64 = 0 }, 0, 0xFF, VD | AV_OPT_FLAG_EXPORT | AV_OPT_FLAG_READONLY  
> > },  
> > > > >      { NULL }
> > > > >  };  
> > > >
> > > > This seems all kind of wrong. You don't add a hack to a single demuxer
> > > > just because a single decoder can't handle unusual things in "some"
> > > > files. You don't add it as option either. (If this is a "fix my problem
> > > > the easiest way" hack, you should probably keep it in your own ffmpeg
> > > > branch.)
> > > >
> > > It was the way I found to avoid the initial frames without a preceding
> > > keyframe (marked with pts < 0) that the Wowza RTMP server sends in live
> > > streams; it only covers the FLV format case :/. And yes, you're right,
> > > this is a hack, because I was not able to patch the QSV decoder.
> > >
> > > h264_qsv decoder -> h264_qsv encoder produces delayed video output
> > > h264_qsv decoder -> libx264 encoder produces delayed video output
> > > libx264 decoder -> libx264 encoder produces correct output
> >
> > There's no libx264 decoder - I assume you mean ffmpeg's native decoder.
> >  
> > > h264_qsv is the source of my issues. I was passing -itsoffset CONSTANT
> > > (0.5 in my case) as a workaround, but it only works 90% of the time and
> > > I want a definitive solution.
> >
> > Did you check whether there's some obvious cause, like due to how qsv
> > represents timestamps? Also there is no reason to use the qsv
> > _decoder_. The native ffmpeg decoder + hwaccel will do better. Anyway,
> > still legitimate to want to fix qsv, of course.
> >  
> 
> I'm not sure of that. Reading input at native frame rate:
> 
> * h264 native decoder -> h264_qsv encoder (needs hwcontext)
> command: ffmpeg -re -loglevel verbose -hwaccel qsv -qsv_device
> /dev/dri/renderD129 -i INPUT -c:v h264_qsv -look_ahead 0 -profile:v high
> -preset:v veryfast -bufsize 1000k -r 30 -b:v 3440800 -maxrate 3440800 -c:a
> aac test.mp4
> 
> Stream mapping:
>   Stream #0:1 -> #0:0 (h264 (native) -> h264 (h264_qsv))
>   Stream #0:2 -> #0:1 (aac (native) -> aac (native))
> [h264_qsv @ 0x25052a0] Warning in encoder initialization: partial
> acceleration (4)
> 
> *CPU utilization is 125%*
> 
> * h264_qsv decoder -> h264_qsv encoder
> command: ffmpeg -re -loglevel verbose -hwaccel qsv -qsv_device
> /dev/dri/renderD129 -c:v h264_qsv -i INPUT -c:v h264_qsv -look_ahead 0
> -profile:v high -preset:v veryfast -bufsize 1000k -r 30 -b:v 3440800
> -maxrate 3440800 -c:a aac test.mp4
> 
> Stream mapping:
>   Stream #0:1 -> #0:0 (h264 (h264_qsv) -> h264 (h264_qsv))
>   Stream #0:2 -> #0:1 (aac (native) -> aac (native))
> 
> *CPU utilization is 22%*
> 
> I am using QSV to take load off the CPU, and native decoding does not
> help.

That doesn't use hwaccel. "-hwaccel qsv" will do nothing with the
native decoder.

I'm not sure whether the frame mapping stuff (for vaapi->qsv
encoding) has been ported from Libav yet, or how it works. Maybe Mark
Thompson can say something about the expected performance.
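
For anyone following the quoted hunk, here is a minimal standalone sketch of
what the cts line and the proposed branch do. It assumes nothing beyond the
quoted diff: the sketch_* names and SKETCH_NOPTS are hypothetical stand-ins
(SKETCH_NOPTS mimics AV_NOPTS_VALUE so the code builds without FFmpeg
headers), and whether a packet with unset timestamps really ends up dropped
depends on code outside the hunk, which is not shown in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for AV_NOPTS_VALUE (no FFmpeg headers needed). */
#define SKETCH_NOPTS INT64_MIN

/*
 * Sign-extend a 24-bit value held in the low bits of raw24.
 * Same arithmetic as the demuxer line in the quoted diff:
 * add 0xff800000 with 32-bit wraparound, then XOR it back.
 */
static int32_t sketch_sign_extend24(uint32_t raw24)
{
    assert(raw24 <= 0xFFFFFFu);
    return (int32_t)((raw24 + 0xff800000u) ^ 0xff800000u);
}

/*
 * Mirror of the patched branch: compute pts = dts + cts and, when the
 * option is set and cts is negative, unset both timestamps.
 * Returns 1 if the timestamps were unset, 0 otherwise.
 */
static int sketch_apply_cts(int64_t *dts, int64_t *pts,
                            uint32_t raw24, int drop_negative_cts)
{
    int32_t cts = sketch_sign_extend24(raw24);

    *pts = *dts + cts;
    if (cts < 0 && drop_negative_cts) {
        *dts = *pts = SKETCH_NOPTS;
        return 1;
    }
    return 0;
}
```

With dts = 1000 and a raw field of 0xFFFFD8 (that is, cts = -40), the default
path yields pts = 960, while the option path leaves both timestamps unset.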