[FFmpeg-devel] h264_qsv decoder speed

Chao Liu yijinliu at gmail.com
Thu Aug 18 02:01:57 EEST 2016


On Wed, Aug 17, 2016 at 3:13 PM, Mark Thompson <sw at jkqxz.net> wrote:

> On 17/08/16 20:47, Chao Liu wrote:
> > Hi there,
> > I compared ffmpeg's h264_qsv decoder to sample_decode from the Intel
> > Media SDK. There is a pretty big speed gap. I wonder whether I did
> > something wrong or whether there really are problems with ffmpeg's
> > implementation.
> >
> > The test video was captured from a 3MP(2048x1536) camera. The commands I
> > used:
> > -  ffmpeg -c:v h264_qsv -async_depth 10 -i test.h264 -c:v rawvideo -f null /dev/null
> > -  sample_decode h264 -i test.h264
> > Both use 100% CPU (a full core). ffmpeg got 170 FPS; sample_decode got
> > 370 FPS.
> >
> > I haven't had time to debug this yet. Sending this out to see whether
> > you guys might have something in mind.
>
> I think in both cases your speed bound must be on something other than the
> decode, because the hardware goes a lot faster than either of those for
> me.  Perhaps you are downloading all of the output frames to normal
> memory in order to write them to a null device output, and one of the cases
> is doing that less efficiently somehow?
>
You are right. QSV does output video to system memory by default.
For sample_decode, I could turn that off and got 700 FPS (using a 4th gen
mobile i3, with HD 4400). This is very close to your number.
      sample_decode h264 -vaapi -i test.h264
So I assume the number from sample_decode makes sense.

I just found that the parameter "-c:v rawvideo" affects performance a lot
(why?). Using the following command, I could get 240 FPS. (I changed
async_depth to 4 because that is what sample_decode uses.)
      ffmpeg -c:v h264_qsv -async_depth 4 -i ../../4shome2/go/orig.h264 -f null -
So the speed difference is 370 FPS (sample_decode) vs. 240 FPS (ffmpeg qsv).
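
To put the cost of copying frames to system memory in perspective, here is a rough back-of-the-envelope sketch. It assumes the decoder output is NV12 (12 bits per pixel), which is my assumption and not something measured here; the function names are made up for illustration.

```python
# Rough estimate of the memory traffic caused by copying every decoded
# frame to system memory. Assumption: NV12 output (8-bit 4:2:0, i.e.
# 1.5 bytes per pixel). Change BYTES_PER_PIXEL for other surface formats.

WIDTH, HEIGHT = 2048, 1536
BYTES_PER_PIXEL = 1.5  # assumption: NV12


def frame_bytes(width=WIDTH, height=HEIGHT, bytes_per_pixel=BYTES_PER_PIXEL):
    """Size of one decoded frame in bytes."""
    return int(width * height * bytes_per_pixel)


def copy_rate_gb_per_s(fps, width=WIDTH, height=HEIGHT):
    """Bytes copied per second (in GB/s) if every frame is downloaded."""
    return frame_bytes(width, height) * fps / 1e9


def ms_per_frame(fps):
    """Average wall-clock time spent per frame, in milliseconds."""
    return 1000.0 / fps


if __name__ == "__main__":
    # Frame rates reported above for the runs that copy to system memory.
    for label, fps in [("ffmpeg h264_qsv", 240), ("sample_decode", 370)]:
        print(f"{label}: {ms_per_frame(fps):.2f} ms/frame, "
              f"{copy_rate_gb_per_s(fps):.2f} GB/s copied")
```

Under that assumption each frame is about 4.7 MB, so at 370 FPS the copy alone moves roughly 1.75 GB/s, which is consistent with the download being a significant part of the per-frame cost.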

>
> Using vaapi on a low-power Haswell mobile chip (i.e. the same Quick Sync
> hardware that libmfx uses) decodes a single 2048x1536 stream at around
> 800fps with less than 50% CPU for me.
>
> - Mark
>
>
> (My command to compare is:
>
> ./ffmpeg_g -vaapi_device /dev/dri/renderD128 -hwaccel vaapi
> -hwaccel_output_format vaapi -i input.mp4 -an -vf
> 'format=nv12|vaapi,hwupload' -f null -
>
> The nasty filtering there is contrived to do nothing, even with the
> inconvenient stream reinitialisation.  I think libmfx might also work
> somehow with "-c:v h264_qsv -hwaccel qsv", but I'm not sure and I don't
> have anything to try it on right now.)

