[FFmpeg-devel] h264_qsv decoder speed

Andy Furniss adf.lists at gmail.com
Thu Aug 18 02:41:28 EEST 2016

Mark Thompson wrote:
> On 17/08/16 20:47, Chao Liu wrote:
>> Hi there,
>> I compared the h264_qsv decoder from ffmpeg to the Intel Media SDK
>> sample_decode. There is a pretty big speed gap. I wonder whether I did
>> something wrong or there are really some problems with ffmpeg's
>> implementation.
>> The test video was captured from a 3MP(2048x1536) camera. The commands I
>> used:
>> -  ffmpeg -c:v h264_qsv -async_depth 10 -i test.h264 -c:v rawvideo -f null /dev/null
>> -  sample_decode h264 -i test.h264
>> Both use 100% CPU (a full core). ffmpeg got 170 FPS; sample_decode got
>> 370 FPS.
>> I haven't had time to debug this yet. Sending this out to see whether you
>> guys have something in mind.
> I think in both cases your speed bound must be on something other than the decode, because the hardware goes a lot faster than either of those for me.  Perhaps you are downloading all of the output frames to normal memory in order to write them to a null device output, and one of the cases is doing that less efficiently somehow?

I have only tested with AMD UVD, but unless you use -pix_fmt nv12 you will
also get CPU load from ffmpeg doing the nv12 -> yuv420p conversion.
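
As a rough sketch, the original test command with the output format pinned
to nv12 might look like the following. That h264_qsv hands back nv12 frames
the way UVD does is an assumption on my part, and test.h264 is just the file
name from the original mail.

```shell
# Original decode test with the output pinned to nv12, so ffmpeg does not
# insert an nv12 -> yuv420p conversion on the CPU.
# Assumption: the qsv decoder outputs nv12, as UVD does.
cmd="ffmpeg -c:v h264_qsv -async_depth 10 -i test.h264 -c:v rawvideo -pix_fmt nv12 -f null /dev/null"
echo "$cmd"

# Only run it where ffmpeg and the sample file are actually present.
if command -v ffmpeg >/dev/null 2>&1 && [ -f test.h264 ]; then
    $cmd
fi
```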

> Using vaapi on a low-power Haswell mobile chip (i.e. the same Quick Sync hardware that libmfx uses) decodes a single 2048x1536 stream at around 800fps with less than 50% CPU for me.
> - Mark
> (My command to compare is:
> ./ffmpeg_g -vaapi_device /dev/dri/renderD128 -hwaccel vaapi -hwaccel_output_format vaapi -i input.mp4 -an -vf 'format=nv12|vaapi,hwupload' -f null -

Oh nice, I always wondered if there was a way to bench without copy back.
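
For a rough idea of how much data the copy back moves, here is a
back-of-envelope sketch. It assumes 8-bit nv12 output (1.5 bytes per pixel)
and ignores any stride padding; the frame rates are just the three figures
quoted in this thread.

```shell
# Estimate copy-back traffic for 2048x1536 frames in nv12.
width=2048
height=1536

# nv12 is 4:2:0 with 8-bit samples: 12 bits (1.5 bytes) per pixel.
frame_bytes=$((width * height * 3 / 2))
echo "frame size: ${frame_bytes} bytes"

# Frame rates reported in the thread: ffmpeg qsv, sample_decode, vaapi.
for fps in 170 370 800; do
    # Divide before multiplying to keep the intermediate value small.
    mib_per_s=$((frame_bytes / 1024 * fps / 1024))
    echo "${fps} fps: ${mib_per_s} MiB/s"
done
```

Under those assumptions each frame is 4718592 bytes (4.5 MiB), which gives
765, 1665 and 3600 MiB/s for the three rates.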

> The nasty filtering there is contrived to do nothing, even with the inconvenient stream reinitialisation.  I think libmfx might also work somehow with "-c:v h264_qsv -hwaccel qsv", but I'm not sure and I don't have anything to try it on right now.)
