[FFmpeg-devel] [PATCH] Added QSV based VPP filter - second try

Ivan Uskov ivan.uskov at nablet.com
Thu Nov 5 16:35:33 CET 2015

Hello wm4,

Thursday, November 5, 2015, 5:07:08 PM, you wrote:

>> >> > > +            } else if (ret == MFX_WRN_DEVICE_BUSY) {
>> >> > > +                av_usleep(500);    
>> >> > 
>> >> > What. Use proper event-based waiting.  
>> It is not possible.
>> >> 
>> >> That's the same behavior as we have in the qsv encoder and decoder.
>> >> And as far as I know this is how Intel recommends to handle this.  
>> w> That's just ridiculous. Can you send some hate-mail to Intel and tell
>> w> them what a bad idea this is? Half a millisecond is an eternity for a
>> w> CPU. What if the device is blocked only for 10 microseconds? Then it
>> w> will waste time by spending 490 microseconds idly.  
>> 1. Please remember we use the GPU, not the CPU.

w> That makes it even worse, because the CPU could literally be entirely
w> idle.
Not necessarily.
The following scenarios are possible:
1. Transcoding is executed entirely by QSV components. In this case the CPU is
almost always idle, and that is not an issue at all. This is the most probable
scenario in which we can theoretically get MFX_WRN_DEVICE_BUSY, and CPU load
does not matter.
2. SW components such as an encoder or decoder work together with QSV
components. In this case it is possible that the GPU is busy while the CPU is
still executing some thread pool (inside a SW encoder, for example).

>> 2. 500us means that even if we get MFX_WRN_DEVICE_BUSY on every frame, we
>> will still be able to achieve ~2000fps. That looks like a sufficient
>> performance level for any practical application.

w> Only if all other CPU processing takes 0 microseconds.
There can be other threads which will be very happy if we sleep while the GPU
is busy. Also, we will never get MFX_WRN_DEVICE_BUSY on every frame.
I just want to point out that the delay does not have a big impact on real
performance, which is usually much lower than 2000fps.
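
To put rough numbers on both sides of this argument, here is a minimal sketch
in C. The function name and its parameters are hypothetical and only
illustrate the arithmetic; nothing here is measured, and it assumes at most
one 500us sleep per busy frame.

```c
#include <assert.h>

/*
 * Upper bound on frames per second when every frame costs work_us
 * microseconds of other processing and, on a fraction busy_fraction of
 * the frames (0.0 to 1.0), one extra fixed sleep of sleep_us microseconds.
 * All three parameters are illustrative assumptions, not measurements.
 */
double fps_with_busy_sleep(double work_us, double busy_fraction,
                           double sleep_us)
{
    assert(work_us >= 0.0 && sleep_us >= 0.0);
    assert(busy_fraction >= 0.0 && busy_fraction <= 1.0);

    /* Average time spent per frame, including the expected sleep. */
    double frame_us = work_us + busy_fraction * sleep_us;
    assert(frame_us > 0.0);

    return 1e6 / frame_us;
}
```

With no other work and a busy result on every frame,
fps_with_busy_sleep(0.0, 1.0, 500.0) gives the 2000fps ceiling. With 4500us of
other work per frame, the same sleep on every frame lowers the bound from
about 222fps to 200fps, and proportionally less if only some frames hit the
busy case.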

>> 3. In real life MFX_WRN_DEVICE_BUSY appears when the GPU is really busy with
>> other tasks. So nothing bad will happen if one thread/process sleeps for
>> 500us to let another thread complete its work.
>> w> Software engineers recognized that polling is a bad idea half a century
>> w> ago. Why can't Intel do this right?  
>> Maybe because it is complex to organize event-polling when the calculations
>> are performed on the GPU?

w> Even just making the call blocking would be 1. easier, 2. more
w> efficient (because it will idle only as long as needed).
I believe Intel had serious reasons not to implement blocking here.
General processing in QSV is asynchronous and has nice functions to
wait for completion of encoding/decoding/processing.
If Intel made MFX_WRN_DEVICE_BUSY an immediate return code without
event handling and has kept it that way across 16 library releases, then
there is a reason for it.
For example, blocking could add a small penalty to general processing, which
would give visible overhead at hundreds of frames per second.

In any case, we do not have the ability to change this API.
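
For reference, the retry pattern being discussed can be sketched in
self-contained C as follows. This is only an illustration, not the patch code:
sim_status, sim_device, sim_submit, counting_sleep and the max_retries limit
are all hypothetical stand-ins, and the status values are not the real
mfxStatus constants. The sleep is passed in as a callback so the sketch runs
without libavutil or a QSV device.

```c
#include <assert.h>

/* Stand-ins for the SDK status codes; NOT the real mfxStatus values. */
enum sim_status { SIM_OK = 0, SIM_DEVICE_BUSY = 1 };

/* Hypothetical device that reports busy for the first busy_calls calls. */
struct sim_device { int busy_calls; };

enum sim_status sim_submit(struct sim_device *dev)
{
    assert(dev);
    if (dev->busy_calls > 0) {
        dev->busy_calls--;
        return SIM_DEVICE_BUSY;
    }
    return SIM_OK;
}

/* Test stand-in for a real sleep: only records the requested time. */
unsigned total_slept_us = 0;

int counting_sleep(unsigned usec)
{
    total_slept_us += usec;
    return 0;
}

/*
 * Submit one frame, calling sleep_fn(500) after every busy result.
 * Returns the number of retries, or -1 once max_retries is exhausted
 * (the retry limit is an assumption added for this sketch).
 */
int submit_with_retry(struct sim_device *dev, int max_retries,
                      int (*sleep_fn)(unsigned usec))
{
    int retries = 0;

    while (sim_submit(dev) == SIM_DEVICE_BUSY) {
        if (retries == max_retries)
            return -1;
        sleep_fn(500);
        retries++;
    }
    return retries;
}
```

The callback type matches int av_usleep(unsigned usec) from libavutil, so in
FFmpeg code av_usleep could be passed in place of counting_sleep. A device
that is busy for three calls costs three retries and 1500us of requested
sleep, however short the real busy period was.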
Best regards,
 Ivan                            mailto:ivan.uskov at nablet.com
