[FFmpeg-devel] [Patch] Auto Insert of hwupload_cuda filter

Mark Thompson sw at jkqxz.net
Mon Jul 31 02:59:01 EEST 2017

On 26/07/17 10:18, Philip Langdale wrote:
> On 2017-07-25 09:41, Yogender Gupta wrote:
>> Currently combining CPU and CUDA filters requires insertion of
>> hwupload and hwdownload filters. I am trying to simplify this by auto
>> insertion of these filters. This patch is for hwupload_cuda, but I
>> want to do this for hwdownload as well.
>> The attached patch automatically inserts hwupload_cuda filter when it
>> detects a GPU filter after a CPU component.
>> Before the patch
>> ffmpeg.exe -y -i traffic_flow_1280x720_420.y4m -vf "hwupload_cuda,
>> scale_npp=176:144" out.h264
>> After the patch the command line works without inserting the
>> hwupload_cuda filter (auto inserted by ffmpeg)
>> ffmpeg.exe -y -i traffic_flow_1280x720_420.y4m -vf "scale_npp=176:144" out.h264
>> Thanks,
>> Yogender
> At various points in time, people have observed that hwupload_cuda should be obsolete as
> the generic hwupload filter should be sufficient in combination with a properly implemented
> hwcontext (which cuda has). I don't think anyone has validated this properly, but that
> should be step one.

I believe it works - the example above becomes something like 'ffmpeg.exe -y -init_hw_device cuda=first_cuda_gpu:0 -i traffic_flow_1280x720_420.y4m -filter_hw_device first_cuda_gpu -vf "hwupload,scale_npp=176:144" out.h264'.  (Though I suspect the line was written for a build with h264_nvenc as the only included H.264 encoder.)
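For readability, the generic-hwupload command line above can be spelled out as a small dry-run script. This is only a sketch: it prints the command rather than running it, the variable and function names are made up for illustration, and the device label is an arbitrary name rather than anything ffmpeg requires. On a build with more than one H.264 encoder, an explicit encoder selection such as h264_nvenc would probably be needed as well; that is an assumption and is not included here.

```shell
#!/bin/sh
# Dry run: assemble the generic-hwupload command line and print it.
# Nothing is executed, so no CUDA device or ffmpeg build is needed.
# All variable names are illustrative; file names come from the thread.
DEVICE_NAME=first_cuda_gpu      # arbitrary label for the device
DEVICE_INDEX=0                  # which CUDA device to open
INPUT=traffic_flow_1280x720_420.y4m
OUTPUT=out.h264
FILTERS="hwupload,scale_npp=176:144"

build_cmd() {
    # -init_hw_device creates the named device,
    # -filter_hw_device hands it to the filters that need one.
    printf 'ffmpeg -y -init_hw_device cuda=%s:%s -i %s -filter_hw_device %s -vf "%s" %s\n' \
        "$DEVICE_NAME" "$DEVICE_INDEX" "$INPUT" "$DEVICE_NAME" "$FILTERS" "$OUTPUT"
}

build_cmd
```

The same device label appears twice on purpose: once where the device is created and once where it is handed to the filter graph.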

However, there is still a gap in ffmpeg.c support in that only one device is supported for all filters being used - hwupload_cuda may still have value to get around that limitation when you have multiple devices, though I do still dislike the creation of isolated devices inside filters.  That doesn't apply to non-ffmpeg.c use, though, so maybe it's not a very good argument for having it in the library.  There was some discussion in libav of making some new syntax/options to pass multiple devices, but the current construction with string parsing is very much not amenable to it, so it hasn't been done and we ended up with the single global -filter_hw_device option - thoughts very much welcome on how to improve this.

> With that established, auto-insertion of hwupload/download should be done in a generic way;
> I'm not the person to say how that should be done, but I'm pretty confident that the
> people who know this stuff (wm4, Mark Thompson, etc) would agree this is the preferred
> approach.

General auto-insert of hwupload is pretty difficult - it needs some additional support for hardware format negotiation in lavfi, because the software formats aren't exposed in a useful way.  The current query_formats setup of hwupload (via get_hwframe_constraints()) helps to make the upload format work, but it isn't necessarily a usable format for the following filters, and in any case isn't known early enough for the method being used here.  (Consider what decisions you need to make when passing an RGB software format to some hardware filter which might only work on YUV.)
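To make the negotiation problem a bit more concrete, here is a toy sketch of the decision involved. The format lists are hypothetical placeholders, not values queried from any real device or filter, and pick_common_format is an invented helper, not part of lavfi. It only illustrates that the upload step and the following hardware filter each have their own set of usable software formats, and that something has to choose between them or insert a conversion.

```shell
#!/bin/sh
# Toy model of format negotiation between an upload step and the
# hardware filter that follows it. All format lists are hypothetical.

pick_common_format() {
    # $1: formats the upload step can take
    # $2: formats the next hardware filter can work on
    # Prints the first upload format the filter also accepts;
    # returns 1 if there is none.
    for up in $1; do
        for want in $2; do
            if [ "$up" = "$want" ]; then
                printf '%s\n' "$up"
                return 0
            fi
        done
    done
    return 1
}

upload_formats="rgb0 nv12 yuv420p"   # assumed upload constraints
filter_formats="yuv420p nv12"        # assumed YUV-only hardware filter

if fmt=$(pick_common_format "$upload_formats" "$filter_formats"); then
    echo "upload as $fmt"
else
    echo "no common format: a software conversion is needed before upload"
fi
```

With an RGB-only upload list and a YUV-only filter list the helper fails, which is the case where an automatic insert would also have to add a software conversion in front of the upload.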

On the other hand, since currently all CUDA drivers/devices support the same set of YUV formats (monoculture for the win, I guess - no older devices or different manufacturers to deal with...), the construction here probably does work ok for CUDA only.  So, dunno, maybe.

- Mark
