[FFmpeg-devel] [PATCH] libavcodec/nvenc.c: copy incoming hwaccel frames instead of ref count increase

Tue Sep 28 20:58:16 EEST 2021

On 28.09.2021 18:22, Roman Arzumanyan wrote:
> Hello,
> 
> This patch makes nvenc copy incoming hwaccel frames instead of ref count increase.
> It fixes the bug which may happen when on-GPU transcoding is done and encoder is set to use B frames.
> 
> How to reproduce the bug:
> ./ffmpeg \
>    -hwaccel cuda -hwaccel_output_format cuda \
>    -i input.mkv \
>    -c:v h264_nvenc -preset p4 -tune hq -bf 3 \
>    -y output.mkv
> 
> Expected output:
> [h264 @ 0x55b14da4b4c0] No decoder surfaces left
> [h264 @ 0x55b14da682c0] No decoder surfaces left
> [h264 @ 0x55b14da850c0] No decoder surfaces left
> [h264 @ 0x55b14daa1ec0] No decoder surfaces left
> Error while decoding stream #0:0: Invalid data found when processing input
> [h264 @ 0x55b14da2e6c0] No decoder surfaces left
> Error while decoding stream #0:0: Invalid data found when processing input
>      Last message repeated 1 times
> 
> 
> Although fix adds extra CUDA DtoD memcopy, our internal testing results didn't show any noticeable difference in transcoding performance.
> 

Hmm, so far my approach to deal with this was to inject a 
scale_cuda=passthrough=0 into the filter chain, which pretty much does 
exactly this, but only controllable by the user.

But I do agree that this is a bit of a crutch and not all that user 
friendly.

My main concern with this approach is that it will inevitably increase 
VRAM usage, and depending on B-frame count and resolution the increase 
can be quite significant.
It's also surprisingly common for users to show up who are very short 
on memory. When B-frames were switched on by default, several people 
showed up who were suddenly running out of VRAM.
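
To put a rough number on that, here is a back-of-the-envelope sketch. 
It assumes 8-bit NV12 surfaces, tightly packed (real allocations are 
pitch-aligned and therefore larger), and one extra copy per frame the 
encoder is holding on to. The helper names and the frames_in_flight 
parameter are made up for illustration and are not anything in nvenc.c.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes for one tightly packed 8-bit NV12 frame: a full-resolution luma
 * plane plus an interleaved chroma plane at half resolution in both
 * dimensions. Assumption: no pitch alignment, so this is a lower bound. */
static uint64_t nv12_frame_bytes(uint32_t width, uint32_t height)
{
    assert(width % 2 == 0 && height % 2 == 0);
    return (uint64_t)width * height * 3 / 2;
}

/* Extra VRAM if every frame held by the encoder is a private copy.
 * frames_in_flight is a hypothetical parameter: how many input frames
 * the encoder keeps alive at once (it grows with the B-frame count). */
static uint64_t extra_copy_bytes(uint32_t width, uint32_t height,
                                 uint32_t frames_in_flight)
{
    return nv12_frame_bytes(width, height) * frames_in_flight;
}
```

Under those assumptions, a 3840x2160 stream with 8 frames held at once 
comes to a bit under 100 MB of additional VRAM per encoder instance.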

I do like this approach though, since it will for the average user make 
using a full hw chain a lot less bothersome.

So what I'd propose is:

- Add an option to retain the old behaviour of just holding a reference 
to the input frame no matter what.
- Instead of explicitly copying the frame like you do right now, call 
av_frame_make_writable() on the frame, right after the av_frame_ref 
call that your patch currently replaces with av_hwframe_transfer_data.
For one, that is very easy to disable conditionally, and it does not 
require you to guard all the unref calls.
Plus, it will only actually copy the frame if needed (i.e. it won't do 
anything if the frame comes out of a filterchain and nothing else is 
holding a ref).
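
The "only copies if needed" part is plain copy-on-write on a reference 
count. Here is a toy model of the idea. The Toy* types and functions 
are invented for this sketch; they are not the libavutil structures or 
the actual av_frame_make_writable() implementation, and they use host 
memory where the real thing would deal with device surfaces.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a reference-counted buffer (NOT an AVBufferRef). */
typedef struct ToyBuf {
    int            refcount;
    size_t         size;
    unsigned char *data;
} ToyBuf;

/* Toy stand-in for a frame that points at one shared buffer. */
typedef struct ToyFrame {
    ToyBuf *buf;
} ToyFrame;

static ToyBuf *toy_buf_alloc(size_t size)
{
    ToyBuf *b = malloc(sizeof(*b));
    if (!b)
        return NULL;
    b->data = calloc(1, size);
    if (!b->data) {
        free(b);
        return NULL;
    }
    b->refcount = 1;
    b->size     = size;
    return b;
}

/* Make dst share src's buffer; this is the cheap "hold a ref" path. */
static void toy_frame_ref(ToyFrame *dst, const ToyFrame *src)
{
    assert(src->buf);
    src->buf->refcount++;
    dst->buf = src->buf;
}

/* Drop the frame's reference; free the buffer when the last ref goes. */
static void toy_frame_unref(ToyFrame *f)
{
    if (f->buf && --f->buf->refcount == 0) {
        free(f->buf->data);
        free(f->buf);
    }
    f->buf = NULL;
}

/* Copy-on-write: a no-op if we are the sole owner, otherwise replace
 * our shared reference with a private copy. Returns 0 on success and
 * -1 on allocation failure (the frame is left untouched in that case). */
static int toy_frame_make_writable(ToyFrame *f)
{
    ToyBuf *copy;

    assert(f->buf);
    if (f->buf->refcount == 1)
        return 0;                 /* nobody else holds it: no copy */

    copy = toy_buf_alloc(f->buf->size);
    if (!copy)
        return -1;
    memcpy(copy->data, f->buf->data, f->buf->size);

    f->buf->refcount--;           /* was > 1, so the original stays alive */
    f->buf = copy;
    return 0;
}
```

In this model the decoder's buffer is released back to its owner as 
soon as the encoder-side frame has been made writable, while a frame 
that arrives with a single reference is passed through without a copy.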

Timo
