[FFmpeg-devel] [PATCH] libavcodec/cuviddec.c: increase CUVID_DEFAULT_NUM_SURFACES
Scott Theisen
scott.the.elm at gmail.com
Sat Feb 22 04:52:55 EET 2025
On 2/21/25 08:26, Timo Rothenpieler wrote:
> On 20.02.2025 21:37, Scott Theisen wrote:
>> The default value of CuvidContext::nb_surfaces was reduced from 25 to
>> 5 (as
>> (CUVID_MAX_DISPLAY_DELAY + 1)) in
>> 402d98c9d467dff6931d906ebb732b9a00334e0b.
>>
>> In cuvid_is_buffer_full() delay can be 2 * CUVID_MAX_DISPLAY_DELAY
>> with double
>> rate deinterlacing. ctx->nb_surfaces is CUVID_DEFAULT_NUM_SURFACES =
>> (CUVID_MAX_DISPLAY_DELAY + 1) by default, in which case
>> cuvid_is_buffer_full()
>> will always return true and cuvid_output_frame() will never read any
>> data since
>> it will not call ff_decode_get_packet().
>
> It's been way too long since I looked at all that code, and I didn't
> even write most of the code involved:
>> https://github.com/FFmpeg/FFmpeg/commit/bddb2343b6e594e312dadb5d21b408702929ae04
>>
>> https://github.com/FFmpeg/FFmpeg/commit/402d98c9d467dff6931d906ebb732b9a00334e0b
>>
>
> But doesn't this instead mean that the logic in cuvid_is_buffer_full
> is flawed somehow?
I think it is the number of frames ready to send to the driver + the
number of frames in queue in the driver >= the number of decoded frame
buffers. However, it doesn't actually know how many frames are in queue
in the driver and assumes the maximum.
> Just increasing the default number of surfaces does not seem like the
> correct fix or sensible, since it will increase VRAM usage by
> potentially quite a bit for all users.
>
The changes to cuvid_handle_video_sequence() from
402d98c9d467dff6931d906ebb732b9a00334e0b will increase nb_surfaces once
data has been read.
> From looking at this a bit, the issue will only happen when
> deinterlacing, the logic in cuvid_is_buffer_full becomes stuck then,
> and will always claim the buffer is full.
> And from my understanding, it's correct in making that claim. Due to
> the display delay, it could in theory happen that the moment cuvid
> starts outputting frames, there will be more output available than
> what fits into ctx->frame_queue, since it delayed by 4 frames, which
> results in 8 surfaces, but the queue only fits 5.
>
> So to me it looks like that the correct fix would be to double the
> size of the frame_queue when deinterlacing, not unconditionally.
There is nothing stopping deint_mode or drop_second_field from being
changed after cuvid_decode_init() is called, so it doesn't necessarily
know it will deinterlace.
Regardless, 402d98c9d467dff6931d906ebb732b9a00334e0b reduced
CUVID_DEFAULT_NUM_SURFACES from 25 to *only 5* to not break playback
entirely. I don't think the intention was to break playback for double
rate deinterlacing while allowing playback for only single rate
deinterlacing.
Also, if AV_CODEC_FLAG_LOW_DELAY is set, then only one output surface is
needed, but there are still 5.
>
> nb_surfaces is also used to determine the size of the key_frame array,
> which would then also be pointlessly doubled. But not like a handful
> of extra ints would hurt that much though.
> Alternatively the size-doubling could not be reflected in nb_surfaces,
> but that would make the logic in various other places be more
> complicated.
>
>> ---
>>
>> I think part of the problem might be that cuvid_is_buffer_full() does
>> not know
>> how many frames are actually in the driver's queue and assumes it is the
>> maximum, even if none have yet been added.
>>
>> This was preventing any frames from being decoded using NVDEC with
>> MythTV for
>> some streams. See https://github.com/MythTV/mythtv/issues/1039
>
> I'd highly recommend to not use cuviddec anymore, but instead use nvdec.
> cuviddec only still exists to sanity-check nvdec against it at times.
.*_cuvid are FFCodecs while .*_nvdec are FFHWAccels. I don't know what
would be required to change to .*_nvdec and the .*_cuvid FFCodecs work
fine with this change.
>
>> ---
>> libavcodec/cuviddec.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/libavcodec/cuviddec.c b/libavcodec/cuviddec.c
>> index 67076a1752..05dcafab6e 100644
>> --- a/libavcodec/cuviddec.c
>> +++ b/libavcodec/cuviddec.c
>> @@ -120,7 +120,7 @@ typedef struct CuvidParsedFrame
>> #define CUVID_MAX_DISPLAY_DELAY (4)
>> // Actual pool size will be determined by parser.
>> -#define CUVID_DEFAULT_NUM_SURFACES (CUVID_MAX_DISPLAY_DELAY + 1)
>> +#define CUVID_DEFAULT_NUM_SURFACES ((2 * CUVID_MAX_DISPLAY_DELAY) + 1)
>> static int CUDAAPI cuvid_handle_video_sequence(void *opaque,
>> CUVIDEOFORMAT* format)
>> {
More information about the ffmpeg-devel
mailing list