[FFmpeg-trac] #10374(avcodec:new): Problems to decode certain h264 keyframes with D3D11VA and DXVA2 hw acceleration
FFmpeg
trac at avcodec.org
Fri May 19 00:50:52 EEST 2023
#10374: Problems to decode certain h264 keyframes with D3D11VA and DXVA2 hw
acceleration
-------------------------------------+-------------------------------------
Reporter: Florian Grill              | Type: defect
Status: new                          | Priority: normal
Component: avcodec                   | Version: 6.0
Keywords: H264 dxva2 d3d11va         | Blocked By:
Blocking:                            | Reproduced by developer: 1
Analyzed by developer: 1             |
-------------------------------------+-------------------------------------
Summary of the bug:
I'm currently developing a streaming app for Windows, and my project uses
the native ffmpeg libraries through javacv
[https://github.com/bytedeco/javacv]. Developing the first prototype was
quite easy, but when testing on different machines with different
hardware decoders I noticed strange behaviour with the D3D11VA and
DXVA2 video decoders. Certain keyframes could not be decoded, although
cuvid and the software decoder had no problem with them. On these
problematic keyframes I always received an error when calling
{{{
final int err = avcodec_send_packet(this.m_VideoDecoderCtx, this.pkt);
}}}
The logs printed ''"Invalid data found when processing input"'', but I
was sure that the data was correct, so I dug into the source code, did
some research on the internet and found the cause. In h264dec.h there is
a constant called MAX_SLICES which is set to 32.
[https://github.com/FFmpeg/FFmpeg/blob/release/6.0/libavcodec/h264dec.h#L62]
This MAX_SLICES value is used in dxva2_h264.c
[https://github.com/FFmpeg/FFmpeg/blob/release/6.0/libavcodec/dxva2_h264.c#L478]
with the result that frames containing more slices than this limit simply
can't be processed. This seems like a strange limitation to me.
Other popular ffmpeg forks already increased this limit, e.g.
[https://git.1f0.de/gitweb/?p=ffmpeg.git;a=commit;h=550cf548b546d386a6c634351ad0c250a3e47f3b;js=1]
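To make the effect of the limit concrete, here is a minimal C sketch of a bounded per-frame slice table. It is a simplified model and not FFmpeg's code: `PictureModel`, `add_slice` and `submit_frame` are hypothetical names, and the 40-slice frame mentioned below is an invented example. Only the two limits come from this report: 32 (the current MAX_SLICES) and 256 (the value used in the patched build).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-picture hwaccel state. */
typedef struct PictureModel {
    unsigned slice_count; /* slices accepted so far for this frame */
    unsigned limit;       /* capacity of the slice table (MAX_SLICES) */
} PictureModel;

/* Returns 0 if the slice fits, -1 once the table is already full. */
int add_slice(PictureModel *pic)
{
    assert(pic != NULL);
    if (pic->slice_count >= pic->limit)
        return -1;
    pic->slice_count++;
    return 0;
}

/* Submits `slices` slices and returns how many were accepted. */
unsigned submit_frame(unsigned limit, unsigned slices)
{
    PictureModel pic = { 0, limit };
    for (unsigned i = 0; i < slices; i++) {
        if (add_slice(&pic) < 0)
            break; /* the rest of the frame is lost */
    }
    return pic.slice_count;
}
```

In this model `submit_frame(32, 40)` stops after 32 slices, while `submit_frame(256, 40)` accepts all 40, which matches the observation that raising the limit makes the affected keyframes decodable.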
In my example I used
{{{
this.m_VideoDecoderCtx.err_recognition(this.m_VideoDecoderCtx.err_recognition()
| AV_EF_EXPLODE);
}}}
to get a proper error output, but for testing I removed it, and the frame
that was processed and displayed looked like this
[[Image(https://github.com/grill2010/ExampleDXVA2/blob/main/example_data/test.jpg)]]
This behaviour can be easily reproduced with the latest ffmpeg version.
The example h264 keyframe can be downloaded from here
[https://github.com/grill2010/ExampleDXVA2/blob/main/example_data/example.h264]
How to reproduce:
{{{
% .\ffmpeg.exe -hwaccel dxva2 -i .\example.h264 test.jpg
or
% .\ffmpeg.exe -hwaccel d3d11va -i .\example.h264 test.jpg
ffmpeg version 6.0 and latest SNAPSHOT
built via https://www.gyan.dev/ffmpeg/builds/
}}}
The ffmpeg build of javacv recently received a patch that raises this low
slice limit
[https://github.com/bytedeco/javacpp-presets/commit/63acf680ef0d95cbdda1b3840450e4333a78bde0]
and I can confirm that this fixes the issue completely (at least for
D3D11VA). I also found another ticket which might be related to this
problem
[https://trac.ffmpeg.org/ticket/9771]
----
As mentioned, a fix was applied to the javacv ffmpeg build, so I compiled
that ffmpeg version for Windows x86_64 with the increased MAX_SLICES for
further testing. D3D11VA works perfectly fine now, but I noticed another
issue with DXVA2. On certain keyframes I still received an error, but a
different one this time. When calling
{{{
final int err = avcodec_send_packet(this.m_VideoDecoderCtx, this.pkt);
}}}
I now receive
{{{
[h264 @ 0000027a270a1680] Buffer for type 5 was too small. size: 58752,
dxva_size: 55296
[h264 @ 0000027a270a1680] Failed to add bitstream or slice control buffer
[h264 @ 0000027a270a1680] hardware accelerator failed to decode picture
Error while decoding stream #0:0: Operation not permitted
}}}
Upon further investigation it turned out that decoding of these keyframes
fails because the buffer returned by the IDirectXVideoDecoder_GetBuffer
function is too small, and I don't have any explanation why. DXVA2 works
perfectly fine for h265, by the way (it is affected neither by the
MAX_SLICES problem nor by the too-small buffer problem).
[https://github.com/FFmpeg/FFmpeg/blob/9d70e74d255dbe37af52b0efffc0f93fd7cb6103/libavcodec/dxva2.c#L817]
I have now tested this behaviour on 4 different PCs, and the pattern I
found is that it works on PCs with AMD or Intel GPUs, but with an NVIDIA
GPU you get this too-small buffer problem.
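The first log line reads like a plain size comparison: the data to submit (58752 bytes) exceeds the buffer handed out by the driver (55296 bytes). The C sketch below models that check; `commit_buffer` and `entries_in` are hypothetical helpers, not the FFmpeg functions. It also carries one unverified assumption: that buffer type 5 is the slice control buffer and that each entry is an 864-byte long-format H.264 slice structure. Under that assumption both numbers divide evenly, 58752 = 68 × 864 and 55296 = 64 × 864, which would suggest a 68-slice frame meeting a driver buffer sized for 64 slices. Treat this as a hypothesis to test, not a diagnosis.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of the size check behind
 * "Buffer for type 5 was too small"; not the FFmpeg implementation. */
int commit_buffer(void *dst, size_t dst_size,
                  const void *src, size_t src_size)
{
    assert(dst != NULL && src != NULL);
    if (src_size > dst_size)
        return -1;              /* data does not fit the driver buffer */
    memcpy(dst, src, src_size); /* fits: copy into the driver buffer */
    return 0;
}

/* ASSUMPTION: size in bytes of one slice control entry. */
enum { SLICE_ENTRY_SIZE = 864 };

/* Number of whole slice entries that fit into `bytes`. */
size_t entries_in(size_t bytes)
{
    return bytes / SLICE_ENTRY_SIZE;
}
```

If the assumption holds, a quick experiment would be to log the slice count of the failing keyframe and compare it with `entries_in(dxva_size)` on each GPU vendor.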
How to reproduce:
{{{
% .\ffmpeg.exe -hwaccel dxva2 -i .\example.h264 test.jpg
ffmpeg version with a patch which sets MAX_SLICES in dxva2_h264.c to 256
built via javacv (java-presets build process)
}}}
I have provided here a Windows x64 ffmpeg build with that MAX_SLICES patch
[https://github.com/grill2010/ExampleDXVA2/tree/main/example_data]
----
I know that these are probably two bug reports in one, but to me they are
related. The first problem, the low slice limit, is probably an easy fix;
the question is why it is like that in the first place. Other hw
decoders, for example the dxva2 hevc decoder, override this small slice
value, but dxva2 h264 does not. Is there any reason for that? This
considerably limits h264 hw acceleration for DXVA2 and D3D11VA.
For the second problem, the too-small buffer on DXVA2, I have absolutely
no idea what could cause it or why it (seemingly) only occurs on devices
with an NVIDIA GPU. I would like to do some further testing but don't
know where to start.
I tried to provide as much detail as possible. If there are any
questions, just let me know.
--
Ticket URL: <https://trac.ffmpeg.org/ticket/10374>
FFmpeg <https://ffmpeg.org>