[FFmpeg-trac] #10936(avcodec:new): H264 decoder skips frames when seeking to (non-IDR) recovery point near end of stream
FFmpeg
trac at avcodec.org
Wed Mar 27 20:35:34 EET 2024
#10936: H264 decoder skips frames when seeking to (non-IDR) recovery point near end
of stream
------------------------------------+--------------------------------------
Reporter: arch1t3cht | Type: defect
Status: new | Priority: normal
Component: avcodec | Version: git-master
Keywords: h264 | Blocked By:
Blocking: | Reproduced by developer: 0
Analyzed by developer: 0 |
------------------------------------+--------------------------------------
==== Summary of the bug:
Given an H264 file with a recovery point SEI less than `2 *
max_num_reorder_frames` frames before the end of the video stream in
decoding order (and no IDR frames afterwards), seeking to this recovery
point and decoding from there will cause the decoder to drop some frames.
With multiple threads, only one frame will be dropped, with only one
thread it will be about `max_num_reorder_frames` frames.
==== How to reproduce:
Consider the following C code, compiled with latest libavformat/libavcodec
(`8ca57fcf9ed327a6c2d9c5345be9b7e0724ca048`)
{{{
#include <libavformat/avformat.h>
#include <libavcodec/avcodec.h>
int main(int argc, char *argv[]) {
if (argc != 5) {
printf("Usage: %s <filename> <threads> <has_b_frames> <seek pts>",
argv[0]);
return -1;
}
char *filename = argv[1];
int threads = atoi(argv[2]);
int has_b_frames = atoi(argv[3]);
int seek_pts = atoi(argv[4]);
int ret = 0;
AVFormatContext *format_ctx = NULL;
ret = avformat_open_input(&format_ctx, filename, NULL, NULL);
if (ret < 0) {
printf("Failed to open file\n");
return -1;
}
if (avformat_find_stream_info(format_ctx, NULL) < 0) {
printf("Failed to find stream info\n");
avformat_close_input(&format_ctx);
}
int video_stream = -1;
for (size_t i = 0; i < format_ctx->nb_streams; i++) {
if (format_ctx->streams[i]->codecpar->codec_type ==
AVMEDIA_TYPE_VIDEO) {
video_stream = i;
break;
}
}
av_seek_frame(format_ctx, video_stream, seek_pts, 0);
printf("Opening stream %d\n", video_stream);
const AVCodec *codec =
avcodec_find_decoder(format_ctx->streams[video_stream]->codecpar->codec_id);
AVCodecContext *codec_ctx = avcodec_alloc_context3(codec);
ret = avcodec_parameters_to_context(codec_ctx,
format_ctx->streams[video_stream]->codecpar);
if (ret < 0) {
printf("Failed to copy codec parameters\n");
return -1;
}
codec_ctx->thread_count = threads;
codec_ctx->has_b_frames = has_b_frames;
ret = avcodec_open2(codec_ctx, codec, NULL);
if (ret < 0) {
printf("Failed to open decoder\n");
return -1;
}
AVPacket *pkt = av_packet_alloc();
int received_frames = 0;
while (1) {
AVFrame *frame = av_frame_alloc();
int ret = avcodec_receive_frame(codec_ctx, frame);
if (ret == 0) {
printf(" Received frame %d with PTS=%ld \n",
received_frames, frame->pts);
received_frames++;
} else if (ret == AVERROR(EAGAIN)) {
printf(" EAGAIN when receiving frame\n");
} else if (ret == AVERROR_EOF) {
printf(" EOF\n");
break;
} else {
return -1;
}
av_frame_free(&frame);
while (1) {
ret = av_read_frame(format_ctx, pkt);
if (ret < 0 && ret != AVERROR_EOF)
return -1;
if (ret == AVERROR_EOF || pkt->stream_index == video_stream)
break;
av_packet_unref(pkt);
}
ret = avcodec_send_packet(codec_ctx, pkt);
if (ret != AVERROR(EAGAIN)) {
if (pkt->data) {
printf("Sent packet DTS=%ld PTS=%ld\n", pkt->dts,
pkt->pts);
} else {
printf("Sent empty packet\n");
}
} else {
printf("Got EAGAIN when sending packet\n");
}
av_packet_unref(pkt);
}
av_packet_free(&pkt);
avformat_close_input(&format_ctx);
return 0;
}
}}}
This just opens the given file, seeks to the given PTS, sets
`codec_ctx->has_b_frames` (equivalently, one could change the
`max_num_reorder_frames` field in the file's bitstream, but this is
quicker) and the number of threads to the given value, and then runs a
standard decoding loop.
Also consider the attached file `interlaced_h264.mkv`. The file is an
open-gop file with 126 frames and has a recovery point (among others) at
PTS 3840 (frame 93 in decoding order, counting from 0, the last frame has
PTS 5000). The value of `max_num_reorder_frames` in the bitstream is 2
(but instead of repeatedly changing the value in the bitstream we simulate
this here by setting `codec_ctx->has_b_frames`).
Running `./myprogram interlaced_h264.mkv 16 2 0` shows that all frames are
output as usual. When seeking to PTS 3840 with `./myprogram
interlaced_h264.mkv 16 2 3840`, all frames after PTS 3840 are properly
output, resulting in 29 output frames:
{{{
[...]
Received frame 27 with PTS=4960
Sent empty packet
Received frame 28 with PTS=5000
Sent empty packet
EOF
}}}
However, when setting `has_b_frames=15` instead with `./myprogram
interlaced_h264.mkv 16 15 3840`, a frame is skipped:
{{{
[...]
Sent empty packet
Received frame 14 with PTS=4400
Sent empty packet
Received frame 15 with PTS=4440
Sent empty packet
Received frame 16 with PTS=4520
Sent empty packet
Received frame 17 with PTS=4560
[...]
Received frame 28 with PTS=5000
Sent empty packet
EOF
}}}
We see that the frame with PTS 4480 is no longer output. The same happens
with two threads (`./myprogram interlaced_h264.mkv 2 15 3840`). With just
one thread (`./myprogram interlaced_h264.mkv 1 15 3840``), the decoder
stops completely after frame 4440 and returns EOF after that:
{{{
[...]
Received frame 14 with PTS=4400
Sent empty packet
Received frame 15 with PTS=4440
Sent empty packet
EOF
}}}
==== Analysis:
The problematic frame with PTS 4480 is skipped because its `recovered`
value is 0, so it's not output by `finalize_frame` in `h264dec.c`.
Normally, frames to be output are marked as `recovered` in
`h264_select_output_frame` as soon as the `H264Context`'s
`frame_recovered` field is set (which happens as soon as the recovery
point's frame is output by `h264_select_output_frame`). They're also
marked as `recovered` in `h264_field_start` when `h->frame_recovered` is
set.
However, the frame with PTS 4480 is decoded and added to the delayed
picture buffer before the recovery point frame is output, so it's not set
to `recovered` in `h264_field_start`. But the frame is only output during
draining after all packets are consumed, so it's output with
`send_next_delayed_frame` and not `h264_select_output_frame`. In
`send_next_delayed_frame`, the frame is not set to `recovered`, so it's
not output by `finalize_frame`. Depending on the threading logic, this
either skips the frame or stops decoding entirely.
So, the precise bug is probably that `send_next_delayed_frame` does not
set `recovered`, but I'll leave discussing specific patches to the mailing
list.
--
Ticket URL: <https://trac.ffmpeg.org/ticket/10936>
FFmpeg <https://ffmpeg.org>
FFmpeg issue tracker
More information about the FFmpeg-trac
mailing list