[FFmpeg-trac] #10936(avcodec:new): H264 decoder skips frames when seeking to (non-IDR) recovery point near end of stream

Wed Mar 27 20:35:34 EET 2024

#10936: H264 decoder skips frames when seeking to (non-IDR) recovery point near end
of stream
------------------------------------+--------------------------------------
             Reporter:  arch1t3cht  |                     Type:  defect
               Status:  new         |                 Priority:  normal
            Component:  avcodec     |                  Version:  git-master
             Keywords:  h264        |               Blocked By:
             Blocking:              |  Reproduced by developer:  0
Analyzed by developer:  0           |
------------------------------------+--------------------------------------
 ==== Summary of the bug:
 Given an H264 file with a recovery point SEI less than `2 *
 max_num_reorder_frames` frames before the end of the video stream in
 decoding order (and no IDR frames afterwards), seeking to this recovery
 point and decoding from there will cause the decoder to drop some frames.
 With multiple threads, only one frame is dropped; with a single thread,
 about `max_num_reorder_frames` frames are dropped.

 ==== How to reproduce:
 Consider the following C code, compiled with latest libavformat/libavcodec
 (`8ca57fcf9ed327a6c2d9c5345be9b7e0724ca048`)
 {{{
 #include <inttypes.h>
 #include <stdio.h>
 #include <stdlib.h>

 #include <libavformat/avformat.h>
 #include <libavcodec/avcodec.h>

 int main(int argc, char *argv[]) {
     if (argc != 5) {
         printf("Usage: %s <filename> <threads> <has_b_frames> <seek pts>\n",
                argv[0]);
         return -1;
     }

     char *filename = argv[1];
     int threads = atoi(argv[2]);
     int has_b_frames = atoi(argv[3]);
     int seek_pts = atoi(argv[4]);

     int ret = 0;
     AVFormatContext *format_ctx = NULL;

     ret = avformat_open_input(&format_ctx, filename, NULL, NULL);
     if (ret < 0) {
         printf("Failed to open file\n");
         return -1;
     }

     if (avformat_find_stream_info(format_ctx, NULL) < 0) {
         printf("Failed to find stream info\n");
         avformat_close_input(&format_ctx);
         return -1;
     }

     int video_stream = -1;
     for (size_t i = 0; i < format_ctx->nb_streams; i++) {
         if (format_ctx->streams[i]->codecpar->codec_type ==
 AVMEDIA_TYPE_VIDEO) {
             video_stream = i;
             break;
         }
     }

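     /* Bail out instead of indexing streams[-1] below. */
     if (video_stream < 0) {
         printf("No video stream found\n");
         avformat_close_input(&format_ctx);
         return -1;
     }

     av_seek_frame(format_ctx, video_stream, seek_pts, 0);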

     printf("Opening stream %d\n", video_stream);

     const AVCodec *codec =
 avcodec_find_decoder(format_ctx->streams[video_stream]->codecpar->codec_id);
     AVCodecContext *codec_ctx = avcodec_alloc_context3(codec);

     ret = avcodec_parameters_to_context(codec_ctx,
 format_ctx->streams[video_stream]->codecpar);
     if (ret < 0) {
         printf("Failed to copy codec parameters\n");
         return -1;
     }

     codec_ctx->thread_count = threads;
     codec_ctx->has_b_frames = has_b_frames;

     ret = avcodec_open2(codec_ctx, codec, NULL);
     if (ret < 0) {
         printf("Failed to open decoder\n");
         return -1;
     }

     AVPacket *pkt = av_packet_alloc();
     int received_frames = 0;

     while (1) {
         AVFrame *frame = av_frame_alloc();
         /* Reuse the outer ret instead of shadowing it. */
         ret = avcodec_receive_frame(codec_ctx, frame);
         if (ret == 0) {
             printf("        Received frame %d with PTS=%" PRId64 "\n",
                    received_frames, frame->pts);
             received_frames++;
         } else if (ret == AVERROR(EAGAIN)) {
             printf("        EAGAIN when receiving frame\n");
         } else if (ret == AVERROR_EOF) {
             printf("        EOF\n");
             av_frame_free(&frame);
             break;
         } else {
             av_frame_free(&frame);
             return -1;
         }

         av_frame_free(&frame);

         while (1) {
             ret = av_read_frame(format_ctx, pkt);

             if (ret < 0 && ret != AVERROR_EOF)
                 return -1;
             if (ret == AVERROR_EOF || pkt->stream_index == video_stream)
                 break;

             av_packet_unref(pkt);
         }

         ret = avcodec_send_packet(codec_ctx, pkt);

         if (ret != AVERROR(EAGAIN)) {
             if (pkt->data) {
                 printf("Sent packet DTS=%" PRId64 " PTS=%" PRId64 "\n",
                        pkt->dts, pkt->pts);
             } else {
                 printf("Sent empty packet\n");
             }
         } else {
             printf("Got EAGAIN when sending packet\n");
         }
         av_packet_unref(pkt);
     }

     av_packet_free(&pkt);
     avcodec_free_context(&codec_ctx);
     avformat_close_input(&format_ctx);
     return 0;
 }
 }}}

 This program opens the given file, seeks to the given PTS, sets
 `codec_ctx->has_b_frames` and the thread count to the given values, and
 then runs a standard decoding loop. (Equivalently, one could change the
 `max_num_reorder_frames` field in the file's bitstream, but setting
 `has_b_frames` is quicker.)

 Also consider the attached file `interlaced_h264.mkv`. It is an open-GOP
 file with 126 frames. Among other recovery points, it has one at PTS 3840,
 which is frame 93 in decoding order (counting from 0); the last frame has
 PTS 5000. The value of `max_num_reorder_frames` in the bitstream is 2, but
 instead of repeatedly changing that value in the bitstream we simulate
 other values here by setting `codec_ctx->has_b_frames`.

 Running `./myprogram interlaced_h264.mkv 16 2 0` shows that all frames are
 output as usual. When seeking to PTS 3840 with `./myprogram
 interlaced_h264.mkv 16 2 3840`, all frames after PTS 3840 are properly
 output, resulting in 29 output frames:
 {{{
 [...]
         Received frame 27 with PTS=4960
 Sent empty packet
         Received frame 28 with PTS=5000
 Sent empty packet
         EOF
 }}}
 However, when setting `has_b_frames=15` instead with `./myprogram
 interlaced_h264.mkv 16 15 3840`, a frame is skipped:
 {{{
 [...]
 Sent empty packet
         Received frame 14 with PTS=4400
 Sent empty packet
         Received frame 15 with PTS=4440
 Sent empty packet
         Received frame 16 with PTS=4520
 Sent empty packet
         Received frame 17 with PTS=4560
 [...]
         Received frame 28 with PTS=5000
 Sent empty packet
         EOF
 }}}
 We see that the frame with PTS 4480 is no longer output. The same happens
 with two threads (`./myprogram interlaced_h264.mkv 2 15 3840`). With just
 one thread (`./myprogram interlaced_h264.mkv 1 15 3840`), the decoder
 outputs nothing after the frame with PTS 4440 and then returns EOF:
 {{{
 [...]
         Received frame 14 with PTS=4400
 Sent empty packet
         Received frame 15 with PTS=4440
 Sent empty packet
         EOF
 }}}

 ==== Analysis:
 The problematic frame with PTS 4480 is skipped because its `recovered`
 value is 0, so it's not output by `finalize_frame` in `h264dec.c`.
 Normally, frames to be output are marked as `recovered` in
 `h264_select_output_frame` as soon as the `H264Context`'s
 `frame_recovered` field is set (which happens as soon as the recovery
 point's frame is output by `h264_select_output_frame`). They're also
 marked as `recovered` in `h264_field_start` when `h->frame_recovered` is
 set.
 However, the frame with PTS 4480 is decoded and added to the delayed
 picture buffer before the recovery point frame is output, so
 `h->frame_recovered` is not yet set and `h264_field_start` does not mark
 the frame as `recovered`. The frame is then only output during draining,
 after all packets are consumed, so it goes through
 `send_next_delayed_frame` rather than `h264_select_output_frame`.
 `send_next_delayed_frame` does not set `recovered`, so `finalize_frame`
 does not output the frame. Depending on the threading logic, this
 either skips the frame or stops decoding entirely.

 So, the precise bug is probably that `send_next_delayed_frame` does not
 set `recovered`, but I'll leave discussing specific patches to the mailing
 list.
-- 
Ticket URL: <https://trac.ffmpeg.org/ticket/10936>
FFmpeg <https://ffmpeg.org>
FFmpeg issue tracker