[FFmpeg-devel] Designing a subtitles infrastructure (was: DVBSUBS and TXTSUBS)

Wolfram Gloger wmglo at dent.med.uni-muenchen.de
Fri May 25 16:01:47 CEST 2012


Nicolas George <nicolas.george at normalesup.org> writes:

> That would be nice. As I said in the Trac ticket you were pointing at, the
> hard part with subtitles is that they are a sparse stream. For example, take
> the following SRT snippet:
>
> 1
> 00:00:01,000 --> 00:00:02,000
> This is the beginning.
>
> 2
> 00:01:01,000 --> 00:01:02,000
> This is the end.
>
> Mux it with a big video, and then:
>
> ./ffmpeg_g -i video+sub.mkv -i audio.ogg \
>   -f matroska -vcodec libx264 -preset ultrafast -crf 0 -y /dev/null
>
> ... and watch ffmpeg eat all the memory.
>
> The reason for that is this: ffmpeg always wants a packet from the oldest
> stream. After two seconds of encoding, the oldest stream is the subtitle
> stream, so it will read from the video+sub file and get a video packet. The
> video packets will be stuck in the muxing infrastructure, waiting for audio
> packets for proper interleaving on output. Unfortunately, the sub stream is
> still the oldest, so ffmpeg does not try to read from the audio file.
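
If I understand the description, the input-selection policy boils down
to something like the following sketch (illustrative only: InputFile,
last_dts and pick_input() are made-up names here, not the actual
ffmpeg.c structures):

    #include <stdint.h>

    /* One entry per input file; last_dts tracks the timestamp of the
     * most recent packet seen on that input's oldest stream, in
     * AV_TIME_BASE units. */
    typedef struct InputFile {
        int64_t last_dts;
        int     eof;
    } InputFile;

    /* Always read from the input whose oldest stream has the smallest
     * timestamp.  A sparse subtitle stream keeps its input "oldest"
     * for minutes at a time, so the other inputs are never read. */
    static int pick_input(const InputFile *files, int nb_inputs)
    {
        int     best     = -1;
        int64_t best_dts = INT64_MAX;

        for (int i = 0; i < nb_inputs; i++) {
            if (files[i].eof)
                continue;
            if (files[i].last_dts < best_dts) {
                best_dts = files[i].last_dts;
                best     = i;
            }
        }
        return best; /* -1 once every input has hit EOF */
    }

With the SRT above, the subtitle stream's last_dts sits at 2 seconds
until the next cue at 61 seconds, so pick_input() keeps returning the
video+sub file, and every video packet read meanwhile has to be
buffered.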

Please check libavformat/utils.c:ff_interleave_packet_per_dts().
We already have a threshold there for noninterleaved streams: once
delta_dts_max exceeds 20 seconds, the buffered packets are flushed.
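
From memory, the relevant part looks roughly like this (a simplified
paraphrase of the current code, not a verbatim quote; field names are
approximate and error handling is omitted):

    /* Count streams that have packets queued in the muxer's buffer.
     * Streams without queued packets are only tolerated if they are
     * subtitle streams, which are expected to be sparse. */
    for (i = 0; i < s->nb_streams; i++) {
        if (s->streams[i]->last_in_packet_buffer)
            stream_count++;
        else if (s->streams[i]->codec->codec_type == AVMEDIA_TYPE_SUBTITLE)
            noninterleaved_count++;
    }

    if (s->nb_streams == stream_count) {
        flush = 1;  /* every stream has data: normal interleaving */
    } else if (!flush) {
        /* delta_dts_max: how far ahead the newest buffered packet is,
         * in AV_TIME_BASE units (the real code measures this relative
         * to the oldest buffered packet). */
        for (i = 0; i < s->nb_streams; i++)
            if (s->streams[i]->last_in_packet_buffer)
                delta_dts_max = FFMAX(delta_dts_max,
                    av_rescale_q(s->streams[i]->last_in_packet_buffer->pkt.dts,
                                 s->streams[i]->time_base, AV_TIME_BASE_Q));

        /* Only subtitle streams are missing and more than 20 seconds
         * of packets are already buffered: stop waiting and flush. */
        if (s->nb_streams == stream_count + noninterleaved_count &&
            delta_dts_max > 20 * AV_TIME_BASE)
            flush = 1;
    }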

Why isn't this working in your case?  Is it because that mechanism is
per input context, and you have two of them?

Regards,
Wolfram.

