[FFmpeg-trac] #4536(ffmpeg:new): mkv audio reencoding leads to nonuniform video timecodes
FFmpeg
trac at avcodec.org
Wed May 2 14:03:02 EEST 2018
#4536: mkv audio reencoding leads to nonuniform video timecodes
------------------------------------+----------------------------------
Reporter: sneaker | Owner:
Type: defect | Status: new
Priority: normal | Component: ffmpeg
Version: git-master | Resolution:
Keywords: | Blocked By:
Blocking: | Reproduced by developer: 0
Analyzed by developer: 0 |
------------------------------------+----------------------------------
Comment (by mkver):
1. I think I know why this is happening. The code for offsetting the
timestamps is in the write_packet function in libavformat/mux.c. The video
in the sample uses two reorder frames, so the first two video packets
have a dts that is smaller than the dts of the first audio packet. They
are therefore interleaved first and arrive first at write_packet, where
no offset is set yet (neither video packet has a negative pts, so no
shifting is required). But when the first audio packet (with negative pts
due to encoder delay) arrives, a shift becomes necessary, and it affects
only the packets that follow in coding order. This also explains why
MKVToolNix's timecode/timestamp files, which ignore coding order, are
unsuited for analysing this. Here is mkvinfo's output for
ffmpeg_opus.mkv confirming what I just said:
{{{
Track 1: video, codec ID: V_MPEG4/ISO/AVC (h.264 profile: High @L3.1),
mkvmerge/mkvextract track ID: 0, language: und, default duration: 41.708ms
(23.976 frames/fields per second for a video track), pixel width: 1280,
pixel height: 720, display width: 1280, display height: 720
Track 2: audio, codec ID: A_OPUS, mkvmerge/mkvextract track ID: 1,
language: und, channels: 1, sampling freq: 48000, bits per sample: 16
I frame, track 1, timestamp 00:00:00.000000000, size 943, adler 0xd5581006
P frame, track 1, timestamp 00:00:00.167000000, size 40, adler 0x98d0052f
I frame, track 2, timestamp 00:00:00.000000000, size 3, adler 0x05e702f6
P frame, track 1, timestamp 00:00:00.090000000, size 37, adler 0x63b003e4
I frame, track 2, timestamp 00:00:00.021000000, size 3, adler 0x05e702f6
I frame, track 2, timestamp 00:00:00.041000000, size 3, adler 0x05e702f6
P frame, track 1, timestamp 00:00:00.049000000, size 37, adler 0x50a503b5
I frame, track 2, timestamp 00:00:00.061000000, size 3, adler 0x05e702f6
I frame, track 2, timestamp 00:00:00.081000000, size 3, adler 0x05e702f6
P frame, track 1, timestamp 00:00:00.132000000, size 37, adler 0x4fb803ae
...
}}}
If one encodes with the libfdk_aac encoder (whose encoder delay of 2048
samples, about 42.7 ms at the sample's 48 kHz, is longer than one frame
at 24/1.001 fps), all video frames except the very first one are offset.
2. This happens with encoder delay in general; it is not Opus-specific.
Opus should nevertheless be treated specially in this regard: the
CodecDelay header field already indicates the delay, so using this header
field and also baking the delay into the timestamps is wrong. But that is
another issue.
3. Judging from this, I think that the decisions regarding delay should be
made before any packet is written so that packets from all tracks (or all
tracks for which packets are available at the beginning) can be
considered.
4. For a container like Matroska, for which the offset decision is based
upon pts (by default), there is another issue that can happen and that
could also be fixed by not making the decision about the offset in
write_packet: Just because the first packet of a track has a lower dts
than the first packet of another track does not mean that the first track
needs a bigger offset. That's because the difference between dts and pts
can differ between the tracks. An example: Imagine a video track with
(say) 24/1.001 fps and two reorder frames whose first packet has a pts of
-1 ms (easily created with -itsoffset) and therefore a dts of about
-84 ms. If one has e.g. an audio track whose first packet has a pts=dts
in between -1 ms and -84 ms (given the encoder delay, this can easily
happen), then the video packet arrives first and fixes the shift at 1 ms,
so the audio packet will still have a negative pts after shifting. For
Matroska this means that the file violates the
[https://matroska.org/technical/specs/notes.html specifications].
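
To illustrate items 1, 3 and 4, here is a minimal Python sketch of the two
strategies. It is only a simplified model of the behaviour described above,
not the code in libavformat/mux.c: the names Packet, shift_at_write_time,
shift_up_front and video_pts_gaps are made up for this example, and the
timestamps are rounded, illustrative values in milliseconds (42 ms per
frame, 7 ms audio delay), loosely modelled on the mkvinfo listing.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Packet:
    track: str
    dts: int  # decode timestamp in ms
    pts: int  # presentation timestamp in ms


def shift_at_write_time(packets: List[Packet]) -> List[Packet]:
    """Simplified model of deciding the offset while writing: the first
    packet that arrives with a negative pts fixes the offset, and packets
    written before it stay unshifted."""
    offset: Optional[int] = None
    written = []
    for p in sorted(packets, key=lambda p: p.dts):  # interleave by dts
        if offset is None and p.pts < 0:
            offset = -p.pts
        shift = 0 if offset is None else offset
        written.append(Packet(p.track, p.dts + shift, p.pts + shift))
    return written


def shift_up_front(packets: List[Packet]) -> List[Packet]:
    """Alternative in the spirit of item 3: choose the offset before
    writing anything (this sketch simply inspects all packets)."""
    offset = max(0, -min(p.pts for p in packets))
    return [Packet(p.track, p.dts + offset, p.pts + offset)
            for p in sorted(packets, key=lambda p: p.dts)]


def video_pts_gaps(packets: List[Packet]) -> List[int]:
    """Distances between consecutive video frames in presentation order."""
    pts = sorted(p.pts for p in packets if p.track == "video")
    return [b - a for a, b in zip(pts, pts[1:])]


# Item 1: two reorder frames, audio starts at -7 ms due to encoder delay.
ITEM1 = [
    Packet("video", -84, 0), Packet("video", -42, 168),
    Packet("video", 0, 84), Packet("video", 42, 42),
    Packet("video", 84, 126),
    Packet("audio", -7, -7), Packet("audio", 13, 13),
    Packet("audio", 33, 33),
]

# Item 4: video starts at pts -1 ms (dts -85 ms), audio at pts=dts=-40 ms.
ITEM4 = [
    Packet("video", -85, -1), Packet("video", -43, 167),
    Packet("video", -1, 83),
    Packet("audio", -40, -40), Packet("audio", -20, -20),
    Packet("audio", 0, 0),
]

if __name__ == "__main__":
    print(video_pts_gaps(shift_at_write_time(ITEM1)))      # [49, 42, 42, 35]
    print(video_pts_gaps(shift_up_front(ITEM1)))           # [42, 42, 42, 42]
    print(min(p.pts for p in shift_at_write_time(ITEM4)))  # -39
    print(min(p.pts for p in shift_up_front(ITEM4)))       # 0
```

In this model, deciding at write time leaves the first two video packets
of the item 1 case unshifted, which produces the uneven frame distances,
and leaves a negative audio pts in the item 4 case. Choosing the offset up
front avoids both.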
--
Ticket URL: <https://trac.ffmpeg.org/ticket/4536#comment:4>
FFmpeg <https://ffmpeg.org>
FFmpeg issue tracker