[FFmpeg-user] Non-monotonous DTS in output stream

Pieter Venter pietventer at gmail.com
Thu Feb 11 00:25:24 EET 2021


I've found some new information, but I'm unsure about the top-posting
etiquette here. Please let me know if I'm doing this wrong.

On Tue, Feb 9, 2021 at 7:19 PM Pieter Venter <pietventer at gmail.com> wrote:

> Hello,
>
> I'm trying to process some AVI files that originally came from a Sony
> Handycam to mp4 and managed to get that to work.
> However, a couple of files are giving me trouble and I've spent days
> trying to figure it out.
>
> I'm new here but have done my homework as best I could:
> * I downloaded the latest ffmpeg build I could find.
> * Searched the forums for solutions/switches (-af
> aresample=async=1, -fflags +igndts, -fflags +sortdts).
> * Tried other tools to extract the audio.
> * Pasted the full commands and output for your reference.
> * Will try to not "top post". That seems to be a thing here.
>
> If I run the following command, audio is processed correctly up to about
> the 43s mark, then becomes slower than expected (i.e. voices are deep,
> audio in "slow mo" but video plays normal).
>
> The original file plays correctly with VLC and Video on Linux. It does not
> play correctly with mplayer (same slowed down audio issue past 43 seconds).
>
> ffmpeg -i input.avi -c:v libx264 -preset fast -crf 21 output.mp4
> ffmpeg version N-55863-g9f38fac053-static
> https://johnvansickle.com/ffmpeg/  Copyright (c) 2000-2021 the FFmpeg
> developers
>   built with gcc 8 (Debian 8.3.0-6)
>   configuration: --enable-gpl --enable-version3 --enable-static
> --disable-debug --disable-ffplay --disable-indev=sndio
> --disable-outdev=sndio --cc=gcc --enable-fontconfig --enable-frei0r
> --enable-gnutls --enable-gmp --enable-libgme --enable-gray --enable-libaom
> --enable-libfribidi --enable-libass --enable-libvmaf --enable-libfreetype
> --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb
> --enable-libopenjpeg --enable-librubberband --enable-libsoxr
> --enable-libspeex --enable-libsrt --enable-libvorbis --enable-libopus
> --enable-libtheora --enable-libvidstab --enable-libvo-amrwbenc
> --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265
> --enable-libxml2 --enable-libdav1d --enable-libxvid --enable-libzvbi
> --enable-libzimg
>   libavutil      56. 64.100 / 56. 64.100
>   libavcodec     58.119.100 / 58.119.100
>   libavformat    58. 65.101 / 58. 65.101
>   libavdevice    58. 11.103 / 58. 11.103
>   libavfilter     7.100.100 /  7.100.100
>   libswscale      5.  8.100 /  5.  8.100
>   libswresample   3.  8.100 /  3.  8.100
>   libpostproc    55.  8.100 / 55.  8.100
> [avi @ 0x620c340] Switching to NI mode, due to poor interleaving
> Input #0, avi, from 'input.avi':
>   Duration: 00:07:00.00, start: 0.000000, bitrate: 28806 kb/s
>     Stream #0:0: Video: dvvideo, yuv420p, 720x576 [SAR 16:15 DAR 4:3],
> 25000 kb/s, 25 fps, 25 tbr, 25 tbn, 25 tbc
>     Stream #0:1: Audio: pcm_s16le, 32000 Hz, stereo, s16, 1024 kb/s
>     Stream #0:2: Audio: pcm_s16le, 32000 Hz, stereo, s16, 1024 kb/s
> Stream mapping:
>   Stream #0:0 -> #0:0 (dvvideo (native) -> h264 (libx264))
>   Stream #0:1 -> #0:1 (pcm_s16le (native) -> aac (native))
> Press [q] to stop, [?] for help
> [libx264 @ 0x6233580] using SAR=16/15
> [libx264 @ 0x6233580] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2
> AVX FMA3 BMI2 AVX2
> [libx264 @ 0x6233580] profile High, level 3.0, 4:2:0, 8-bit
> [libx264 @ 0x6233580] 264 - core 161 r3040 35417dc - H.264/MPEG-4 AVC
> codec - Copyleft 2003-2021 - http://www.videolan.org/x264.html - options:
> cabac=1 ref=2 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=6 psy=1
> psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1
> cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=-2 threads=12
> lookahead_threads=2 sliced_threads=0 nr=0 decimate=1 interlaced=0
> bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1
> b_bias=0 direct=1 weightb=1 open_gop=0 weightp=1 keyint=250 keyint_min=25
> scenecut=40 intra_refresh=0 rc_lookahead=30 rc=crf mbtree=1 crf=21.0
> qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00
> Output #0, mp4, to 'output.mp4':
>   Metadata:
>     encoder         : Lavf58.65.101
>     Stream #0:0: Video: h264 (avc1 / 0x31637661), yuv420p(bottom coded
> first (swapped)), 720x576 [SAR 16:15 DAR 4:3], q=2-31, 25 fps, 12800 tbn
>     Metadata:
>       encoder         : Lavc58.119.100 libx264
>     Side data:
>       cpb: bitrate max/min/avg: 0/0/0 buffer size: 0 vbv_delay: N/A
>     Stream #0:1: Audio: aac (LC) (mp4a / 0x6134706D), 32000 Hz, stereo,
> fltp, 128 kb/s
>     Metadata:
>       encoder         : Lavc58.119.100 aac
> frame=10500 fps=116 q=-1.0 Lsize=  168876kB time=00:07:00.02
> bitrate=3293.7kbits/s speed=4.63x
> video:158898kB audio:9545kB subtitle:0kB other streams:0kB global
> headers:0kB muxing overhead: 0.257381%
> [libx264 @ 0x6233580] frame I:93    Avg QP:20.65  size: 54098
> [libx264 @ 0x6233580] frame P:3607  Avg QP:23.10  size: 22302
> [libx264 @ 0x6233580] frame B:6800  Avg QP:24.65  size: 11358
> [libx264 @ 0x6233580] consecutive B-frames: 13.4%  0.6%  0.7% 85.3%
> [libx264 @ 0x6233580] mb I  I16..4:  1.2% 98.2%  0.6%
> [libx264 @ 0x6233580] mb P  I16..4:  0.5% 32.9%  0.4%  P16..4: 35.4% 17.7%
>  9.8%  0.0%  0.0%    skip: 3.2%
> [libx264 @ 0x6233580] mb B  I16..4:  2.2% 25.7%  0.2%  B16..8: 22.8%  8.7%
>  0.7%  direct:28.9%  skip:10.8%  L0:40.2% L1:32.6% BI:27.2%
> [libx264 @ 0x6233580] 8x8 transform intra:93.9% inter:82.3%
> [libx264 @ 0x6233580] coded y,uvDC,uvAC intra: 82.4% 88.7% 30.9% inter:
> 40.1% 65.9% 1.2%
> [libx264 @ 0x6233580] i16 v,h,dc,p: 21% 19% 34% 26%
> [libx264 @ 0x6233580] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 15% 14% 40%  5%  5%
>  5%  5%  6%  6%
> [libx264 @ 0x6233580] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu:  9% 54% 15%  4%  4%
>  3%  4%  3%  4%
> [libx264 @ 0x6233580] i8c dc,h,v,p: 59% 19% 19%  3%
> [libx264 @ 0x6233580] Weighted P-Frames: Y:8.2% UV:1.4%
> [libx264 @ 0x6233580] ref P L0: 56.9% 43.1%
> [libx264 @ 0x6233580] ref B L0: 74.0% 26.0%
> [libx264 @ 0x6233580] ref B L1: 95.3%  4.7%
> [libx264 @ 0x6233580] kb/s:3099.25
> [aac @ 0x6234fc0] Qavg: 189.772
>
> To narrow down the problem area, I encoded only the audio, and only up to
> the 43s mark.
> If I run this for the whole file, there are hundreds, if not thousands, of
> entries like "Non-monotonous DTS..."
>
> ffmpeg -t 00:00:43 -i input.avi -map 0:a:0 -c:a aac output.avi
> ffmpeg version N-55863-g9f38fac053-static
> https://johnvansickle.com/ffmpeg/  Copyright (c) 2000-2021 the FFmpeg
> developers
>   built with gcc 8 (Debian 8.3.0-6)
>   configuration: --enable-gpl --enable-version3 --enable-static
> --disable-debug --disable-ffplay --disable-indev=sndio
> --disable-outdev=sndio --cc=gcc --enable-fontconfig --enable-frei0r
> --enable-gnutls --enable-gmp --enable-libgme --enable-gray --enable-libaom
> --enable-libfribidi --enable-libass --enable-libvmaf --enable-libfreetype
> --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb
> --enable-libopenjpeg --enable-librubberband --enable-libsoxr
> --enable-libspeex --enable-libsrt --enable-libvorbis --enable-libopus
> --enable-libtheora --enable-libvidstab --enable-libvo-amrwbenc
> --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265
> --enable-libxml2 --enable-libdav1d --enable-libxvid --enable-libzvbi
> --enable-libzimg
>   libavutil      56. 64.100 / 56. 64.100
>   libavcodec     58.119.100 / 58.119.100
>   libavformat    58. 65.101 / 58. 65.101
>   libavdevice    58. 11.103 / 58. 11.103
>   libavfilter     7.100.100 /  7.100.100
>   libswscale      5.  8.100 /  5.  8.100
>   libswresample   3.  8.100 /  3.  8.100
>   libpostproc    55.  8.100 / 55.  8.100
> [avi @ 0x5b922c0] Switching to NI mode, due to poor interleaving
> Input #0, avi, from 'input.avi':
>   Duration: 00:07:00.00, start: 0.000000, bitrate: 28806 kb/s
>     Stream #0:0: Video: dvvideo, yuv420p, 720x576 [SAR 16:15 DAR 4:3],
> 25000 kb/s, 25 fps, 25 tbr, 25 tbn, 25 tbc
>     Stream #0:1: Audio: pcm_s16le, 32000 Hz, stereo, s16, 1024 kb/s
>     Stream #0:2: Audio: pcm_s16le, 32000 Hz, stereo, s16, 1024 kb/s
> File 'output.avi' already exists. Overwrite? [y/N] y
> Stream mapping:
>   Stream #0:1 -> #0:0 (pcm_s16le (native) -> aac (native))
> Press [q] to stop, [?] for help
> Output #0, avi, to 'output.avi':
>   Metadata:
>     ISFT            : Lavf58.65.101
>     Stream #0:0: Audio: aac (LC) ([255][0][0][0] / 0x00FF), 32000 Hz,
> stereo, fltp, 128 kb/s
>     Metadata:
>       encoder         : Lavc58.119.100 aac
> [avi @ 0x5bb8040] Non-monotonous DTS in output stream 0:0; previous: 1341,
> current: 1339; changing to 1342. This may result in incorrect timestamps in
> the output file.
> [avi @ 0x5bb8040] Non-monotonous DTS in output stream 0:0; previous: 1342,
> current: 1340; changing to 1343. This may result in incorrect timestamps in
> the output file.
> [avi @ 0x5bb8040] Non-monotonous DTS in output stream 0:0; previous: 1343,
> current: 1340; changing to 1344. This may result in incorrect timestamps in
> the output file.
> [avi @ 0x5bb8040] Non-monotonous DTS in output stream 0:0; previous: 1344,
> current: 1341; changing to 1345. This may result in incorrect timestamps in
> the output file.
> [avi @ 0x5bb8040] Non-monotonous DTS in output stream 0:0; previous: 1345,
> current: 1342; changing to 1346. This may result in incorrect timestamps in
> the output file.
> [avi @ 0x5bb8040] Non-monotonous DTS in output stream 0:0; previous: 1346,
> current: 1342; changing to 1347. This may result in incorrect timestamps in
> the output file.
> [avi @ 0x5bb8040] Non-monotonous DTS in output stream 0:0; previous: 1347,
> current: 1343; changing to 1348. This may result in incorrect timestamps in
> the output file.
> [avi @ 0x5bb8040] Non-monotonous DTS in output stream 0:0; previous: 1348,
> current: 1343; changing to 1349. This may result in incorrect timestamps in
> the output file.
> size=     714kB time=00:00:43.16 bitrate= 135.6kbits/s speed=96.2x
> video:0kB audio:676kB subtitle:0kB other streams:0kB global headers:0kB
> muxing overhead: 5.592729%
> [aac @ 0x5bb9980] Qavg: 225.541
>
> The only workaround I have found so far is to:
> a. extract all the audio using ffmpeg (total audio length is 10m09s)
> b. use Audacity to carefully select the audio from 0m43s to the end and
> "shrink" it so that the total comes to 7m00s (the original file length)
> c. create a new video file from the processed audio stream
>
> Is there a way to troubleshoot, ignore, correct the Non-monotonous DTS
> error?
> Thanks for your help.
>
>
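A quick sanity check on the numbers in that workaround, as a rough sketch in
Python. It only uses the durations quoted above (7m00s of video, 10m09s of
extracted audio, trouble starting at about 43s). The variable names are made
up for illustration, and the 43s figure is approximate, so the result is
approximate too.

```python
# Durations taken from the quoted message, in seconds.
video_total = 7 * 60           # 7m00s, the original file length
audio_total = 10 * 60 + 9      # 10m09s, length of the extracted audio
good_until = 43                # audio sounds correct up to about here

# Portion of each stream that lies after the point where things go wrong.
audio_tail = audio_total - good_until   # slowed-down audio, in seconds
video_tail = video_total - good_until   # video it has to cover, in seconds

# Factor by which the audio tail must be sped up to fit the video.
speedup = audio_tail / video_tail
print(f"audio tail {audio_tail}s, video tail {video_tail}s, "
      f"speed-up factor {speedup:.3f}")
```

The factor comes out very close to 1.5, which is also the ratio 48000/32000.
That may be a coincidence, given how rough the 43s figure is, but it seems
worth keeping in mind.
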
Using the -debug_ts switch, I can now see that the second audio stream
(ist_index:2) stops at 42.52

muxer <- type:audio pkt_pts:1060 pkt_pts_time:42.4 pkt_dts:1060
pkt_dts_time:42.4 size:5120
demuxer -> ist_index:1 type:audio next_dts:42400000 next_dts_time:42.4
next_pts:42400000 next_pts_time:42.4 pkt_pts:1061 pkt_pts_time:42.44
pkt_dts:1061 pkt_dts_time:42.44 off:0 off_time:0
demuxer+ffmpeg -> ist_index:1 type:audio pkt_pts:1061 pkt_pts_time:42.44
pkt_dts:1061 pkt_dts_time:42.44 off:0 off_time:0
muxer <- type:audio pkt_pts:1061 pkt_pts_time:42.44 pkt_dts:1061
pkt_dts_time:42.44 size:5120
demuxer -> ist_index:2 type:audio next_dts:42400000 next_dts_time:42.4
next_pts:42400000 next_pts_time:42.4 pkt_pts:1061 pkt_pts_time:42.44
pkt_dts:1061 pkt_dts_time:42.44 off:0 off_time:0
demuxer+ffmpeg -> ist_index:2 type:audio pkt_pts:1061 pkt_pts_time:42.44
pkt_dts:1061 pkt_dts_time:42.44 off:0 off_time:0
muxer <- type:audio pkt_pts:1061 pkt_pts_time:42.44 pkt_dts:1061
pkt_dts_time:42.44 size:5120
demuxer -> ist_index:1 type:audio next_dts:42440000 next_dts_time:42.44
next_pts:42440000 next_pts_time:42.44 pkt_pts:1062 pkt_pts_time:42.48
pkt_dts:1062 pkt_dts_time:42.48 off:0 off_time:0
demuxer+ffmpeg -> ist_index:1 type:audio pkt_pts:1062 pkt_pts_time:42.48
pkt_dts:1062 pkt_dts_time:42.48 off:0 off_time:0
muxer <- type:audio pkt_pts:1062 pkt_pts_time:42.48 pkt_dts:1062
pkt_dts_time:42.48 size:5120
demuxer -> ist_index:2 type:audio next_dts:42440000 next_dts_time:42.44
next_pts:42440000 next_pts_time:42.44 pkt_pts:1062 pkt_pts_time:42.48
pkt_dts:1062 pkt_dts_time:42.48 off:0 off_time:0
demuxer+ffmpeg -> ist_index:2 type:audio pkt_pts:1062 pkt_pts_time:42.48
pkt_dts:1062 pkt_dts_time:42.48 off:0 off_time:0
muxer <- type:audio pkt_pts:1062 pkt_pts_time:42.48 pkt_dts:1062
pkt_dts_time:42.48 size:5120
demuxer -> ist_index:1 type:audio next_dts:42480000 next_dts_time:42.48
next_pts:42480000 next_pts_time:42.48 pkt_pts:1063 pkt_pts_time:42.52
pkt_dts:1063 pkt_dts_time:42.52 off:0 off_time:0
demuxer+ffmpeg -> ist_index:1 type:audio pkt_pts:1063 pkt_pts_time:42.52
pkt_dts:1063 pkt_dts_time:42.52 off:0 off_time:0
muxer <- type:audio pkt_pts:1063 pkt_pts_time:42.52 pkt_dts:1063
pkt_dts_time:42.52 size:5120
demuxer -> ist_index:2 type:audio next_dts:42480000 next_dts_time:42.48
next_pts:42480000 next_pts_time:42.48 pkt_pts:1063 pkt_pts_time:42.52
pkt_dts:1063 pkt_dts_time:42.52 off:0 off_time:0
demuxer+ffmpeg -> ist_index:2 type:audio pkt_pts:1063 pkt_pts_time:42.52
pkt_dts:1063 pkt_dts_time:42.52 off:0 off_time:0

From this point forward, there are no more packets for the second audio
stream

muxer <- type:audio pkt_pts:1063 pkt_pts_time:42.52 pkt_dts:1063
pkt_dts_time:42.52 size:5120
demuxer -> ist_index:1 type:audio next_dts:42520000 next_dts_time:42.52
next_pts:42520000 next_pts_time:42.52 pkt_pts:1064 pkt_pts_time:42.56
pkt_dts:1064 pkt_dts_time:42.56 off:0 off_time:0
demuxer+ffmpeg -> ist_index:1 type:audio pkt_pts:1064 pkt_pts_time:42.56
pkt_dts:1064 pkt_dts_time:42.56 off:0 off_time:0
muxer <- type:audio pkt_pts:1064 pkt_pts_time:42.56 pkt_dts:1064
pkt_dts_time:42.56 size:7680
demuxer -> ist_index:1 type:audio next_dts:42560000 next_dts_time:42.56
next_pts:42560000 next_pts_time:42.56 pkt_pts:1065 pkt_pts_time:42.6
pkt_dts:1065 pkt_dts_time:42.6 off:0 off_time:0
demuxer+ffmpeg -> ist_index:1 type:audio pkt_pts:1065 pkt_pts_time:42.6
pkt_dts:1065 pkt_dts_time:42.6 off:0 off_time:0
muxer <- type:audio pkt_pts:1065 pkt_pts_time:42.6 pkt_dts:1065
pkt_dts_time:42.6 size:7680
demuxer -> ist_index:1 type:audio next_dts:42600000 next_dts_time:42.6
next_pts:42600000 next_pts_time:42.6 pkt_pts:1066 pkt_pts_time:42.64
pkt_dts:1066 pkt_dts_time:42.64 off:0 off_time:0
demuxer+ffmpeg -> ist_index:1 type:audio pkt_pts:1066 pkt_pts_time:42.64
pkt_dts:1066 pkt_dts_time:42.64 off:0 off_time:0
muxer <- type:audio pkt_pts:1066 pkt_pts_time:42.64 pkt_dts:1066
pkt_dts_time:42.64 size:7680
demuxer -> ist_index:1 type:audio next_dts:42640000 next_dts_time:42.64
next_pts:42640000 next_pts_time:42.64 pkt_pts:1067 pkt_pts_time:42.68
pkt_dts:1067 pkt_dts_time:42.68 off:0 off_time:0
demuxer+ffmpeg -> ist_index:1 type:audio pkt_pts:1067 pkt_pts_time:42.68
pkt_dts:1067 pkt_dts_time:42.68 off:0 off_time:0
muxer <- type:audio pkt_pts:1067 pkt_pts_time:42.68 pkt_dts:1067
pkt_dts_time:42.68 size:7680
demuxer -> ist_index:1 type:audio next_dts:42680000 next_dts_time:42.68
next_pts:42680000 next_pts_time:42.68 pkt_pts:1068 pkt_pts_time:42.72
pkt_dts:1068 pkt_dts_time:42.72 off:0 off_time:0
demuxer+ffmpeg -> ist_index:1 type:audio pkt_pts:1068 pkt_pts_time:42.72
pkt_dts:1068 pkt_dts_time:42.72 off:0 off_time:0

I've tried just excluding that audio stream, but the remaining audio is
still weird

ffmpeg -i input.avi -map 0 -map -0:a:1 -c copy output.avi
ffmpeg version N-55863-g9f38fac053-static https://johnvansickle.com/ffmpeg/
 Copyright (c) 2000-2021 the FFmpeg developers
  built with gcc 8 (Debian 8.3.0-6)
  configuration: --enable-gpl --enable-version3 --enable-static
--disable-debug --disable-ffplay --disable-indev=sndio
--disable-outdev=sndio --cc=gcc --enable-fontconfig --enable-frei0r
--enable-gnutls --enable-gmp --enable-libgme --enable-gray --enable-libaom
--enable-libfribidi --enable-libass --enable-libvmaf --enable-libfreetype
--enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb
--enable-libopenjpeg --enable-librubberband --enable-libsoxr
--enable-libspeex --enable-libsrt --enable-libvorbis --enable-libopus
--enable-libtheora --enable-libvidstab --enable-libvo-amrwbenc
--enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265
--enable-libxml2 --enable-libdav1d --enable-libxvid --enable-libzvbi
--enable-libzimg
  libavutil      56. 64.100 / 56. 64.100
  libavcodec     58.119.100 / 58.119.100
  libavformat    58. 65.101 / 58. 65.101
  libavdevice    58. 11.103 / 58. 11.103
  libavfilter     7.100.100 /  7.100.100
  libswscale      5.  8.100 /  5.  8.100
  libswresample   3.  8.100 /  3.  8.100
  libpostproc    55.  8.100 / 55.  8.100
[avi @ 0x599e300] Switching to NI mode, due to poor interleaving
Input #0, avi, from 'input.avi':
  Duration: 00:07:00.00, start: 0.000000, bitrate: 28806 kb/s
    Stream #0:0: Video: dvvideo, yuv420p, 720x576 [SAR 16:15 DAR 4:3],
25000 kb/s, 25 fps, 25 tbr, 25 tbn, 25 tbc
    Stream #0:1: Audio: pcm_s16le, 32000 Hz, stereo, s16, 1024 kb/s
    Stream #0:2: Audio: pcm_s16le, 32000 Hz, stereo, s16, 1024 kb/s
File 'output.avi' already exists. Overwrite? [y/N] y
Output #0, avi, to 'output.avi':
  Metadata:
    ISFT            : Lavf58.65.101
    Stream #0:0: Video: dvvideo (dvsd / 0x64737664), yuv420p, 720x576 [SAR
16:15 DAR 4:3], q=2-31, 25000 kb/s, 25 fps, 25 tbr, 25 tbn, 25 tbc
    Stream #0:1: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 32000 Hz,
stereo, s16, 1024 kb/s
Stream mapping:
  Stream #0:0 -> #0:0 (copy)
  Stream #0:1 -> #0:1 (copy)
Press [q] to stop, [?] for help
frame=10500 fps=8868 q=-1.0 Lsize= 1553212kB time=00:07:00.00
bitrate=30295.0kbits/s speed= 355x
video:1476562kB audio:76090kB subtitle:0kB other streams:0kB global
headers:0kB muxing overhead: 0.036054%

Any advice would be welcome, thanks.

