[FFmpeg-user] fMP4: just bad, or the absolute worst?

Anton Kapela tkapela at gmail.com
Fri Aug 2 17:48:40 EEST 2024


To my ffmpeg brethren:

I'm trying to share some fun content with the world, specifically an
off-air feed from France 2 (which has the Olympics, broadcast in UHD, 50 fps,
Main 10, full-on BT.2020 with PQ HDR, at ~18 Mbit/s composite TS rate), and
am absolutely confounded. Hoping that someone here knows how fMP4 a/v
interleaving and sync are supposed to work.

Find the live fMP4 HLS here:
https://endpnt.com/hls/france2uhd/france2uhd.m3u8

When playing this HLS on iOS or macOS Safari, we get frame
re-ordering (i.e. "jumps") at what appear to be approximately fMP4 chunk
boundaries, and audio is several seconds behind, even though both audio and
video decode and start playing quickly when the stream starts. Playing the
same in VLC, we start decoding frames immediately after downloading the first
few chunks, but no audio plays until several seconds have elapsed, so no
a/v sync here either. VLC does not, however, show the frame reordering/jumps
between fragments.

Using MPV and ffplay, *things work* - a/v is sync'd, and doesn't drift when
playing the stream for arbitrarily long durations.

If I switch from fMP4 chunks to good ol' TS chunks, everything but Safari
plays just fine - a/v sync intact, etc. Of course, we know AAPL will never
support HEVC in .TS chunks, so I'd like to use fMP4.

In examining the mp4 init and the other mp4 segments produced by ffmpeg
(tried 4.3.7-old, 6.0.1, 7.0, 7.1, and git-current via JVS's "static
ffmpeg" builds on several different linux distributions), everything looks
kosher. We get the right atoms where expected (AFAICT), we are told about
frame durations in the codec, and the decoder can plainly see how many
samples are in each EAC3 packet as muxed in the mp4 fragments.
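
As a sanity check on the numbers involved, here is a minimal sketch of the
per-segment arithmetic. The 12800 timescale and 50 fps come from the output
dump at the end of this note, and 4 s is the -hls_time used. The 1536 samples
per E-AC-3 frame is an assumption (six 256-sample blocks, which is common for
broadcast but not guaranteed), and the function names are made up for
illustration.

```shell
#!/bin/sh
# Ticks per video frame in the output track timescale (timescale, fps).
ticks_per_frame() { echo $(( $1 / $2 )); }

# Video frames in one segment (segment seconds, fps).
frames_per_segment() { echo $(( $1 * $2 )); }

# Whole audio frames in one segment, plus leftover samples
# (segment seconds, sample rate, samples per audio frame).
audio_frames_per_segment() {
  total=$(( $1 * $2 ))
  echo "$(( total / $3 )) rem $(( total % $3 ))"
}

ticks_per_frame 12800 50               # 256
frames_per_segment 4 50                # 200
# ASSUMPTION: 1536 samples per E-AC-3 frame.
audio_frames_per_segment 4 48000 1536  # 125 rem 0
```

Under that assumption a 4 s segment holds a whole number of audio frames, so
segment boundaries alone should not force an audio/video offset. With a
different audio frame size the remainder may be non-zero.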

Things I've tried in adjusting the input/output of ffmpeg, none of which
resolved or changed this "a/v desync on some players but not all" issue:

- -copyts, -reset_timestamps 1, -use_wallclock_as_timestamps, -async,
  -fps_mode drop, -r 50, -re, etc.
- transcoding audio to AAC, regular AC-3, and MP3 (also at 44.1 kHz vs. the
  broadcast source's 48 kHz)
- different segment lengths
- -ss n (which doesn't do anything, as the stream is not seekable)
- -fflags +nobuffer and various exhaustive combinations of -experimental low
  latency options
- -max_muxing_queue_size 9999 -max_interleave_delta 0 and other muxer tweaks
- manually rewriting audio timestamps with setpts into the future (seems to
  not carry over into fMP4, does seem to work in TS, but we don't want that)
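
One way to check whether any of these tweaks actually lands in the fragments
is to look at the first few packet timestamps of a segment for each track.
The following is only a sketch: dump_first_packets is a hypothetical helper,
the file names are the ones ffmpeg reports in the log further down, and it
assumes ffprobe is installed and the segment has not yet been removed by
delete_segments. A .m4s fragment is not readable on its own, so it is
concatenated with init.mp4 first.

```shell
#!/bin/sh
# Hypothetical helper: print the first packets of the first video and audio
# track in one fMP4 HLS segment.
# Output columns per line: pts_time, dts_time, duration_time.
dump_first_packets() {
  init=$1; seg=$2; count=${3:-5}
  tmp=$(mktemp) || return 1
  # init segment + media fragment = a probe-able fragmented MP4
  cat "$init" "$seg" > "$tmp"
  for sel in v:0 a:0; do
    echo "== $sel =="
    ffprobe -v error -select_streams "$sel" \
      -show_entries packet=pts_time,dts_time,duration_time \
      -of csv=p=0 "$tmp" | head -n "$count"
  done
  rm -f "$tmp"
}

# Run only when ffprobe and the files named in the log are present.
if command -v ffprobe >/dev/null 2>&1 \
   && [ -f init.mp4 ] && [ -f france2uhd0.m4s ]; then
  dump_first_packets init.mp4 france2uhd0.m4s
fi
```

If the first audio pts_time in a segment differs from the first video
pts_time by several seconds, the offset is already in the muxed fragments
rather than being introduced by the player.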

Here's what's being run, and what one could use to replicate this problem
on their system (assuming you have access and can sustain 18 Mbit/s from my
.ts relay server, running icecast-kh):

ffmpeg -icy 0 -i http://endpnt.com:8000/france-2-uhd.ts -vcodec copy
-acodec copy -tag:v hvc1 -b:v 18M -map 0:i:3021 -map 0:i:3032
-hls_segment_type fmp4 -hls_init_time 4 -hls_list_size 10 -hls_time 4
-hls_flags iframes_only+independent_segments+delete_segments france2uhd.m3u8
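
For anyone matching the -map arguments against the input dump further down:
the 0:i:N specifier selects by stream id in decimal, while ffmpeg prints the
ids in hex (e.g. [0xbcd]). A small sketch of the conversion, where hex_to_dec
is just an illustrative name:

```shell
#!/bin/sh
# Convert a hex stream id as printed by ffmpeg (e.g. 0xbcd) to the decimal
# form used in a -map 0:i:N stream specifier.
hex_to_dec() { printf '%d\n' "$1"; }

for id in 0xbcd 0xbd8; do
  printf '%s -> %s\n' "$id" "$(hex_to_dec "$id")"
done
# 0xbcd -> 3021   (the HEVC video stream)
# 0xbd8 -> 3032   (the first E-AC-3 audio stream)
```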

There are no useful "errors" (other than the normal AC3 implied vs. explicit
sample lengths) or other CLI output from ffmpeg, as there's nothing to
report. I've included the last bit of input detection, fmp4 stuff firing
up, and then a little runtime output after this note in case anyone is
curious.

Clues welcome & appreciated.

Best,

-Tk

[hevc @ 0x55f6ac243700] Error parsing NAL unit #2.
[hevc @ 0x55f6ac243700] PPS id out of range: 0
    Last message repeated 1 times
[hevc @ 0x55f6ac243700] Error parsing NAL unit #2.
[hevc @ 0x55f6ac243700] PPS id out of range: 0
    Last message repeated 1 times
[hevc @ 0x55f6ac243700] Error parsing NAL unit #2.
[hevc @ 0x55f6ac243700] PPS id out of range: 0
    Last message repeated 1 times
[hevc @ 0x55f6ac243700] Error parsing NAL unit #2.
[hevc @ 0x55f6ac243700] PPS id out of range: 0
    Last message repeated 1 times
[hevc @ 0x55f6ac243700] Error parsing NAL unit #2.
Input #0, mpegts, from 'http://endpnt.com:8000/france-2-uhd.ts':
  Duration: N/A, start: 41795.243156, bitrate: N/A
  Program 2305
    Stream #0:0[0xbcd]: Video: hevc (Main 10) ([36][0][0][0] / 0x0024),
yuv420p10le(tv, bt2020nc/bt2020/smpte2084), 3840x2160 [SAR 1:1 DAR 16:9],
50 fps, 50 tbr, 90k tbn, 50 tbc
    Stream #0:1[0xbce]: Data: bin_data (AC-4 / 0x342D4341)
    Stream #0:2[0xbd8](fre): Audio: eac3 ([6][0][0][0] / 0x0006), 48000 Hz,
stereo, fltp, 128 kb/s
    Stream #0:3[0xbd9](qaa): Audio: eac3 ([6][0][0][0] / 0x0006), 48000 Hz,
stereo, fltp, 128 kb/s
    Stream #0:4[0xbda](fre): Audio: eac3 ([6][0][0][0] / 0x0006), 48000 Hz,
stereo, fltp, 128 kb/s (visual impaired) (descriptions)
    Stream #0:5[0xbdb](fre): Subtitle: dvb_subtitle ([6][0][0][0] / 0x0006)
(hearing impaired)
    Stream #0:6[0xbdc](fre): Subtitle: dvb_subtitle ([6][0][0][0] / 0x0006)
[hls @ 0x55f6ac2cae00] Opening 'init.mp4' for writing
[mp4 @ 0x55f6ac227a80] track 1: codec frame size is not set
Output #0, hls, to 'france2uhd.m3u8':
  Metadata:
    encoder         : Lavf58.45.100
    Stream #0:0: Video: hevc (Main 10) (hvc1 / 0x31637668), yuv420p10le(tv,
bt2020nc/bt2020/smpte2084), 3840x2160 [SAR 1:1 DAR 16:9], q=2-31, 18000
kb/s, 50 fps, 50 tbr, 12800 tbn, 50 tbc
    Stream #0:1(fre): Audio: eac3 ([6][0][0][0] / 0x0006), 48000 Hz,
stereo, fltp, 128 kb/s
Stream mapping:
  Stream #0:0 -> #0:0 (copy)
  Stream #0:2 -> #0:1 (copy)
Press [q] to stop, [?] for help
[hls @ 0x55f6ac2cae00] Opening 'france2uhd0.m4s' for writing=N/A
speed=2.11x
[hls @ 0x55f6ac2cae00] Opening 'france2uhd.m3u8.tmp' for writing
[hls @ 0x55f6ac2cae00] Opening 'france2uhd1.m4s' for writing=N/A
speed=1.55x
[hls @ 0x55f6ac2cae00] Opening 'france2uhd.m3u8.tmp' for writing
[hls @ 0x55f6ac2cae00] Opening 'france2uhd2.m4s' for writing=N/A
speed=1.34x
[hls @ 0x55f6ac2cae00] Opening 'france2uhd.m3u8.tmp' for writing
frame=  737 fps= 55 q=-1.0 size=N/A time=00:00:16.70 bitrate=N/A
speed=1.25x

