[FFmpeg-user] Create an AAC stream matching the Core Media Audio packet format / priming etc?

Mark Burton mwjburton at gmail.com
Mon Jun 5 02:46:00 EEST 2017


On 26 May 2017, at 12:53, Christian Ebert <blacktrash at gmx.net> wrote:
> Yeah, filtering does not add an edit list (apparently the
> 'modern' solution):
> https://developer.apple.com/library/content/documentation/QuickTime/QTFF/QTFFChap2/qtff2.html#//apple_ref/doc/uid/TP40000939-CH204-25592
> and therefore QuickTime fails over to a hardcoded 'historical'
> default. That explains it.

Following up on what you’ve pointed out here, I looked at the encoded file with Atom Inspector after encoding with the audio filter below:

-af aresample=async=1:first_pts=0,asetpts=PTS-STARTPTS+2112
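For a sense of scale, the +2112 offset in that filter can be converted into time. This is only a rough sketch: it assumes the audio PTS is counted in samples (a time base of 1/sample rate), and it takes the 48000 Hz sample rate and 24 fps frame rate from the ffmpeg log further down in this message. The 1024 samples per AAC packet is the usual AAC-LC frame size and is likewise an assumption about this encode.

```python
from fractions import Fraction

PRIMING_SAMPLES = 2112   # offset used in the asetpts expression above
SAMPLE_RATE = 48000      # audio sample rate reported in the ffmpeg log
VIDEO_FPS = 24           # video frame rate reported in the ffmpeg log
AAC_FRAME = 1024         # assumed samples per AAC-LC packet

# Exact rational arithmetic avoids rounding surprises.
delay = Fraction(PRIMING_SAMPLES, SAMPLE_RATE)      # seconds
delay_ms = float(delay * 1000)                      # milliseconds
delay_video_frames = float(delay * VIDEO_FPS)       # video frames at 24 fps
delay_packets = PRIMING_SAMPLES / AAC_FRAME         # AAC packets

print(f"{delay_ms} ms, {delay_video_frames} video frames, {delay_packets} AAC packets")
```

So the offset amounts to a little over one video frame at 24 fps, which is consistent with a sync error that is small but noticeable.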

Atom Inspector shows that only three values need to change to produce a file which decodes in QuickTime with correct sync and correct duration. The AudioToolbox AAC encoder (aac_at) again gives the better result, but the same fix works with the native AAC encoder given one extra step: removing the first entry in the sound media Edit List.

The locations of the values which need to be altered are:

IN THE MOVIE HEADER
moov - Movie  //  mvhd - Movie Header  //  ‘duration’

IN THE SOUND MEDIA TRACK
trak - Track  //  tkhd - Track Header  //  ‘duration’
trak - Track  //  edts - Edits  //  elst - Edit List  //  ‘segment duration’


When these values are changed to match the video track duration, the file is ‘fixed’, so to speak: in QuickTime-based decoders it plays in perfect sync, ends exactly where it’s supposed to end, and has a duration which matches the source. It’s the first file with AAC audio I’ve had from ffmpeg which works in a QuickTime decoder. Obviously this is not a viable solution unless there is a way to do it programmatically; it’s cumbersome and, I’m sure, a bit of a hack.
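For anyone wanting to check these three fields without Atom Inspector, here is a minimal read-only sketch in Python. It is not how Atom Inspector or ffmpeg does it. It assumes version-0 atoms (32-bit fields) and the standard QuickTime layouts for mvhd, tkhd and elst, and it does not tell the sound track apart from the video track (that would need the hdlr atom under mdia). The function names walk_atoms and duration_fields are made up for this example.

```python
import struct

# Container atoms we need to descend into to reach mvhd, tkhd and elst.
CONTAINERS = {b"moov", b"trak", b"edts"}

def walk_atoms(buf, start=0, end=None, path=()):
    """Yield (path, payload_start, payload_end) for each atom found."""
    end = len(buf) if end is None else end
    pos = start
    while pos + 8 <= end:
        size, kind = struct.unpack_from(">I4s", buf, pos)
        header = 8
        if size == 1:                      # 64-bit extended size follows
            if pos + 16 > end:
                break
            size = struct.unpack_from(">Q", buf, pos + 8)[0]
            header = 16
        elif size == 0:                    # atom runs to the end of its parent
            size = end - pos
        if size < header or pos + size > end:
            break                          # malformed or truncated atom
        here = path + (kind.decode("latin-1"),)
        yield here, pos + header, pos + size
        if kind in CONTAINERS:
            yield from walk_atoms(buf, pos + header, pos + size, here)
        pos += size

def duration_fields(buf):
    """Return (label, byte_offset, value) for each duration-like field."""
    found = []
    for path, off, end in walk_atoms(buf):
        name = path[-1]
        if name not in ("mvhd", "tkhd", "elst") or off >= end or buf[off] != 0:
            continue                       # only version-0 layouts are handled
        label = "/".join(path)
        if name == "mvhd" and off + 20 <= end:
            # version/flags, creation, modification, timescale, then duration
            found.append((label, off + 16, struct.unpack_from(">I", buf, off + 16)[0]))
        elif name == "tkhd" and off + 24 <= end:
            # version/flags, creation, modification, track ID, reserved, duration
            found.append((label, off + 20, struct.unpack_from(">I", buf, off + 20)[0]))
        elif name == "elst" and off + 8 <= end:
            count = struct.unpack_from(">I", buf, off + 4)[0]
            for i in range(count):
                entry = off + 8 + 12 * i   # segment duration, media time, rate
                if entry + 12 > end:
                    break
                found.append((f"{label}[{i}]", entry,
                              struct.unpack_from(">I", buf, entry)[0]))
    return found
```

Calling duration_fields(open("output.mov", "rb").read()) should list the movie duration, each track duration and each edit list segment duration along with their byte offsets. All three are expressed in the movie timescale stored in mvhd, so they can be compared directly.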

However, it does appear to prove your point: the ffmpeg mov muxer currently produces a file which fails to use “a complete implementation using the sample group structures” for AAC tracks, so the decoder falls back to the historical technique for handling AAC timing and synchronisation, hence the sync issue.

If anyone is still monitoring this thread, please can you take another look at this? It appears there may be two ways to address this.

1 - Alter the mov muxer to create a file which correctly conforms to “a complete implementation using the sample group structures” as set out here:
https://developer.apple.com/library/content/documentation/QuickTime/QTFF/QTFFAppenG/QTFFAppenG.html#//apple_ref/doc/uid/TP40000939-CH2-SW11

2 - Alternatively, is there a way to programmatically mux the mov file to record a duration which matches the source duration exactly?
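Short of a muxer change, the manual edit could in principle be scripted as a post-processing step. The following is only a sketch of that idea, not an ffmpeg feature: it assumes the byte offset of each 32-bit (version-0) duration field is already known from an inspection step, and that the movie timescale has been read from mvhd. The helper names seconds_to_units and patch_u32 are hypothetical.

```python
import struct

def seconds_to_units(seconds, timescale):
    """Convert a duration in seconds to integer units of the given timescale."""
    return round(seconds * timescale)

def patch_u32(data, offset, value):
    """Return a copy of data with a big-endian unsigned 32-bit value at offset."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("value does not fit in 32 bits")
    if offset < 0 or offset + 4 > len(data):
        raise ValueError("offset outside the buffer")
    out = bytearray(data)          # work on a copy, leave the input untouched
    struct.pack_into(">I", out, offset, value)
    return bytes(out)
```

The same target value (the video track duration in movie timescale units) would be written at each of the three offsets, and the result saved to a new file rather than over the original. This is still a hack: it does not add the sample group structures described in option 1.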

Thanks
Mark


Command used in this example:
---------------
$ ffmpeg -v info -i input.mov -c:v libx264 -pix_fmt yuv420p -c:a aac_at -b:a 128k -af aresample=async=1:first_pts=0,asetpts=PTS-STARTPTS+2112 output.mov
ffmpeg version N-86344-gb5a0971-tessus Copyright (c) 2000-2017 the FFmpeg developers
  built with Apple LLVM version 8.0.0 (clang-800.0.42.1)
  configuration: --cc=/usr/bin/clang --prefix=/opt/ffmpeg --extra-version=tessus --enable-avisynth --enable-fontconfig --enable-gpl --enable-libass --enable-libbluray --enable-libfreetype --enable-libgsm --enable-libmodplug --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopus --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libtheora --enable-libvidstab --enable-libvo-amrwbenc --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libx264 --enable-libx265 --enable-libxavs --enable-libxvid --enable-libzmq --enable-libzvbi --enable-version3 --disable-ffplay --disable-indev=qtkit
  libavutil      55. 63.100 / 55. 63.100
  libavcodec     57. 96.101 / 57. 96.101
  libavformat    57. 72.101 / 57. 72.101
  libavdevice    57.  7.100 / 57.  7.100
  libavfilter     6. 91.100 /  6. 91.100
  libswscale      4.  7.101 /  4.  7.101
  libswresample   2.  8.100 /  2.  8.100
  libpostproc    54.  6.100 / 54.  6.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'input.mov':
  Metadata:
    major_brand     : qt  
    minor_version   : 537199360
    compatible_brands: qt  
    creation_time   : 2016-10-15T16:52:53.000000Z
    timecode        : 01:00:00:00
  Duration: 00:01:00.00, start: 0.000000, bitrate: 118696 kb/s
    Stream #0:0(eng): Video: dnxhd (DNXHD) (AVdn / 0x6E645641), yuv422p(tv, bt709/unknown/unknown), 1920x1080, 116391 kb/s, SAR 1:1 DAR 16:9, 24 fps, 24 tbr, 24k tbn, 24k tbc (default)
    Metadata:
      creation_time   : 2016-10-15T16:52:53.000000Z
      handler_name    : Apple Alias Data Handler
      encoder         : Avid DNxHD Codec
    Stream #0:1(eng): Audio: pcm_s24le (in24 / 0x34326E69), 48000 Hz, stereo, s32 (24 bit), 2304 kb/s (default)
    Metadata:
      creation_time   : 2016-10-15T16:52:53.000000Z
      handler_name    : Apple Alias Data Handler
    Stream #0:2(eng): Data: none (tmcd / 0x64636D74) (default)
    Metadata:
      creation_time   : 2016-10-15T16:52:55.000000Z
      handler_name    : Apple Alias Data Handler
      timecode        : 01:00:00:00
Stream mapping:
  Stream #0:0 -> #0:0 (dnxhd (native) -> h264 (libx264))
  Stream #0:1 -> #0:1 (pcm_s24le (native) -> aac (aac_at))
Press [q] to stop, [?] for help
[libx264 @ 0x7f86e4008c00] using SAR=1/1
[libx264 @ 0x7f86e4008c00] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX FMA3 AVX2 LZCNT BMI2
[libx264 @ 0x7f86e4008c00] profile High, level 4.0
[libx264 @ 0x7f86e4008c00] 264 - core 148 - H.264/MPEG-4 AVC codec - Copyleft 2003-2017 - http://www.videolan.org/x264.html - options: cabac=1 ref=3 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=7 psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=-2 threads=12 lookahead_threads=2 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=1 weightb=1 open_gop=0 weightp=2 keyint=250 keyint_min=24 scenecut=40 intra_refresh=0 rc_lookahead=40 rc=crf mbtree=1 crf=23.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00
Output #0, mov, to 'output.mov':
  Metadata:
    major_brand     : qt  
    minor_version   : 537199360
    compatible_brands: qt  
    timecode        : 01:00:00:00
    encoder         : Lavf57.72.101
    Stream #0:0(eng): Video: h264 (libx264) (avc1 / 0x31637661), yuv420p, 1920x1080 [SAR 1:1 DAR 16:9], q=-1--1, 24 fps, 12288 tbn, 24 tbc (default)
    Metadata:
      creation_time   : 2016-10-15T16:52:53.000000Z
      handler_name    : Apple Alias Data Handler
      encoder         : Lavc57.96.101 libx264
    Side data:
      cpb: bitrate max/min/avg: 0/0/0 buffer size: 0 vbv_delay: -1
    Stream #0:1(eng): Audio: aac (aac_at) (mp4a / 0x6134706D), 48000 Hz, stereo, s16, 128 kb/s (default)
    Metadata:
      creation_time   : 2016-10-15T16:52:53.000000Z
      handler_name    : Apple Alias Data Handler
      encoder         : Lavc57.96.101 aac_at
frame= 1440 fps=172 q=-1.0 Lsize=    1400kB time=00:01:00.07 bitrate= 191.0kbits/s speed=7.16x    
video:419kB audio:939kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 3.143227%
[libx264 @ 0x7f86e4008c00] frame I:6     Avg QP:11.83  size:  4424
[libx264 @ 0x7f86e4008c00] frame P:423   Avg QP:16.20  size:   437
[libx264 @ 0x7f86e4008c00] frame B:1011  Avg QP:22.64  size:   215
[libx264 @ 0x7f86e4008c00] consecutive B-frames:  0.8% 16.8%  0.2% 82.2%
[libx264 @ 0x7f86e4008c00] mb I  I16..4: 68.6% 30.3%  1.2%
[libx264 @ 0x7f86e4008c00] mb P  I16..4:  0.0%  0.0%  0.1%  P16..4:  0.2%  0.1%  0.0%  0.0%  0.0%    skip:99.6%
[libx264 @ 0x7f86e4008c00] mb B  I16..4:  0.0%  0.0%  0.0%  B16..8:  0.1%  0.1%  0.0%  direct: 0.0%  skip:99.7%  L0:63.6% L1:34.9% BI: 1.5%
[libx264 @ 0x7f86e4008c00] 8x8 transform intra:31.2% inter:10.4%
[libx264 @ 0x7f86e4008c00] coded y,uvDC,uvAC intra: 4.0% 0.1% 0.0% inter: 0.0% 0.0% 0.0%
[libx264 @ 0x7f86e4008c00] i16 v,h,dc,p: 96%  1%  2%  0%
[libx264 @ 0x7f86e4008c00] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 55%  3% 42%  0%  0%  0%  0%  0%  0%
[libx264 @ 0x7f86e4008c00] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 26% 19% 35%  5%  2%  3%  2%  6%  2%
[libx264 @ 0x7f86e4008c00] i8c dc,h,v,p: 100%  0%  0%  0%
[libx264 @ 0x7f86e4008c00] Weighted P-Frames: Y:0.0% UV:0.0%
[libx264 @ 0x7f86e4008c00] ref P L0: 69.7%  2.2% 16.0% 12.1%
[libx264 @ 0x7f86e4008c00] ref B L0: 64.5% 29.6%  5.9%
[libx264 @ 0x7f86e4008c00] ref B L1: 87.2% 12.8%
[libx264 @ 0x7f86e4008c00] kb/s:57.13

