[FFmpeg-user] Create an AAC stream matching the Core Media Audio packet format / priming etc?

Mark Burton mwjburton at gmail.com
Fri Apr 14 18:57:06 EEST 2017


I appreciate this is a tricky area and that different encoders appear to create AAC streams differently with regard to padding, remaining samples etc. I won’t pretend to fully understand all the factors, but I would like to ask a genuine question which comes purely from wanting to create a file for my working environment, an environment dominated by Quicktime 7 and X playback / decoding tools. I’m no great fan of Quicktime and appreciate it’s not well loved here either. In my industry it is still very much the de facto playback engine though, so if I’m able to tailor a file for this decoder, it would be an enormous help.

Let me also say, I am not accusing ffmpeg of having an issue. I have read a number of ‘bug’ reports about ffmpeg and AAC priming, and about a sync discrepancy in the resulting encode when it is played back in certain decoders. I happen to see the exact same issue, but from some of the developers’ replies, I accept their position is that ffmpeg is doing it the right way and it’s the decoders that are at fault.

With that said, I’d like to approach this question purely from the point of view of finding out whether there is a way to tweak the command to change the way the AAC stream is created, so that an mp4 or mov file produced with the native aac encoder decodes in sync in Quicktime 7 or X. Currently an encoded file plays back 1 frame out of sync (audio is early by approx. 1 frame). In VLC it’s about 1/2 a frame out of sync.

The source file is a .mov: DNx115 24p (true 24p, not 23.976) with PCM 24-bit 48 kHz audio, which is in sync. This is film material where sync is crucial and always expected.

Here is the basic command to reproduce. I have attached the uncut loglevel 99 console output for this command:
ffmpeg -i SyncTest24p.mov -c:v libx264 -pix_fmt yuv420p -movflags faststart -c:a aac -b:a 128k ffmpeg.mp4

ffmpeg -i SyncTest24p.mov -c:v libx264 -pix_fmt yuv420p -movflags faststart -c:a aac -b:a 128k ffmpeg.mp4
ffmpeg version N-85343-gd0a3143-tessus Copyright (c) 2000-2017 the FFmpeg developers
  built with Apple LLVM version 8.0.0 (clang-800.0.42.1)
  configuration: --cc=/usr/bin/clang --prefix=/opt/ffmpeg --extra-version=tessus --enable-avisynth --enable-fontconfig --enable-gpl --enable-libass --enable-libbluray --enable-libfreetype --enable-libgsm --enable-libmodplug --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopus --enable-libschroedinger --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libtheora --enable-libvidstab --enable-libvo-amrwbenc --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libx264 --enable-libx265 --enable-libxavs --enable-libxvid --enable-libzmq --enable-libzvbi --enable-version3 --disable-ffplay --disable-indev=qtkit
  libavutil      55. 60.100 / 55. 60.100
  libavcodec     57. 92.100 / 57. 92.100
  libavformat    57. 72.100 / 57. 72.100
  libavdevice    57.  7.100 / 57.  7.100
  libavfilter     6. 84.100 /  6. 84.100
  libswscale      4.  7.100 /  4.  7.100
  libswresample   2.  8.100 /  2.  8.100
  libpostproc    54.  6.100 / 54.  6.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'SyncTest24p.mov':
  Metadata:
    major_brand     : qt
    minor_version   : 537199360
    compatible_brands: qt
    creation_time   : 2017-04-14T13:37:46.000000Z
    timecode        : 01:00:00:00
  Duration: 00:00:02.00, start: 0.000000, bitrate: 118705 kb/s
    Stream #0:0(eng): Video: dnxhd (DNXHD) (AVdn / 0x6E645641), yuv422p(tv, bt709/unknown/unknown), 1920x1080, 116391 kb/s, SAR 1:1 DAR 16:9, 24 fps, 24 tbr, 24k tbn, 24k tbc (default)
    Metadata:
      creation_time   : 2017-04-14T13:37:46.000000Z
      handler_name    : Apple Alias Data Handler
      encoder         : Avid DNxHD Codec
    Stream #0:1(eng): Audio: pcm_s24le (in24 / 0x34326E69), 48000 Hz, stereo, s32 (24 bit), 2304 kb/s (default)
    Metadata:
      creation_time   : 2017-04-14T13:37:46.000000Z
      handler_name    : Apple Alias Data Handler
    Stream #0:2(eng): Data: none (tmcd / 0x64636D74), 0 kb/s (default)
    Metadata:
      creation_time   : 2017-04-14T13:37:46.000000Z
      handler_name    : Apple Alias Data Handler
      timecode        : 01:00:00:00
Stream mapping:
  Stream #0:0 -> #0:0 (dnxhd (native) -> h264 (libx264))
  Stream #0:1 -> #0:1 (pcm_s24le (native) -> aac (native))
Press [q] to stop, [?] for help
[libx264 @ 0x7ff1ae809200] using SAR=1/1
[libx264 @ 0x7ff1ae809200] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX FMA3 AVX2 LZCNT BMI2
[libx264 @ 0x7ff1ae809200] profile High, level 4.0
[libx264 @ 0x7ff1ae809200] 264 - core 148 - H.264/MPEG-4 AVC codec - Copyleft 2003-2016 - http://www.videolan.org/x264.html - options: cabac=1 ref=3 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=7 psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=-2 threads=12 lookahead_threads=2 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=1 weightb=1 open_gop=0 weightp=2 keyint=250 keyint_min=24 scenecut=40 intra_refresh=0 rc_lookahead=40 rc=crf mbtree=1 crf=23.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00
Output #0, mp4, to 'ffmpeg.mp4':
  Metadata:
    major_brand     : qt
    minor_version   : 537199360
    compatible_brands: qt
    timecode        : 01:00:00:00
    encoder         : Lavf57.72.100
    Stream #0:0(eng): Video: h264 (libx264) ([33][0][0][0] / 0x0021), yuv420p, 1920x1080 [SAR 1:1 DAR 16:9], q=-1--1, 24 fps, 12288 tbn, 24 tbc (default)
    Metadata:
      creation_time   : 2017-04-14T13:37:46.000000Z
      handler_name    : Apple Alias Data Handler
      encoder         : Lavc57.92.100 libx264
    Side data:
      cpb: bitrate max/min/avg: 0/0/0 buffer size: 0 vbv_delay: -1
    Stream #0:1(eng): Audio: aac (LC) ([64][0][0][0] / 0x0040), 48000 Hz, stereo, fltp (24 bit), 128 kb/s (default)
    Metadata:
      creation_time   : 2017-04-14T13:37:46.000000Z
      handler_name    : Apple Alias Data Handler
      encoder         : Lavc57.92.100 aac
[mp4 @ 0x7ff1ae81ec00] Starting second pass: moving the moov atom to the beginning of the file
frame=   48 fps=0.0 q=-1.0 Lsize=      24kB time=00:00:02.00 bitrate=  97.2kbits/s speed= 3.6x
video:17kB audio:3kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 15.142209%
[libx264 @ 0x7ff1ae809200] frame I:1     Avg QP:18.80  size:  4448
[libx264 @ 0x7ff1ae809200] frame P:14    Avg QP:16.87  size:   374
[libx264 @ 0x7ff1ae809200] frame B:33    Avg QP:23.31  size:   227
[libx264 @ 0x7ff1ae809200] consecutive B-frames:  2.1% 16.7%  6.2% 75.0%
[libx264 @ 0x7ff1ae809200] mb I  I16..4: 52.6% 46.2%  1.2%
[libx264 @ 0x7ff1ae809200] mb P  I16..4:  0.0%  0.0%  0.1%  P16..4:  0.2%  0.1%  0.0%  0.0%  0.0%    skip:99.6%
[libx264 @ 0x7ff1ae809200] mb B  I16..4:  0.0%  0.0%  0.0%  B16..8:  0.2%  0.1%  0.0%  direct: 0.0%  skip:99.7%  L0:60.9% L1:38.1% BI: 1.0%
[libx264 @ 0x7ff1ae809200] 8x8 transform intra:45.8% inter:14.5%
[libx264 @ 0x7ff1ae809200] coded y,uvDC,uvAC intra: 1.4% 0.0% 0.0% inter: 0.0% 0.0% 0.0%
[libx264 @ 0x7ff1ae809200] i16 v,h,dc,p: 97%  0%  3%  0%
[libx264 @ 0x7ff1ae809200] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 62%  2% 36%  0%  0%  0%  0%  0%  0%
[libx264 @ 0x7ff1ae809200] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 30% 16% 33%  5%  2%  4%  2%  6%  2%
[libx264 @ 0x7ff1ae809200] i8c dc,h,v,p: 100%  0%  0%  0%
[libx264 @ 0x7ff1ae809200] Weighted P-Frames: Y:0.0% UV:0.0%
[libx264 @ 0x7ff1ae809200] ref P L0: 70.5%  1.8% 16.3% 11.3%
[libx264 @ 0x7ff1ae809200] ref B L0: 58.5% 32.1%  9.4%
[libx264 @ 0x7ff1ae809200] ref B L1: 90.6%  9.4%
[libx264 @ 0x7ff1ae809200] kb/s:68.72
[aac @ 0x7ff1ae80aa00] Qavg: 61514.836

Attached is an ffprobe log file showing the audio packets for this ffmpeg encode.
Also, I ran the same source file through Apple Compressor v4.3.1 to the same output file spec and through YouTube and have attached ffprobe logs of their audio packets.

Apologies if I’ve not gone about this the right way. I’m not trying to demonstrate a bug; I simply want to highlight the differences between the initial packets of these files and see if there is a way to produce a file via ffmpeg which uses a packet ‘setup’ similar to the Compressor or even YouTube one, as this plays very nicely with Quicktime decoders.

Perhaps the key here is that the ffmpeg file has a negative pts on its first audio packet, which I would guess is what Quicktime struggles with. The first packet also carries SIDE DATA (Skip Samples), which Quicktime may also be struggling with, but I’ve not found much info on this. From my understanding, Quicktime always assumes 2112 samples of delay at the start:
https://developer.apple.com/library/content/technotes/tn2258/_index.html
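
For reference, here is a quick Python sketch of what these sample counts amount to in time and in video frames. It assumes the 48 kHz / 24 fps figures of the source file described above, and it takes the 2112-sample figure purely as my understanding of what Quicktime assumes. The helper names are made up for illustration. It does not show the cause of the offset, only the sizes involved.

```python
from fractions import Fraction

SAMPLE_RATE = 48000  # Hz, audio rate of the source file (assumption taken from the post)
FRAME_RATE = 24      # fps, true 24p video (assumption taken from the post)


def samples_to_seconds(samples, sample_rate=SAMPLE_RATE):
    """Exact duration of `samples` audio samples, in seconds."""
    return Fraction(samples, sample_rate)


def samples_to_video_frames(samples, sample_rate=SAMPLE_RATE, fps=FRAME_RATE):
    """The same duration expressed in video frames."""
    return samples_to_seconds(samples, sample_rate) * fps


if __name__ == "__main__":
    cases = [
        ("ffmpeg first packet (pts=-1024, skip_samples=1024)", 1024),
        ("delay I understand Quicktime to assume", 2112),
    ]
    for label, samples in cases:
        ms = float(samples_to_seconds(samples)) * 1000
        frames = float(samples_to_video_frames(samples))
        print(f"{label}: {samples} samples = {ms:.2f} ms = {frames:.3f} frames")
```

With these inputs, 1024 samples comes out at about 21.33 ms (roughly half a frame at 24 fps) and 2112 samples at 44 ms (just over one frame). That is at least in the same ballpark as the offsets I see, though it may be a coincidence.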

Start of ffmpeg file:

[PACKET]
codec_type=audio
stream_index=1
pts=-1024
pts_time=-0.021333
dts=-1024
dts_time=-0.021333
duration=1024
duration_time=0.021333
convergence_duration=N/A
convergence_duration_time=N/A
size=300
pos=8722
flags=KD
[SIDE_DATA]
side_data_type=Skip Samples
skip_samples=1024
discard_padding=0
skip_reason=0
discard_reason=0
[/SIDE_DATA]
[/PACKET]
[PACKET]
codec_type=audio
stream_index=1
pts=0
pts_time=0.000000
dts=0
dts_time=0.000000
duration=1024
duration_time=0.021333
convergence_duration=N/A
convergence_duration_time=N/A
size=410
pos=9351
flags=K_
[/PACKET]
...

Start of Compressor file:

[PACKET]
codec_type=audio
stream_index=0
pts=0
pts_time=0.000000
dts=0
dts_time=0.000000
duration=1024
duration_time=0.021333
convergence_duration=N/A
convergence_duration_time=N/A
size=6
pos=2320
flags=K_
[/PACKET]
...

Start of YouTube file:

[PACKET]
codec_type=audio
stream_index=1
pts=0
pts_time=0.000000
dts=0
dts_time=0.000000
duration=1024
duration_time=0.023220
convergence_duration=N/A
convergence_duration_time=N/A
size=557
pos=9043
flags=K_
[/PACKET]
…
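
To compare the first packets of these dumps side by side, something like the following Python sketch could be used. It assumes the ffprobe output is in the [PACKET] key=value form shown above. parse_first_packet and summarize are hypothetical helpers, not part of ffmpeg, and the embedded strings are shortened copies of the listings above.

```python
def parse_first_packet(dump):
    """Collect key=value fields of the first [PACKET] block (side data included)."""
    fields = {}
    inside = False
    for raw in dump.splitlines():
        line = raw.strip()
        if line == "[PACKET]":
            inside = True
        elif line == "[/PACKET]":
            break  # only the first packet is of interest here
        elif inside and "=" in line:
            key, _, value = line.partition("=")
            fields[key] = value
    return fields


def summarize(dump):
    """Reduce the first packet to the fields that differ between the encoders."""
    first = parse_first_packet(dump)
    return {
        "pts": int(first["pts"]),
        "skip_samples": int(first.get("skip_samples", 0)),  # 0 if no side data
        "size": int(first["size"]),
    }


# Shortened copies of the ffprobe listings quoted above.
FFMPEG_DUMP = """\
[PACKET]
codec_type=audio
pts=-1024
duration=1024
size=300
flags=KD
[SIDE_DATA]
side_data_type=Skip Samples
skip_samples=1024
[/SIDE_DATA]
[/PACKET]
[PACKET]
codec_type=audio
pts=0
size=410
[/PACKET]
"""

COMPRESSOR_DUMP = """\
[PACKET]
codec_type=audio
pts=0
duration=1024
size=6
flags=K_
[/PACKET]
"""

if __name__ == "__main__":
    for name, dump in [("ffmpeg", FFMPEG_DUMP), ("Compressor", COMPRESSOR_DUMP)]:
        print(name, summarize(dump))
```

The point is only to make the differences easy to see: the ffmpeg file starts at a negative pts with Skip Samples side data, while the Compressor file starts at pts 0 with a very small first packet and no side data.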

Thank you if you have read this far, I really appreciate it. I’m slightly worried this is going to open a can of worms which will be above my pay grade!

Anyway, assuming I have provided enough info (and I can send anyone the source file if they want it), are there any options / flags etc I could add to the command to bring the encode in line with the Compressor or YouTube outputs in terms of the aac audio stream?

Many many thanks for any help,

Regards
Mark




-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: uncut console output loglevel 99.txt
URL: <http://ffmpeg.org/pipermail/ffmpeg-user/attachments/20170414/8d38721d/attachment.txt>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: ffprobe ffmpeg.txt
URL: <http://ffmpeg.org/pipermail/ffmpeg-user/attachments/20170414/8d38721d/attachment-0001.txt>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: ffprobe Compressor.txt
URL: <http://ffmpeg.org/pipermail/ffmpeg-user/attachments/20170414/8d38721d/attachment-0002.txt>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: ffprobe YouTube.txt
URL: <http://ffmpeg.org/pipermail/ffmpeg-user/attachments/20170414/8d38721d/attachment-0003.txt>

