[FFmpeg-trac] #8631(undetermined:new): Audio gapless playback metadata for MP4/AAC

FFmpeg trac at avcodec.org
Fri Apr 24 07:53:00 EEST 2020


#8631: Audio gapless playback metadata for MP4/AAC
--------------------------------------------+--------------------------------------
             Reporter:  johnkaplantech      |                     Type:  defect
               Status:  new                 |                 Priority:  normal
            Component:  undetermined        |                  Version:  unspecified
             Keywords:  gapless, audio, AAC |               Blocked By:
             Blocking:                      | Reproduced by developer:  0
Analyzed by developer:  0                   |
--------------------------------------------+--------------------------------------
 Background

 Gapless playback of audio tracks is enabled by many standard audio players
 in mobile devices. It allows successive audio tracks to play without pause
 or perceptible audio flaw as a seamless whole. Gapless playback is
 required for listeners to hear many live, classical, and classic rock
 recordings as intended. External sources describing gapless playback in
 detail include https://en.wikipedia.org/wiki/Gapless_playback and
 https://wiki.hydrogenaud.io/index.php?title=Gapless_playback.

 The above references describe several ways in which gaps can be
 introduced across audio formats, but the prevalent source is lossy
 compression such as MP3 and AAC, whose encoders introduce extra samples
 before and after the original PCM data of an audio track. Because the
 length of the extra data can vary, and metadata describing its length is
 not included in these compression standards, a decoder cannot strip it
 away on its own.

 But the packaging layer (the muxer) can obtain the pertinent data from the
 audio encoder and include it in file metadata for audio players to read.
 This is what ffmpeg can do. The samples added to the front of the audio
 are called "delay" and the samples added to the end are called "padding."
 For an audio player to strip the extra samples and recover a gapless
 audio track, the pertinent values are the lengths in samples of the delay,
 the original unpadded PCM audio, and the padding.
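
 To make the relationship between the three values concrete, here is a
 minimal C sketch. It assumes AAC-LC with 1024 samples per frame and assumes
 the encoder pads only enough to complete the final frame; real encoders may
 behave differently. The names GaplessInfo, gapless_info and
 total_coded_samples are hypothetical and do not exist in ffmpeg.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: AAC-LC, which codes 1024 PCM samples per frame. */
#define AAC_FRAME_SIZE 1024u

typedef struct GaplessInfo {
    uint32_t delay;    /* samples the encoder adds at the front */
    uint64_t original; /* unpadded PCM samples of the source track */
    uint32_t padding;  /* samples added at the end */
} GaplessInfo;

/* Derive the padding that rounds delay + original up to whole frames.
 * Assumption: the encoder adds no further whole frames of padding. */
static GaplessInfo gapless_info(uint32_t delay, uint64_t original)
{
    GaplessInfo g = { delay, original, 0 };
    uint64_t rem = (delay + original) % AAC_FRAME_SIZE;

    assert(original > 0);
    g.padding = rem ? (uint32_t)(AAC_FRAME_SIZE - rem) : 0;
    return g;
}

/* Total number of samples a decoder would output without trimming. */
static uint64_t total_coded_samples(const GaplessInfo *g)
{
    return g->delay + g->original + g->padding;
}
```

 With an illustrative delay of 2112 samples and 441000 original samples
 (10 seconds at 44.1 kHz), this yields 280 padding samples and 443392 coded
 samples, which is exactly 433 frames.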

 As far as I know, there is no documented standard specifically for gapless
 playback metadata to be encoded in an audio file and interpreted by audio
 players. (If anyone has any inside information to the contrary, please
 include as a comment - I and a lot of others would be grateful for the
 insight.) But many audio players apparently follow de facto standards
 which involve reading metadata from the file headers that provide enough
 information about track length and delay to frame the original unpadded
 audio track.

 Proposed Solution
 The solution that this ticket proposes first for ffmpeg applies to AAC
 audio packaged in an MP4 file. The proposal is to use the
 moov/trak/edts/elst atoms as described by ISO/IEC 14496-12, adding a
 single elst atom inside a single edts atom per track. Inside this elst, a
 single entry carries the count of the unpadded audio PCM samples as the
 "track duration" field (segment_duration in the spec) and the count of
 the delay samples as the "media time" field (media_time in the spec).
 Audio players use these to skip over the delay samples within the
 provided track data, isolate the original PCM audio samples, and ignore
 the padding at the end, so the padding length is not explicitly included
 in the metadata. My team has experimented with audio tracks processed
 this way using the fdk-aac tool, and they play gaplessly on both iOS and
 Android standard audio players.
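
 As an illustration of the byte layout involved, the following C sketch
 serializes an edts atom holding one version-0 elst entry into a buffer. It
 is not ffmpeg code: write_gapless_edts and put_be32 are hypothetical
 helpers. ISO/IEC 14496-12 expresses segment_duration in movie timescale
 units and media_time in media timescale units, so the sketch assumes both
 timescales equal the audio sample rate, which makes the two fields plain
 sample counts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Store a 32-bit value big-endian, as MP4 atoms require. */
static size_t put_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
    return 4;
}

/* Write edts { elst (version 0, one entry) } into buf, which must hold
 * at least 36 bytes. Returns the number of bytes written.
 * Assumption: movie and media timescales both equal the sample rate. */
static size_t write_gapless_edts(uint8_t *buf, uint32_t original_samples,
                                 uint32_t delay_samples)
{
    size_t n = 0;

    assert(buf);
    n += put_be32(buf + n, 36);               /* edts size: 8 + 28 */
    memcpy(buf + n, "edts", 4); n += 4;
    n += put_be32(buf + n, 28);               /* elst size: 16 + 12 */
    memcpy(buf + n, "elst", 4); n += 4;
    n += put_be32(buf + n, 0);                /* version 0, flags 0 */
    n += put_be32(buf + n, 1);                /* entry_count */
    n += put_be32(buf + n, original_samples); /* segment_duration */
    n += put_be32(buf + n, delay_samples);    /* media_time */
    n += put_be32(buf + n, 0x00010000);       /* media_rate 1.0 (16.16) */
    return n;
}
```

 The padding count never appears in the output: a player that honors the
 edit plays original_samples samples starting at delay_samples and drops
 whatever follows.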

 Tech Details
 Here are some issues about the design & coding of this request. I'm hoping
 the community will jump in and comment to help me nail down the details so
 I can move on to coding a patch.

 As several of you have commented before, there is currently some code in
 ffmpeg that produces an elst atom, controlled by the muxer option
 "use_editlist."

 I believe that this use of an edit list for movie synchronization is a
 different use case than its use for gapless audio. Of course if any of you
 can set me straight on how a single routine could cover both use cases,
 I'll attempt to satisfy that requirement. Short of that, I propose to
 add a second command-line switch "gapless-editlist" that is mutually
 exclusive with the current switch and controls the emission of an elst
 atom with gapless metadata. Whether that's
 the ultimate shape of the best solution or not, at least for the time
 being it will avoid regressions for users relying on the current
 implementation.
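
 The mutual exclusion could be enforced with a check along these lines. This
 is only a sketch: the EditListOptions struct and both function names are
 hypothetical, the gapless_editlist field mirrors the switch proposed above
 and does not exist in ffmpeg, and the -1/0/1 meaning given to the existing
 switch is an assumption.

```c
#include <assert.h>
#include <errno.h>

typedef struct EditListOptions {
    int use_editlist;     /* existing switch; assumed -1 = auto, 0 = off, 1 = on */
    int gapless_editlist; /* proposed switch (hypothetical); 0 = off, 1 = on */
} EditListOptions;

/* Reject the combination where both switches are explicitly enabled. */
static int check_edit_list_options(const EditListOptions *o)
{
    assert(o);
    if (o->gapless_editlist && o->use_editlist == 1)
        return -EINVAL;
    return 0;
}

/* The gapless elst is written only when the new switch is set, so the
 * existing code path is untouched by default. */
static int writes_gapless_elst(const EditListOptions *o)
{
    assert(o);
    return o->gapless_editlist != 0;
}
```

 Leaving the new switch off by default keeps current output byte-for-byte
 unchanged for users who never ask for it.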

 One detail I need help on is how to locate the sample lengths of the
 encoder delay and original PCM audio samples for an AAC encoding in the
 data available to the atom-writing functions in movenc.c (i.e. something
 accessible from (AVIOContext *pb, MOVMuxContext *mov, MOVTrack *track)).

 When we discussed this previously, Marton Balint suggested I look in the side
 data AV_PKT_DATA_SKIP_SAMPLES for the delay value, and I found several
 references to that variable for different encoding formats. Unfortunately
 it seems to be used differently for each encoding format, and I don't know
 how to locate the same or equivalent data for AAC encoding. I'm also not
 sure of the interface between the fdk-aac module and ffmpeg. I will keep
 digging and hope I eventually find it, but if any maintainers out there
 have direct knowledge to point me to some code or a data structure
 definition, I'd be most grateful.
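
 For reference, the AV_PKT_DATA_SKIP_SAMPLES payload is documented in
 libavcodec's packet header as 10 bytes: a little-endian u32 count of
 samples to skip at the start, a little-endian u32 count to skip at the end,
 and one reason byte for each. The sketch below decodes such a payload
 without linking against ffmpeg. SkipSamples and parse_skip_samples are
 hypothetical names, and in real code the bytes would come from
 av_packet_get_side_data(). Whether the AAC encoders attach this side data
 at all is the open question above and is not assumed here. Another place
 that may be worth checking is the initial_padding field of AVCodecContext,
 which libavcodec documents as the number of priming samples inserted by an
 audio encoder.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKIP_SAMPLES_SIZE 10

typedef struct SkipSamples {
    uint32_t skip_start;   /* samples to drop from the start of the packet */
    uint32_t skip_end;     /* samples to drop from the end of the packet */
    uint8_t  reason_start; /* reason code for the start skip */
    uint8_t  reason_end;   /* reason code for the end skip */
} SkipSamples;

/* Read a 32-bit little-endian value. */
static uint32_t rd_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Decode the side-data payload. Returns 0 on success, or -1 when the
 * payload is missing or shorter than the documented 10 bytes. */
static int parse_skip_samples(const uint8_t *data, size_t size,
                              SkipSamples *out)
{
    assert(out);
    if (!data || size < SKIP_SAMPLES_SIZE)
        return -1;
    out->skip_start   = rd_le32(data);
    out->skip_end     = rd_le32(data + 4);
    out->reason_start = data[8];
    out->reason_end   = data[9];
    return 0;
}
```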

 As to a more general solution that works with other encoders, I'm game to
 help out with that once the MP4/AAC case is done. I'll keep
 experimenting/investigating while waiting for responses. I apologize for
 the long time since I discussed this before, but it has reached the top of
 my spare-time priority list, so I'm actively working on it.

 Thanks,
 John

--
Ticket URL: <https://trac.ffmpeg.org/ticket/8631>
FFmpeg <https://ffmpeg.org>
FFmpeg issue tracker

