[FFmpeg-trac] #8585(undetermined:new): Silence/volume problem when converting ATRAC files

FFmpeg trac at avcodec.org
Fri Mar 27 06:07:23 EET 2020


#8585: Silence/volume problem when converting ATRAC files
-------------------------------------+-------------------------------------
             Reporter:  lukpac       |                     Type:  defect
               Status:  new          |                 Priority:  normal
            Component:  undetermined |                  Version:  unspecified
             Keywords:  atrac atrac1 |               Blocked By:
                        atrac3plus   |
             Blocking:               |  Reproduced by developer:  0
Analyzed by developer:  0            |
-------------------------------------+-------------------------------------
 FFmpeg appears to insert a short fade-up from silence at the start of
 decoded ATRAC1 and ATRAC3plus files (possibly other formats as well, but
 these are the only ones I have tested).

 For ATRAC3plus, I transferred a file from MiniDisc using Sony's SonicStage
 software (v 4.3.01.14050). This converts the original ATRAC1 data on the
 disc itself to ATRAC3plus. In addition, SonicStage will automatically
 convert to WAV after transferring. While the WAVs created by SonicStage
 seem to be fine, converting the OMA/ATRAC3plus file using FFmpeg results
 in a fade-up from silence of approximately 0.05 seconds at the start of
 the file.
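
 One way to quantify the fade-up is to print the short-term RMS level of
 the first few hundred milliseconds of the converted WAV. The following is
 a minimal Python sketch, not part of FFmpeg; the function names and the
 5 ms window are illustrative assumptions, and it only handles 16-bit PCM
 WAV files such as the pcm_s16le output produced here.

```python
import math
import struct
import wave


def read_channel0(path, seconds):
    """Read the first `seconds` of a 16-bit PCM WAV.

    Returns (sample_rate, samples of the first channel only).
    """
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        rate = wav.getframerate()
        channels = wav.getnchannels()
        frames = wav.readframes(int(rate * seconds))
    values = struct.unpack("<%dh" % (len(frames) // 2), frames)
    return rate, list(values[::channels])  # de-interleave: keep channel 0


def rms_envelope(samples, window):
    """Return the RMS level of each consecutive, non-overlapping window."""
    envelope = []
    for start in range(0, len(samples) - window + 1, window):
        block = samples[start:start + window]
        envelope.append(math.sqrt(sum(s * s for s in block) / window))
    return envelope


# Example use (the file name is a placeholder, not a file from this report):
#   rate, samples = read_channel0("converted-by-ffmpeg.wav", 0.2)
#   window = rate // 200            # 5 ms windows (an arbitrary choice)
#   for i, level in enumerate(rms_envelope(samples, window)):
#       print("%6.1f ms  rms=%8.1f" % (i * window * 1000.0 / rate, level))
```

 Running the same measurement on the WAV written by SonicStage would show
 whether its level is already at full strength in the first windows.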

 Here is the relevant conversion information:

 C:\Program Files\ffmpeg-20200324-e5d25d1-win64-static\bin>ffmpeg.exe -i
 "F:\ffmpeg_test\002-Roll In My Sweet Baby's Arms.oma" -loglevel 99
 "F:\ffmpeg_test\002-Roll In My Sweet Baby's Arms-ffmpeg.wav"
 ffmpeg version git-2020-03-24-e5d25d1 Copyright (c) 2000-2020 the FFmpeg
 developers
   built with gcc 9.2.1 (GCC) 20200122
   configuration: --enable-gpl --enable-version3 --enable-sdl2 --enable-
 fontconfig --enable-gnutls --enable-iconv --enable-libass --enable-
 libdav1d --enable-libbluray --enable-libfreetype --enable-libmp3lame
 --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg
 --enable-libopus --enable-libshine --enable-libsnappy --enable-libsoxr
 --enable-libsrt --enable-libtheora --enable-libtwolame --enable-libvpx
 --enable-libwavpack --enable-libwebp --enable-libx264 --enable-libx265
 --enable-libxml2 --enable-libzimg --enable-lzma --enable-zlib --enable-gmp
 --enable-libvidstab --enable-libvorbis --enable-libvo-amrwbenc --enable-
 libmysofa --enable-libspeex --enable-libxvid --enable-libaom --enable-
 libmfx --enable-ffnvcodec --enable-cuda-llvm --enable-cuvid --enable-
 d3d11va --enable-nvenc --enable-nvdec --enable-dxva2 --enable-avisynth
 --enable-libopenmpt --enable-amf
   libavutil      56. 42.101 / 56. 42.101
   libavcodec     58. 76.100 / 58. 76.100
   libavformat    58. 42.100 / 58. 42.100
   libavdevice    58.  9.103 / 58.  9.103
   libavfilter     7. 77.100 /  7. 77.100
   libswscale      5.  6.101 /  5.  6.101
   libswresample   3.  6.100 /  3.  6.100
   libpostproc    55.  6.100 / 55.  6.100
 Splitting the commandline.
 Reading option '-i' ... matched as input url with argument
 'F:\ffmpeg_test\002-Roll In My Sweet Baby's Arms.oma'.
 Reading option '-loglevel' ... matched as option 'loglevel' (set logging
 level) with argument '99'.
 Reading option 'F:\ffmpeg_test\002-Roll In My Sweet Baby's Arms-
 ffmpeg.wav' ... matched as output url.
 Finished splitting the commandline.
 Parsing a group of options: global .
 Applying option loglevel (set logging level) with argument 99.
 Successfully parsed a group of options.
 Parsing a group of options: input url F:\ffmpeg_test\002-Roll In My Sweet
 Baby's Arms.oma.
 Successfully parsed a group of options.
 Opening an input file: F:\ffmpeg_test\002-Roll In My Sweet Baby's
 Arms.oma.
 [NULL @ 00000176298cb0c0] Opening 'F:\ffmpeg_test\002-Roll In My Sweet
 Baby's Arms.oma' for reading
 [file @ 00000176298cc280] Setting default whitelist 'file,crypto,data'
 Probing oma score:25 size:2048
 Probing oma score:100 size:4096
 [oma @ 00000176298cb0c0] Format oma probed with size=4096 and score=100
 [oma @ 00000176298cb0c0] id3v2 ver:3 flags:00 len:3062
 [oma @ 00000176298cb0c0] File is encrypted
 [oma @ 00000176298cb0c0] RID: 0001001d
 [oma @ 00000176298cb0c0] IV: 152f24153b47eb8c
 [oma @ 00000176298cb0c0] CBC-MAC: b3adf37070e669ab
 [oma @ 00000176298cb0c0] EK: 141ca30f90f12ef9
 [oma @ 00000176298cb0c0] Before avformat_find_stream_info() pos: 3168
 bytes read:32768 seeks:0 nb_streams:1
 [oma @ 00000176298cb0c0] All info found
 [oma @ 00000176298cb0c0] Estimating duration from bitrate, this may be
 inaccurate
 [oma @ 00000176298cb0c0] stream 0: start_time: 0.000 duration: 236.333
 [oma @ 00000176298cb0c0] format: start_time: 0.000 duration: 236.333
 (estimate from bit rate) bitrate=256 kb/s
 [oma @ 00000176298cb0c0] After avformat_find_stream_info() pos: 4656 bytes
 read:32768 seeks:0 frames:1
 Input #0, oma, from 'F:\ffmpeg_test\002-Roll In My Sweet Baby's Arms.oma':
   Metadata:
     title           : Roll In My Sweet Baby's Arms
     album           : 20051129 set 1 Lonesome Rogues
     OMG_ALBMS       : 20051129 set 1 Lonesome Rogues
     OMG_TIT2S       : Roll In My Sweet Baby's Arms
   Duration: 00:03:56.33, start: 0.000000, bitrate: 256 kb/s
     Stream #0:0, 1, 1/44100: Audio: atrac3p ([1][0][0][0] / 0x0001), 44100
 Hz, stereo, fltp, 256 kb/s
 Successfully opened the file.
 Parsing a group of options: output url F:\ffmpeg_test\002-Roll In My Sweet
 Baby's Arms-ffmpeg.wav.
 Successfully parsed a group of options.
 Opening an output file: F:\ffmpeg_test\002-Roll In My Sweet Baby's Arms-
 ffmpeg.wav.
 [file @ 00000176298cd780] Setting default whitelist 'file,crypto,data'
 Successfully opened the file.
 Stream mapping:
   Stream #0:0 -> #0:0 (atrac3p (atrac3plus) -> pcm_s16le (native))
 Press [q] to stop, [?] for help
 cur_dts is invalid st:0 (0) [init:0 i_done:0 finish:0] (this is harmless
 if it occurs once at the start per stream)
 detected 8 logical cores
 [graph_0_in_0_0 @ 00000176298f7a80] Setting 'time_base' to value '1/44100'
 [graph_0_in_0_0 @ 00000176298f7a80] Setting 'sample_rate' to value '44100'
 [graph_0_in_0_0 @ 00000176298f7a80] Setting 'sample_fmt' to value 'fltp'
 [graph_0_in_0_0 @ 00000176298f7a80] Setting 'channel_layout' to value
 '0x3'
 [graph_0_in_0_0 @ 00000176298f7a80] tb:1/44100 samplefmt:fltp
 samplerate:44100 chlayout:0x3
 [format_out_0_0 @ 000001762993c280] Setting 'sample_fmts' to value 's16'
 [format_out_0_0 @ 000001762993c280] auto-inserting filter
 'auto_resampler_0' between the filter 'Parsed_anull_0' and the filter
 'format_out_0_0'
 [AVFilterGraph @ 00000176298f7600] query_formats: 4 queried, 6 merged, 3
 already done, 0 delayed
 [auto_resampler_0 @ 000001762993c880] [SWR @ 000001762993e440] Using fltp
 internally between filters
 [auto_resampler_0 @ 000001762993c880] ch:2 chl:stereo fmt:fltp r:44100Hz
 -> ch:2 chl:stereo fmt:s16 r:44100Hz
 Output #0, wav, to 'F:\ffmpeg_test\002-Roll In My Sweet Baby's Arms-
 ffmpeg.wav':
   Metadata:
     INAM            : Roll In My Sweet Baby's Arms
     IPRD            : 20051129 set 1 Lonesome Rogues
     OMG_ALBMS       : 20051129 set 1 Lonesome Rogues
     OMG_TIT2S       : Roll In My Sweet Baby's Arms
     ISFT            : Lavf58.42.100
     Stream #0:0, 0, 1/44100: Audio: pcm_s16le ([1][0][0][0] / 0x0001),
 44100 Hz, stereo, s16, 1411 kb/s
     Metadata:
       encoder         : Lavc58.76.100 pcm_s16le
 [out_0_0 @ 00000176298d3300] EOF on sink link out_0_0:default.
 No more output streams to write to, finishing.
 size=   40712kB time=00:03:56.33 bitrate=1411.2kbits/s speed= 363x
 video:0kB audio:40712kB subtitle:0kB other streams:0kB global headers:0kB
 muxing overhead: 0.000374%
 Input file #0 (F:\ffmpeg_test\002-Roll In My Sweet Baby's Arms.oma):
   Input stream #0:0 (audio): 5089 packets read (7572432 bytes); 5089
 frames decoded (10422272 samples);
   Total: 5089 packets (7572432 bytes) demuxed
 Output file #0 (F:\ffmpeg_test\002-Roll In My Sweet Baby's Arms-
 ffmpeg.wav):
   Output stream #0:0 (audio): 5089 frames encoded (10422272 samples); 5089
 packets muxed (41689088 bytes);
   Total: 5089 packets (41689088 bytes) muxed
 5089 frames successfully decoded, 0 decoding errors
 [AVIOContext @ 000001762993b000] Statistics: 4 seeks, 162 writeouts
 [AVIOContext @ 00000176298d4500] Statistics: 7575600 bytes read, 0 seeks

 Other than the fade-up, the resulting WAVs are essentially the same
 whether converted using FFmpeg or SonicStage, although they are not
 digitally identical (even after the fade-up).
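
 The difference between the two conversions can be made concrete by
 looking at the largest sample difference in successive windows. This is a
 hedged Python sketch: the function name is hypothetical, the inputs are
 assumed to be integer PCM samples of one channel from each WAV, and it
 assumes the two decodes are sample-aligned, which has not been verified
 for these files.

```python
def max_abs_diff_per_window(a, b, window):
    """Largest absolute sample difference in each non-overlapping window.

    `a` and `b` are sequences of integer PCM samples (one channel each).
    Only the overlapping part of the two sequences is compared.
    """
    length = min(len(a), len(b))
    return [
        max(abs(a[i] - b[i]) for i in range(start, start + window))
        for start in range(0, length - window + 1, window)
    ]


# Small synthetic example: the first window differs a lot, the second barely.
ffmpeg_like = [0, 0, 5, 5, 9]
reference_like = [4, 4, 5, 6, 0]
print(max_abs_diff_per_window(ffmpeg_like, reference_like, 2))
```

 A fade-up would appear as large differences in the first windows that
 shrink to a small residual once both outputs reach the same level.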

 I have also been able to transfer the raw ATRAC1 data from MiniDisc using
 QHiMDTransfer. I am unable to convert these files to WAV using SonicStage,
 so I have no baseline for comparison, but the same fade-up behavior seems
 to be present. Here is the relevant conversion information:

 C:\Program Files\ffmpeg-20200324-e5d25d1-win64-static\bin>ffmpeg -v 9
 -loglevel 99 -i "F:\Roll In My Sweet Baby's Arms.aea" "F:\ffmpeg_test\Roll
 In My Sweet Baby's Arms-ffmpeg.wav"
 ffmpeg version git-2020-03-24-e5d25d1 Copyright (c) 2000-2020 the FFmpeg
 developers
   built with gcc 9.2.1 (GCC) 20200122
   configuration: --enable-gpl --enable-version3 --enable-sdl2 --enable-
 fontconfig --enable-gnutls --enable-iconv --enable-libass --enable-
 libdav1d --enable-libbluray --enable-libfreetype --enable-libmp3lame
 --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg
 --enable-libopus --enable-libshine --enable-libsnappy --enable-libsoxr
 --enable-libsrt --enable-libtheora --enable-libtwolame --enable-libvpx
 --enable-libwavpack --enable-libwebp --enable-libx264 --enable-libx265
 --enable-libxml2 --enable-libzimg --enable-lzma --enable-zlib --enable-gmp
 --enable-libvidstab --enable-libvorbis --enable-libvo-amrwbenc --enable-
 libmysofa --enable-libspeex --enable-libxvid --enable-libaom --enable-
 libmfx --enable-ffnvcodec --enable-cuda-llvm --enable-cuvid --enable-
 d3d11va --enable-nvenc --enable-nvdec --enable-dxva2 --enable-avisynth
 --enable-libopenmpt --enable-amf
   libavutil      56. 42.101 / 56. 42.101
   libavcodec     58. 76.100 / 58. 76.100
   libavformat    58. 42.100 / 58. 42.100
   libavdevice    58.  9.103 / 58.  9.103
   libavfilter     7. 77.100 /  7. 77.100
   libswscale      5.  6.101 /  5.  6.101
   libswresample   3.  6.100 /  3.  6.100
   libpostproc    55.  6.100 / 55.  6.100
 Splitting the commandline.
 Reading option '-v' ... matched as option 'v' (set logging level) with
 argument '9'.
 Reading option '-loglevel' ... matched as option 'loglevel' (set logging
 level) with argument '99'.
 Reading option '-i' ... matched as input url with argument 'F:\Roll In My
 Sweet Baby's Arms.aea'.
 Reading option 'F:\ffmpeg_test\Roll In My Sweet Baby's Arms-ffmpeg.wav'
 ... matched as output url.
 Finished splitting the commandline.
 Parsing a group of options: global .
 Applying option v (set logging level) with argument 9.
 Successfully parsed a group of options.
 Parsing a group of options: input url F:\Roll In My Sweet Baby's Arms.aea.
 Successfully parsed a group of options.
 Opening an input file: F:\Roll In My Sweet Baby's Arms.aea.
 [NULL @ 000001e7d954b0c0] Opening 'F:\Roll In My Sweet Baby's Arms.aea'
 for reading
 [file @ 000001e7d954c240] Setting default whitelist 'file,crypto,data'
 Probing mp3 score:12 size:2048
 [aea @ 000001e7d954b0c0] Format aea detected only with low score of 1,
 misdetection possible!
 id3v2 ver:3 flags:00 len:2101
 [aea @ 000001e7d954b0c0] Before avformat_find_stream_info() pos: 4159
 bytes read:1048576 seeks:0 nb_streams:1
 [aea @ 000001e7d954b0c0] All info found
 [aea @ 000001e7d954b0c0] Estimating duration from bitrate, this may be
 inaccurate
 [aea @ 000001e7d954b0c0] stream 0: start_time: -102481911520608.625
 duration: 236.464
 [aea @ 000001e7d954b0c0] format: start_time: -9223372036854.775 duration:
 236.464 (estimate from bit rate) bitrate=292 kb/s
 [aea @ 000001e7d954b0c0] After avformat_find_stream_info() pos: 25359
 bytes read:1048576 seeks:0 frames:50
 Input #0, aea, from 'F:\Roll In My Sweet Baby's Arms.aea':
   Metadata:
     artist          : The Lonesome Rogues
   Duration: 00:03:56.46, bitrate: 292 kb/s
     Stream #0:0, 50, 1/90000: Audio: atrac1, 44100 Hz, stereo, fltp, 292
 kb/s
 Successfully opened the file.
 Parsing a group of options: output url F:\ffmpeg_test\Roll In My Sweet
 Baby's Arms-ffmpeg.wav.
 Successfully parsed a group of options.
 Opening an output file: F:\ffmpeg_test\Roll In My Sweet Baby's Arms-
 ffmpeg.wav.
 [file @ 000001e7d9553300] Setting default whitelist 'file,crypto,data'
 Successfully opened the file.
 Stream mapping:
   Stream #0:0 -> #0:0 (atrac1 (native) -> pcm_s16le (native))
 Press [q] to stop, [?] for help
 cur_dts is invalid st:0 (0) [init:0 i_done:0 finish:0] (this is harmless
 if it occurs once at the start per stream)
 detected 8 logical cores
 [graph_0_in_0_0 @ 000001e7d9667ac0] Setting 'time_base' to value '1/44100'
 [graph_0_in_0_0 @ 000001e7d9667ac0] Setting 'sample_rate' to value '44100'
 [graph_0_in_0_0 @ 000001e7d9667ac0] Setting 'sample_fmt' to value 'fltp'
 [graph_0_in_0_0 @ 000001e7d9667ac0] Setting 'channel_layout' to value
 '0x3'
 [graph_0_in_0_0 @ 000001e7d9667ac0] tb:1/44100 samplefmt:fltp
 samplerate:44100 chlayout:0x3
 [format_out_0_0 @ 000001e7d9668040] Setting 'sample_fmts' to value 's16'
 [format_out_0_0 @ 000001e7d9668040] auto-inserting filter
 'auto_resampler_0' between the filter 'Parsed_anull_0' and the filter
 'format_out_0_0'
 [AVFilterGraph @ 000001e7d961eb40] query_formats: 4 queried, 6 merged, 3
 already done, 0 delayed
 [auto_resampler_0 @ 000001e7d966a680] [SWR @ 000001e7d966a8c0] Using fltp
 internally between filters
 [auto_resampler_0 @ 000001e7d966a680] ch:2 chl:stereo fmt:fltp r:44100Hz
 -> ch:2 chl:stereo fmt:s16 r:44100Hz
 Output #0, wav, to 'F:\ffmpeg_test\Roll In My Sweet Baby's Arms-
 ffmpeg.wav':
   Metadata:
     IART            : The Lonesome Rogues
     ISFT            : Lavf58.42.100
     Stream #0:0, 0, 1/44100: Audio: pcm_s16le ([1][0][0][0] / 0x0001),
 44100 Hz, stereo, s16, 1411 kb/s
     Metadata:
       encoder         : Lavc58.76.100 pcm_s16le
 F:\Roll In My Sweet Baby's Arms.aea: I/O error
 [out_0_0 @ 000001e7d9667e00] EOF on sink link out_0_0:default.
 No more output streams to write to, finishing.
 size=   40712kB time=00:03:56.33 bitrate=1411.2kbits/s speed= 496x
 video:0kB audio:40712kB subtitle:0kB other streams:0kB global headers:0kB
 muxing overhead: 0.000254%
 Input file #0 (F:\Roll In My Sweet Baby's Arms.aea):
   Input stream #0:0 (audio): 20356 packets read (8630944 bytes); 20356
 frames decoded (10422272 samples);
   Total: 20356 packets (8630944 bytes) demuxed
 Output file #0 (F:\ffmpeg_test\Roll In My Sweet Baby's Arms-ffmpeg.wav):
   Output stream #0:0 (audio): 20356 frames encoded (10422272 samples);
 20356 packets muxed (41689088 bytes);
   Total: 20356 packets (41689088 bytes) muxed
 20356 frames successfully decoded, 0 decoding errors
 [AVIOContext @ 000001e7d9553400] Statistics: 4 seeks, 162 writeouts
 [AVIOContext @ 000001e7d95544c0] Statistics: 8635103 bytes read, 0 seeks
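
 For reference, the frame sizes implied by the statistics in the two logs
 can be worked out with simple arithmetic. The Python sketch below uses
 only numbers printed above (frames decoded, samples decoded, 44100 Hz,
 stereo s16 output); the function name is hypothetical.

```python
SAMPLE_RATE = 44100         # Hz, from both stream descriptions in the logs
BYTES_PER_SAMPLE_FRAME = 4  # stereo * 16-bit (pcm_s16le output)


def summarize(frames_decoded, samples_decoded):
    """Derive frame size, duration and PCM size from the log statistics."""
    samples_per_frame = samples_decoded / frames_decoded
    return {
        "samples_per_frame": samples_per_frame,
        "frame_ms": 1000.0 * samples_per_frame / SAMPLE_RATE,
        "duration_s": samples_decoded / SAMPLE_RATE,
        "pcm_bytes": samples_decoded * BYTES_PER_SAMPLE_FRAME,
    }


atrac3p = summarize(5089, 10422272)   # OMA / ATRAC3plus log
atrac1 = summarize(20356, 10422272)   # AEA / ATRAC1 log

for name, info in (("atrac3p", atrac3p), ("atrac1", atrac1)):
    print(name, info)
```

 This gives 2048 samples (about 46.4 ms) per ATRAC3plus frame and 512
 samples (about 11.6 ms) per ATRAC1 frame, and 41689088 bytes of PCM in
 both cases, matching the muxed byte counts in the logs. The reported
 fade-up of roughly 0.05 seconds is close to the length of one ATRAC3plus
 frame, although whether the two are related is not established here.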

 I am happy to provide files for analysis.

--
Ticket URL: <https://trac.ffmpeg.org/ticket/8585>
FFmpeg <https://ffmpeg.org>
FFmpeg issue tracker

