[FFmpeg-trac] #7028(avfilter:new): Improper rounding of output sample rate when using libopus

FFmpeg trac at avcodec.org
Sun Feb 18 05:37:04 EET 2018


#7028: Improper rounding of output sample rate when using libopus
----------------------------------+--------------------------------------
             Reporter:  heicrd    |                     Type:  defect
               Status:  new       |                 Priority:  normal
            Component:  avfilter  |                  Version:  git-master
             Keywords:            |               Blocked By:
             Blocking:            |  Reproduced by developer:  0
Analyzed by developer:  0         |
----------------------------------+--------------------------------------
 Summary of the bug:
 libopus accepts only a limited set of sample rates (48, 24, 16, 12 and
 8 kHz). When the input rate is not one of them, the automatically inserted
 `aresample` filter converts the audio to the nearest supported rate. That
 rate may be lower than the input rate, causing a significant and
 unexpected loss of fidelity. For example, a 32 kHz input is downsampled to
 24 kHz instead of being upsampled to 48 kHz, as `opusenc` seems to do.
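
 The difference between the two selection strategies can be sketched as
 follows. This is only an illustration of the behaviour described in this
 ticket, not FFmpeg's actual negotiation code; the function names are
 hypothetical, and the rate list is copied from the `sample_rates` line in
 the log.

```python
# Sample rates offered for libopus, taken from the 'sample_rates' log line.
OPUS_RATES = [48000, 24000, 16000, 12000, 8000]


def nearest_rate(input_rate, supported=OPUS_RATES):
    """Pick the supported rate closest to the input (the behaviour reported here)."""
    return min(supported, key=lambda rate: abs(rate - input_rate))


def rate_at_or_above(input_rate, supported=OPUS_RATES):
    """Hypothetical alternative: lowest supported rate that is not below the input.

    Falls back to the highest supported rate when the input exceeds all of them.
    """
    candidates = [rate for rate in supported if rate >= input_rate]
    return min(candidates) if candidates else max(supported)


if __name__ == "__main__":
    for rate in (32000, 44100, 22050):
        print(rate, nearest_rate(rate), rate_at_or_above(rate))
```

 For a 32000 Hz input, `nearest_rate` returns 24000 because it is 8000 Hz
 away while 48000 is 16000 Hz away; `rate_at_or_above` returns 48000.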

 How to reproduce:
 Use ffmpeg to encode a 32 kHz file with libopus:
 {{{
 % ./ffmpeg -f lavfi -i "sine=frequency=1000:duration=5" -codec:a pcm_s16le
 -af aresample=32000 -f wav - | ./ffmpeg -v 9 -loglevel 99 -f wav -i -
 -codec:a libopus -f ogg -y /dev/null
 ffmpeg version N-90069-gdd8351b118 Copyright (c) 2000-2018 the FFmpeg
 developers
   built with gcc 7 (Debian 7.3.0-3)
   configuration: --enable-libopus
   libavutil      56.  7.101 / 56.  7.101
   libavcodec     58. 11.101 / 58. 11.101
   libavformat    58.  9.100 / 58.  9.100
   libavdevice    58.  1.100 / 58.  1.100
   libavfilter     7. 12.100 /  7. 12.100
   libswscale      5.  0.101 /  5.  0.101
   libswresample   3.  0.101 /  3.  0.101
 ffmpeg version N-90069-gdd8351b118 Copyright (c) 2000-2018 the FFmpeg
 developers
   built with gcc 7 (Debian 7.3.0-3)
   configuration: --enable-libopus
   libavutil      56.  7.101 / 56.  7.101
   libavcodec     58. 11.101 / 58. 11.101
   libavformat    58.  9.100 / 58.  9.100
   libavdevice    58.  1.100 / 58.  1.100
   libavfilter     7. 12.100 /  7. 12.100
   libswscale      5.  0.101 /  5.  0.101
   libswresample   3.  0.101 /  3.  0.101
 Splitting the commandline.
 Reading option '-v' ... matched as option 'v' (set logging level) with
 argument '9'.
 Reading option '-loglevel' ... matched as option 'loglevel' (set logging
 level) with argument '99'.
 Reading option '-f' ... matched as option 'f' (force format) with argument
 'wav'.
 Reading option '-i' ... matched as input url with argument '-'.
 Reading option '-codec:a' ... matched as option 'codec' (codec name) with
 argument 'libopus'.
 Reading option '-f' ... matched as option 'f' (force format) with argument
 'ogg'.
 Reading option '-y' ... matched as option 'y' (overwrite output files)
 with argument '1'.
 Reading option '/dev/null' ... matched as output url.
 Finished splitting the commandline.
 Parsing a group of options: global .
 Applying option v (set logging level) with argument 9.
 Applying option y (overwrite output files) with argument 1.
 Successfully parsed a group of options.
 Parsing a group of options: input url -.
 Applying option f (force format) with argument wav.
 Successfully parsed a group of options.
 Opening an input file: -.
 [wav @ 0x5647ff2d2340] Opening 'pipe:' for reading
 [pipe @ 0x5647ff2d2ec0] Setting default whitelist 'crypto'
 Input #0, lavfi, from 'sine=frequency=1000:duration=5':
   Duration: N/A, start: 0.000000, bitrate: 705 kb/s
     Stream #0:0: Audio: pcm_s16le, 44100 Hz, mono, s16, 705 kb/s
 Stream mapping:
   Stream #0:0 -> #0:0 (pcm_s16le (native) -> pcm_s16le (native))
 Press [q] to stop, [?] for help
 Output #0, wav, to 'pipe:':
   Metadata:
     ISFT            : Lavf58.9.100
     Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 32000 Hz, mono,
 s16, 512 kb/s
     Metadata:
       encoder         : Lavc58.11.101 pcm_s16le
 [wav @ 0x5647ff2d2340] Ignoring maximum wav data size, file may be invalid
 [wav @ 0x5647ff2d2340] Before avformat_find_stream_info() pos: 78 bytes
 read:66920 seeks:0 nb_streams:1
 [wav @ 0x5647ff2d2340] probing stream 0 pp:32
 [wav @ 0x5647ff2d2340] probing stream 0 pp:31
 [wav @ 0x5647ff2d2340] probing stream 0 pp:30
 [wav @ 0x5647ff2d2340] probing stream 0 pp:29
 [wav @ 0x5647ff2d2340] probing stream 0 pp:28
 [wav @ 0x5647ff2d2340] probing stream 0 pp:27
 [wav @ 0x5647ff2d2340] probing stream 0 pp:26
 [wav @ 0x5647ff2d2340] probing stream 0 pp:25
 [wav @ 0x5647ff2d2340] probing stream 0 pp:24
 [wav @ 0x5647ff2d2340] probing stream 0 pp:23
 [wav @ 0x5647ff2d2340] probing stream 0 pp:22
 [wav @ 0x5647ff2d2340] probing stream 0 pp:21
 [wav @ 0x5647ff2d2340] probing stream 0 pp:20
 [wav @ 0x5647ff2d2340] probing stream 0 pp:19
 [wav @ 0x5647ff2d2340] probing stream 0 pp:18
 [wav @ 0x5647ff2d2340] probing stream 0 pp:17
 [wav @ 0x5647ff2d2340] probing stream 0 pp:16
 [wav @ 0x5647ff2d2340] probing stream 0 pp:15
 [wav @ 0x5647ff2d2340] probing stream 0 pp:14
 [wav @ 0x5647ff2d2340] probing stream 0 pp:13
 [wav @ 0x5647ff2d2340] probing stream 0 pp:12
 [wav @ 0x5647ff2d2340] probing stream 0 pp:11
 [wav @ 0x5647ff2d2340] probing stream 0 pp:10
 [wav @ 0x5647ff2d2340] probing stream 0 pp:9
 [wav @ 0x5647ff2d2340] probing stream 0 pp:8
 [wav @ 0x5647ff2d2340] probing stream 0 pp:7
 [wav @ 0x5647ff2d2340] probing stream 0 pp:6
 [wav @ 0x5647ff2d2340] probing stream 0 pp:5
 [wav @ 0x5647ff2d2340] probing stream 0 pp:4
 [wav @ 0x5647ff2d2340] probing stream 0 pp:3
 [wav @ 0x5647ff2d2340] probing stream 0 pp:2
 [wav @ 0x5647ff2d2340] probing stream 0 pp:1
 [wav @ 0x5647ff2d2340] probed stream 0
 [wav @ 0x5647ff2d2340] parser not found for codec pcm_s16le, packets or
 times may be invalid.
 [wav @ 0x5647ff2d2340] All info found
 [wav @ 0x5647ff2d2340] stream 0: start_time: -288230376151711.750
 duration: -288230376151711.750
 [wav @ 0x5647ff2d2340] format: start_time: -9223372036854.775 duration:
 -9223372036854.775 bitrate=512 kb/s
 [wav @ 0x5647ff2d2340] After avformat_find_stream_info() pos: 204878 bytes
 read:205124 seeks:0 frames:50
 Guessed Channel Layout for Input Stream #0.0 : mono
 Input #0, wav, from 'pipe:':
   Metadata:
     encoder         : Lavf58.9.100
   Duration: N/A, bitrate: 512 kb/s
     Stream #0:0, 50, 1/32000: Audio: pcm_s16le ([1][0][0][0] / 0x0001),
 32000 Hz, mono, s16, 512 kb/s
 Successfully opened the file.
 Parsing a group of options: output url /dev/null.
 Applying option codec:a (codec name) with argument libopus.
 Applying option f (force format) with argument ogg.
 Successfully parsed a group of options.
 Opening an output file: /dev/null.
 [file @ 0x5647ff2f5e40] Setting default whitelist 'file,crypto'
 Successfully opened the file.
 Stream mapping:
   Stream #0:0 -> #0:0 (pcm_s16le (native) -> opus (libopus))
 cur_dts is invalid (this is harmless if it occurs once at the start per
 stream)
 detected 8 logical cores
 [graph_0_in_0_0 @ 0x5647ff320c80] Setting 'time_base' to value '1/32000'
 [graph_0_in_0_0 @ 0x5647ff320c80] Setting 'sample_rate' to value '32000'
 [graph_0_in_0_0 @ 0x5647ff320c80] Setting 'sample_fmt' to value 's16'
 [graph_0_in_0_0 @ 0x5647ff320c80] Setting 'channel_layout' to value '0x4'
 [graph_0_in_0_0 @ 0x5647ff320c80] tb:1/32000 samplefmt:s16
 samplerate:32000 chlayout:0x4
 [format_out_0_0 @ 0x5647ff320f40] Setting 'sample_fmts' to value 's16|flt'
 [format_out_0_0 @ 0x5647ff320f40] Setting 'sample_rates' to value
 '48000|24000|16000|12000|8000'
 [format_out_0_0 @ 0x5647ff320f40] auto-inserting filter 'auto_resampler_0'
 between the filter 'Parsed_anull_0' and the filter 'format_out_0_0'
 [AVFilterGraph @ 0x5647ff2f7400] query_formats: 4 queried, 6 merged, 3
 already done, 0 delayed
 [auto_resampler_0 @ 0x5647ff324580] [SWR @ 0x5647ff324a80] Using s16p
 internally between filters
 [auto_resampler_0 @ 0x5647ff324580] ch:1 chl:mono fmt:s16 r:32000Hz ->
 ch:1 chl:mono fmt:s16 r:24000Hz
 [libopus @ 0x5647ff2f5700] No bit rate set. Defaulting to 64000 bps.
 Output #0, ogg, to '/dev/null':
   Metadata:
     encoder         : Lavf58.9.100
     Stream #0:0, 0, 1/48000: Audio: opus (libopus), 24000 Hz, mono, s16,
 delay 156, 64 kb/s
     Metadata:
       encoder         : Lavc58.11.101 libopus
 [Parsed_sine_0 @ 0x55a947218500] EOF timestamp not reliable
 size=     313kB time=00:00:05.00 bitrate= 512.1kbits/s speed= 107x
 video:0kB audio:312kB subtitle:0kB other streams:0kB global headers:0kB
 muxing overhead: 0.024375%
 [out_0_0 @ 0x5647ff321cc0] EOF on sink link out_0_0:default.
 No more output streams to write to, finishing.
 [libopus @ 0x5647ff2f5700] Trying to remove 324 more samples than there
 are in the queue
 size=      60kB time=00:00:05.01 bitrate=  98.0kbits/s speed= 143x
 video:0kB audio:59kB subtitle:0kB other streams:0kB global headers:0kB
 muxing overhead: 1.050050%
 Input file #0 (pipe:):
   Input stream #0:0 (audio): 79 packets read (320000 bytes); 79 frames
 decoded (160000 samples);
   Total: 79 packets (320000 bytes) demuxed
 Output file #0 (/dev/null):
   Output stream #0:0 (audio): 250 frames encoded (120000 samples); 251
 packets muxed (60759 bytes);
   Total: 251 packets (60759 bytes) muxed
 79 frames successfully decoded, 0 decoding errors
 [AVIOContext @ 0x5647ff2f60c0] Statistics: 0 seeks, 8 writeouts
 [AVIOContext @ 0x5647ff2db340] Statistics: 320078 bytes read, 0 seeks
 }}}
 The relevant line in the output is:
 {{{
 [auto_resampler_0 @ 0x55e656ac6a80] ch:1 chl:mono fmt:s16 r:32000Hz ->
 ch:1 chl:mono fmt:s16 r:24000Hz
 }}}
 This indicates that the 32 kHz input was downsampled to 24 kHz.
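
 Because this line only appears at a verbose log level, one way to notice
 the problem is to scan a captured log for it. The sketch below assumes the
 `auto_resampler_N` message keeps the `r:<rate>Hz -> ... r:<rate>Hz` shape
 shown above; the helper name is hypothetical.

```python
import re

# Matches the auto-inserted resampler message and captures the input and
# output rates. re.S lets the match span a line wrapped by a mail client.
RESAMPLE_RE = re.compile(
    r"\[auto_resampler_\d+ @ [^\]]+\].*?r:(\d+)Hz\s*->.*?r:(\d+)Hz",
    re.S,
)


def find_downsampling(log_text):
    """Return (input_rate, output_rate) pairs where the output rate is lower."""
    pairs = [(int(src), int(dst)) for src, dst in RESAMPLE_RE.findall(log_text)]
    return [(src, dst) for src, dst in pairs if dst < src]


if __name__ == "__main__":
    sample = (
        "[auto_resampler_0 @ 0x55e656ac6a80] ch:1 chl:mono fmt:s16 r:32000Hz ->\n"
        " ch:1 chl:mono fmt:s16 r:24000Hz\n"
    )
    print(find_downsampling(sample))
```

 Applied to the excerpt above, the function reports one pair, 32000 to
 24000.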

 This can be worked around by manually specifying `-af aresample=48000`.
 However, ffmpeg gives no indication that any downsampling was performed
 unless verbose logging is enabled.
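
 A script that wraps ffmpeg could apply this workaround automatically by
 always passing an explicit `aresample` rate that is not below the input
 rate. The following is a minimal sketch under that assumption; the helper
 name and file names are hypothetical, and the command is only built, not
 executed.

```python
# Rates libopus accepts according to the log in this ticket, ascending.
OPUS_RATES = (8000, 12000, 16000, 24000, 48000)


def opus_encode_args(src, dst, input_rate):
    """Build an ffmpeg argument list that resamples explicitly before libopus.

    The target is the lowest supported rate that is not below input_rate,
    or 48000 when the input rate is higher than every supported rate.
    """
    target = next((rate for rate in OPUS_RATES if rate >= input_rate), OPUS_RATES[-1])
    return [
        "ffmpeg", "-i", src,
        "-af", f"aresample={target}",
        "-codec:a", "libopus",
        dst,
    ]


if __name__ == "__main__":
    print(" ".join(opus_encode_args("input.wav", "output.ogg", 32000)))
```

 The resulting list can be passed to `subprocess.run` once the input rate
 is known.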

--
Ticket URL: <https://trac.ffmpeg.org/ticket/7028>
FFmpeg <https://ffmpeg.org>
FFmpeg issue tracker

