[FFmpeg-user] 5% of audio samples missing when capturing audio on a mac

Edward Park kumowoon1025 at gmail.com
Tue Sep 22 18:29:31 EEST 2020


>> 48000 is certainly a much nicer number when you compare it with the common video framerates (24, 30/1.001, etc. all divide cleanly)
> Can you explain this? I'm trying to get (30/1.001) or the rounded 29.97 to divide 48k cleanly or be a clean ratio but I don't see it. Maybe that with 30/1.001 it's got a denominator of 5, which is pretty small?

Compared to 44.1kHz? 48kHz is 48000 samples per second, and 29.97 (30/1.001) fps is, obviously, 30000/1001 (≈29.97) frames per second - flip that around and you get 1001/30000 seconds duration for each frame.

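For each frame there are 1601.6 (1600 × 1.001) samples. For 59.94fps it’s 800.8, and for film at 23.976fps, 2002 per frame. The 1.001 factor might seem a bit ugly, but that’s kind of why 48 whole kilohertz works much better: 1601.6 is 8008/5, so the samples line up with a frame boundary every 5 frames.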
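To check the per-frame sample counts, here is a rough sketch using exact fractions. The names (SAMPLE_RATE, FRAME_RATES, samples_per_frame) are just made up for illustration, and the frame rates are taken as the nominal 24/1.001, 30/1.001 and 60/1.001.

```python
from fractions import Fraction

SAMPLE_RATE = 48000  # Hz

# Nominal frame rates as exact fractions (frames per second).
FRAME_RATES = {
    "23.976": Fraction(24000, 1001),
    "29.97": Fraction(30000, 1001),
    "59.94": Fraction(60000, 1001),
}

def samples_per_frame(sample_rate, fps):
    """Exact number of audio samples spanning one video frame."""
    return Fraction(sample_rate) / fps

for name, fps in FRAME_RATES.items():
    spf = samples_per_frame(SAMPLE_RATE, fps)
    # The reduced denominator is the smallest number of frames
    # that holds a whole number of samples.
    print(f"{name} fps: {float(spf)} samples/frame, "
          f"{spf.numerator} samples every {spf.denominator} frame(s)")
```

The denominators come out as 1 or 5, so a whole number of samples fits in at most 5 frames.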

If you think about an MPEG-TS system clock timebase of 1/90000 for example, common video or film framerates generally come out to an integer number of 1/90000 second “ticks.” A 29.97fps frame is 3003 “ticks”, which also matches the duration of 1601.6 samples. The fractions of samples might make it look like the ratio is not easy to work with, but at 48kHz, one sample has a duration of 1.875 “ticks”, or 15/8 = 30/16.
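The tick arithmetic can be verified the same way. This is only a sketch, assuming a 90kHz clock and 29.97fps taken as exactly 30000/1001; the variable names are mine.

```python
from fractions import Fraction

CLOCK_HZ = 90000             # 90 kHz timebase, i.e. 1/90000 second ticks
SAMPLE_RATE = 48000          # Hz
FPS = Fraction(30000, 1001)  # 29.97 fps as an exact fraction

ticks_per_frame = Fraction(CLOCK_HZ) / FPS              # 3003
ticks_per_sample = Fraction(CLOCK_HZ, SAMPLE_RATE)      # 15/8 = 1.875
samples_per_frame = ticks_per_frame / ticks_per_sample  # 8008/5 = 1601.6

print(ticks_per_frame, ticks_per_sample, samples_per_frame)
```

So 8 samples take exactly 15 ticks, and one frame is a whole 3003 ticks.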

If you replace 48000 with 44100, the numbers aren’t as nice. (They are still always rational, since everything here is a ratio of integers, but the denominators get larger.)

I might be making up the history behind it, but 44.1kHz was basically just workable, with 20kHz assumed to be the “bandwidth” limit of sound intended for people to hear, 40kHz would be needed to encode sound signals that dense, and the extra 4.1kHz would help get rid of artifacts due to aliasing - and probably the biggest factor was the CD. I’m sure they could have pressed much more density into the medium, but the laser tech that was commercially viable at the time to put in players for the general consumer sort of made 44.1kHz a decent detent in the sampling frequency dial in an imaginary sample rate-to-cost estimating machine.

If you actually do the calculations with 44.1kHz, the ratios you get aren’t *too* bad; instead of numbers like 2^3 or 3×5, you get something like 49 or 3×49 (one sample is 100/49 “ticks”, and 44100:48000 reduces to 147:160).
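For a side-by-side comparison of the two sample rates, something like the following works. Again just a sketch, assuming the 90kHz clock and 29.97fps as 30000/1001; the helper name ratios is hypothetical.

```python
from fractions import Fraction

CLOCK_HZ = 90000             # 90 kHz timebase
FPS = Fraction(30000, 1001)  # 29.97 fps as an exact fraction

def ratios(sample_rate):
    """Exact ticks per sample and samples per frame for a sample rate."""
    return {
        "ticks_per_sample": Fraction(CLOCK_HZ, sample_rate),
        "samples_per_frame": Fraction(sample_rate) / FPS,
    }

for rate in (48000, 44100):
    r = ratios(rate)
    print(rate, r["ticks_per_sample"], r["samples_per_frame"])

# Ratio between the two sample rates, reduced to lowest terms.
print(Fraction(44100, 48000))
```

At 44100 a sample is 100/49 ticks and a frame is 147147/100 samples, so it takes 100 frames to land on a whole sample instead of 5.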

Ted Park
