[FFmpeg-user] volume normalization: loudnorm versus replaygain; which is better?

Wed Nov 9 11:17:40 EET 2022

On 8/2/22, James Ralston <ralston at pobox.com> wrote:
> When I record videos with my phone (a Google Pixel 6), the audio track
> of those videos tends to be too quiet, especially if I recorded in a
> quiet environment.  I want to normalize the volume of those videos
> when I play them.
>
> I see basically two ways to do this:
>
> 1.  Rewrite the audio track of the video to normalize its loudness.
>
> 2.  Calculate the ReplayGain of the audio track of the video and embed
>     the ReplayGain information as metadata, so devices that play back
>     the video will find the ReplayGain tags and normalize the volume
>     for playback.
>
> Accomplishing #1 with ffmpeg is easy:
>
> $ ffmpeg -i too-quiet.mp4 -filter:a loudnorm -vcodec copy better.mp4

This is a completely valid way to ruin audio permanently.

The only useful way to run the loudnorm filter is to supply it with
parameters that were measured before filtering.
That can only be done with two-pass processing.
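
A rough sketch of the two-pass workflow follows. The target values
(I=-16, TP=-1.5, LRA=11), the helper name loudnorm_pass2_filter and
the -ar 48000 output rate are illustrative assumptions, not something
taken from this thread. The measured_* values and the offset have to
be copied by hand from the JSON that the first pass prints.

```shell
#!/bin/sh
# Hypothetical helper: build the second-pass loudnorm filter string
# from the values reported by the first pass.
# Arguments: measured_I measured_TP measured_LRA measured_thresh offset
# The targets I=-16, TP=-1.5, LRA=11 are assumed example values.
loudnorm_pass2_filter() {
    printf 'loudnorm=I=-16:TP=-1.5:LRA=11:measured_I=%s:measured_TP=%s:measured_LRA=%s:measured_thresh=%s:offset=%s:linear=true\n' \
        "$1" "$2" "$3" "$4" "$5"
}

# Pass 1: measure only, nothing is written to disk. The JSON report
# (input_i, input_tp, input_lra, input_thresh, target_offset) goes to stderr:
#   ffmpeg -i too-quiet.mp4 -af loudnorm=I=-16:TP=-1.5:LRA=11:print_format=json -f null -
#
# Pass 2: feed the measured values back in and copy the video stream:
#   ffmpeg -i too-quiet.mp4 \
#       -af "$(loudnorm_pass2_filter INPUT_I INPUT_TP INPUT_LRA INPUT_THRESH TARGET_OFFSET)" \
#       -c:v copy -ar 48000 better.mp4
```

The uppercase placeholders stand for the numbers from the first-pass
report. Check the second-pass report as well: if it says
normalization_type is dynamic, the filter did not stay in linear mode
for the chosen targets.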

>
> But: is this the best approach?
>
> Because in the audio world, the preferred way to handle variable
> loudness of songs (e.g., if you are ripping old CDs) is to apply
> ReplayGain tags.  This leaves the original audio intact, and just
> applies volume correction at playback.  The majority of players
> recognize and obey ReplayGain tags.
>
> To me, it feels like the ReplayGain strategy is better than rewriting
> the original audio, for at least two reasons:
>
> 1. Re-encoding the audio track is going to introduce additional loss.
>
> 2. If, say, subsequent improvements are made to the loudnorm filter,
>    unless I retain a copy of the original (unmodified) video, I can’t
>    go back and re-apply the new-and-improved loudnorm filter, because
>    I destroyed the original audio track the first time I applied it.
>
> If I wanted to use the ReplayGain approach, the calculate part is
> easy:
>
> $ ffmpeg -i too-quiet.mp4 -c:v copy -af replaygain foo.mp4
> …
> [Parsed_replaygain_0 @ 0x55dc42d5c940] track_gain = +45.88 dB
> [Parsed_replaygain_0 @ 0x55dc42d5c940] track_peak = 0.021767
>
> But it’s not clear to me from the ffmpeg documentation how I would
> actually *add* the ReplayGain metadata to the audio track of the video
> once I have calculated it.
>
> And it’s also not clear to me whether this approach would actually
> work in the real world.  If the majority of video playback devices
> (e.g. smartphones) and software (e.g. social media sites) ignore
> ReplayGain metadata tags in the audio tracks of videos, then while
> this approach might be the best from an audio purist perspective, it
> won’t have the practical effect of making a quiet video play back at a
> normalized volume.
>
> I can’t be the only person to have encountered this situation.  Is
> there currently a “best practices” for audio track loudness
> normalization in the video world?  Will using the ReplayGain metadata
> work?  Or should I just give up and use the loudnorm filter for now,
> and save the original videos so that when ReplayGain metadata is
> better supported, I can go back and just add ReplayGain metadata to
> the original videos and discard the versions that I created with the
> loudnorm filter?

Avoid using the loudnorm filter in dynamic mode if you care about the
dynamics of the audio.
In two-pass (linear) mode, loudnorm works like ReplayGain: it just
multiplies every audio sample by the same value.
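
To make "the same value" concrete: a gain in dB corresponds to a
single linear factor of 10^(dB/20), and every sample is scaled by it.
The helper name db_to_linear below is made up for illustration, and
awk is assumed to be available.

```shell
#!/bin/sh
# Hypothetical helper: convert a gain in dB to the linear factor that
# every sample would be multiplied by, using factor = 10^(dB/20).
db_to_linear() {
    awk -v db="$1" 'BEGIN { printf "%.4f\n", exp(db * log(10) / 20) }'
}

# A constant gain like this can also be applied directly with the
# volume filter, for example:
#   ffmpeg -i too-quiet.mp4 -af volume=6dB -c:v copy louder.mp4
```

For the track_gain of +45.88 dB quoted above, db_to_linear +45.88
gives a factor of roughly 197. The relative dynamics stay intact
because quiet and loud samples are scaled alike.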

If you do not care about audio dynamics and just want quiet parts to
become louder and loud parts to become quieter, then FFmpeg has more
sophisticated and far more efficient filters for that.

>
> Thanks in advance for any advice!