[FFmpeg-user] Preserving perceived loudness when downmixing audio from 5.1 AC3 to stereo AAC

Mon Aug 5 12:58:08 CEST 2013

> -----Original Message-----
> From: Andy Furniss [mailto:adf.lists at gmail.com]
> Sent: 04 August 2013 22:11
> To: FFmpeg user questions
> Cc: Francois Visagie
> Subject: Re: [FFmpeg-user] Preserving perceived loudness when
> downmixing audio from 5.1 AC3 to stereo AAC
> 
> Francois Visagie wrote:
> 
> > I don't think this is the desired outcome, to achieve unified audio
> > processing but with significant reduction in perceived volume in the
> > process. Is there any further testing I can do to help investigate and
> > hopefully improve this situation?
> 
> Down mixing is not ideal. Lower volume is expected.
> 
> If you don't want the result to clip then you have to normalise enough to
> allow for all input channels being full level - even though that may be
> rare in practice.
> 
> As you are dealing with ac3

Therein lies part of the problem: not all input files are AC3. Up to at
least 30 June, -filter:a aformat=channel_layouts=stereo could be used in a
standard command line to produce stereo from multi-channel inputs, with input
and output volumes perceptibly equal. Now each encode needs to be inspected
individually for input/output differences, and the remedy will differ in each
case according to input type and/or volume difference. That is really
sub-optimal in my view, a view I expect to be more widely shared once
these implications are more widely understood.

I sincerely appreciate the trouble you took in outlining the various
principles involved but, on a more practical level: rather than making
-filter:a aformat=channel_layouts=stereo now share the mechanism of -ac 2
and -filter:a aresample=ocl=3 (incorrectly so with respect to volume levels
in my view), what is the feasibility of making the other two behave the way
-filter:a aformat=channel_layouts=stereo used to instead?
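
To put a rough number on the normalisation Andy describes, here is a minimal
Python sketch. It assumes a common ITU-style matrix (front at 1.0, centre and
surround at about 0.707, LFE dropped) and assumes the matrix is scaled so that
all contributing channels at full level just reach full scale. The
coefficients and function names are illustrative assumptions on my part, not
something read out of the ffmpeg source.

```python
import math


def normalised_downmix_gains(front=1.0, centre=math.sqrt(0.5), surround=math.sqrt(0.5)):
    """Scale the per-channel gains so that their worst-case sum is 1.0.

    The default coefficients are an assumed ITU-style matrix, not values
    confirmed from any particular ffmpeg build.
    """
    total = front + centre + surround
    return {
        "front": front / total,
        "centre": centre / total,
        "surround": surround / total,
    }


def to_db(gain):
    """Convert a linear amplitude gain to decibels."""
    return 20.0 * math.log10(gain)


gains = normalised_downmix_gains()
# Level change of the front channel after clip-proof normalisation.
front_drop_db = to_db(gains["front"])
print(f"front channel gain: {gains['front']:.3f} ({front_drop_db:.2f} dB)")
```

Under those assumptions the front channels end up roughly 7.7 dB quieter than
in the source, which would be enough to account for a clearly audible drop in
volume even though full level on every channel at once is rare.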

> you should also be aware that a studio produced
> stream will usually carry metadata specifying dynamic range control and by
> default ffmpeg applies this fully so you get the restricted-range output.
> Just turning this off with
> 
> -drc_scale 0
> 
> will however produce something that IMHO has too much range - it's not
> actually on/off, it's a float, so you can use anything between 0 and 1.
> 
> ac3 also has metadata for downmix so using -request_channels 2 should in
> theory get a downmix as the studio intended - but of course it's still
> normalised to prevent clipping and it's still compromised by the fact it's
> a downmix.
> 
> I don't know if there's a way with ffmpeg to analyse and adjust the whole
> downmixed track to boost the levels to take advantage of any headroom.
> 
> You can certainly do it with sox (working on pcm)
> 
> Slight digression - last time I looked at the dts/dca codec's default
> downmix it was a clipping, stereo-squashing mess - nice and loud though,
> but really, I don't think it's OK or anything to aim for should you ever
> happen to test.
> 
> Not much use for your need, but as for the future - the nice thing about
> Dolby Truehd and DTS MA is that they mix up, so a stereo user like me
> should get an "artistic" studio mix rather than something kludged down.
> 
> AFAIK dts ma isn't supported yet; truehd is (but no drc support) - and
> this is just decode.
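
On the point about boosting the downmixed track into any spare headroom: the
idea can be sketched in a few lines of Python working on decoded PCM samples
held as floats in the range -1.0 to 1.0. This is only an illustration of
simple peak normalisation, not what sox or ffmpeg do internally; the function
names and the target_peak parameter are made up for the example.

```python
def headroom_gain(samples, target_peak=1.0):
    """Return the linear gain that raises the loudest sample to target_peak.

    Needs the whole track up front, since the peak is only known after a
    full pass. Silent or empty input gets a gain of 1.0 (no change).
    """
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return 1.0
    return target_peak / peak


def apply_gain(samples, gain):
    """Scale every sample by the same linear gain."""
    return [s * gain for s in samples]


# Hypothetical downmixed samples that only reach a quarter of full scale.
downmixed = [0.1, -0.25, 0.2]
gain = headroom_gain(downmixed)
boosted = apply_gain(downmixed, gain)
print(gain, boosted)
```

This takes two passes over the track (one to find the peak, one to apply the
gain), and a peak-based gain says little about perceived loudness, so it is at
best a partial remedy.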