[Ffmpeg-devel] FLAC encoder

Thu Jun 1 08:48:34 CEST 2006

Justin Ruggles wrote:
> Weak point: I have found a few very noisy samples that do not compress
> quite as well as with libFLAC.  This may have to do with the fact that
> my encoder does not use verbatim encoding mode at all...I'm not 100%
> sure though.

Hi,

Well, the FLAC encoder is getting better by the day.  I'll post results
from my most recent version soon; it shows a marked improvement in
compression.  The selection of optimal Rice parameters and partition
order has nearly ceased to be a bottleneck.  Using the fastest version
instead of the slowest made only a marginal speed difference and hurt
compression slightly (probably just precision error), so I've
redirected my attention to improving the LPC optimization algorithms
and other things.
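
To make the Rice parameter and partition order search concrete, here is
a minimal sketch in Python.  It is not the encoder's actual code: the
function names are made up for illustration, the search is plain brute
force, and it ignores the warm-up samples that shorten the first
partition in a real FLAC subframe.  It assumes each partition costs a
4-bit parameter plus the Rice-coded residuals, with parameters 0 to 14.

```python
def zigzag(r):
    """Map a signed residual to unsigned: 0, -1, 1, -2 -> 0, 1, 2, 3."""
    return 2 * r if r >= 0 else -2 * r - 1


def rice_bits(residuals, k):
    """Exact bit count for Rice-coding the residuals with parameter k.

    Each value costs a unary quotient (u >> k), one stop bit, and k
    remainder bits.
    """
    return sum((zigzag(r) >> k) + 1 + k for r in residuals)


def best_rice_param(residuals, max_k=14):
    """Try every k in 0..max_k and return (k, bits) for the cheapest."""
    candidates = ((k, rice_bits(residuals, k)) for k in range(max_k + 1))
    return min(candidates, key=lambda kb: kb[1])


def best_partition_order(residuals, max_order=8):
    """Return (order, bits) minimising the total size (simplified).

    Splits the residuals into 2**order equal partitions and charges
    4 bits per partition for its Rice parameter.
    """
    if not residuals:
        return (0, 0)
    n = len(residuals)
    best = None
    for order in range(max_order + 1):
        parts = 1 << order
        if n % parts:
            break  # partitions must divide the block evenly
        size = n // parts
        bits = sum(4 + best_rice_param(residuals[i:i + size])[1]
                   for i in range(0, n, size))
        if best is None or bits < best[1]:
            best = (order, bits)
    return best
```

A block that is quiet in one half and loud in the other is a simple
case where a higher partition order pays off, since each half gets its
own parameter.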

One area I've been trying to improve upon is the weak point I quoted
above.  I was really annoyed and challenged :) when I found a website
devoted to collecting audio samples that are notoriously difficult to
encode.  libFLAC beat my encoder hands down in almost every single test!
So my challenge has been to match or at least get close to libFLAC on
these "problem" samples.  Most of them have complex, layered sounds and
high dynamic range, or noisy sounds like cymbals or distorted guitar.

I've been doing lots of research and even more testing.  I am writing,
though, because I need advice from people who know much more about
audio encoding than I do (and what better place to ask?).  My math
background isn't what I wish it was.  I'm just learning as I go and
reading as much as I can.

So, one thing I've found that could really help compression is variable
block sizes.  Using different block sizes in a single stream is not
valid "subset" FLAC, but decoding is supported in both libFLAC and
FFmpeg, so I see no reason to ignore this potentially very beneficial
feature.  When I manually adjust the static block size I can get
drastically better results.  I can only imagine what being able to
adapt it during encoding could do.

A large block size seems to be ideal for most general audio sources, but
the more difficult samples encode better with smaller block sizes.  From
what I gather, doing some sort of transient detection on the block and
then splitting it into 2 smaller blocks if warranted (and doing this
recursively) could greatly improve compression.  Does anyone know of
good algorithms for this?  Or at least a good place where I could start
my research?
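
To illustrate the kind of recursive splitting described above, here is
a rough Python sketch.  Everything in it is an assumption made for
illustration: the function names, the use of a simple energy ratio
between the two halves as the "transient detector", and the default
threshold and minimum block size.  It is only meant to show the
recursion, not a tuned or recommended detector.

```python
def energy(samples):
    """Sum of squared sample values."""
    return sum(s * s for s in samples)


def split_blocks(samples, min_size=256, ratio=8.0):
    """Return a list of block lengths that together cover `samples`.

    A block is cut in half when the energies of its two halves differ
    by at least `ratio` (hypothetical threshold), and each half is then
    examined the same way.  Halves smaller than `min_size` are never
    produced.
    """
    n = len(samples)
    half = n // 2
    if half < min_size:
        return [n]
    # +1 keeps the ratio finite when one half is digital silence
    e1 = energy(samples[:half]) + 1
    e2 = energy(samples[half:]) + 1
    if max(e1, e2) / min(e1, e2) < ratio:
        return [n]  # no strong change in level: keep the block whole
    return (split_blocks(samples[:half], min_size, ratio)
            + split_blocks(samples[half:], min_size, ratio))
```

For example, 1536 samples of silence followed by 512 loud samples come
out as blocks of 1024, 512 and 512, while a steady signal stays in one
block.  One known weakness of this sketch is that it only compares the
two halves, so a transient that leaves both halves with similar energy
goes unnoticed.  Comparing the estimated encoded size of the whole block
against the sum of its halves would be another way to decide.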

Thanks,
Justin