[Ffmpeg-devel] FLAC encoder

Michael Niedermayer michaelni
Thu Jun 1 11:38:54 CEST 2006


Hi

On Thu, Jun 01, 2006 at 02:48:34AM -0400, Justin Ruggles wrote:
> Justin Ruggles wrote:
> > Weak point: I have found a few very noisy samples that do not compress
> > quite as well as with libFLAC.  This may have to do with the fact that
> > my encoder does not use verbatim encoding mode at all...I'm not 100%
> > sure though.
> 
> Hi,
> 
> Well, the FLAC encoder is getting better by the day.  I'll post some
> results some time soon of my most recent version, which has marked
> improvement in compression.  The selection of optimal rice parameters
> and partition order has nearly ceased being a bottleneck.  Using the
> fastest vs. slowest version only made a marginal speed difference and
> hurt compression quality slightly (probably just precision error), so
> I've redirected my concern to improving the LPC optimization algorithms
> and other things.
> 
> One area I've been trying to improve upon is the weak point I quoted
> above.  I was really annoyed and challenged :) when I found a website
> devoted to collecting audio samples that are notoriously difficult to
> encode.  libFLAC beat my encoder hands down in almost every single test!
>  So my challenge has been to match or at least get close to libFLAC on
> these "problem" type samples.  Most of them have complex layered sounds
> and high dynamic range or they have noisy types of sounds like cymbals
> or distortion guitar.
> 
> I've been doing lots of research and even more testing.  I am writing
> though because I need some advice from some people who know much more
> about audio encoding than I do (and what better place to ask?).  My math
> background isn't what I wish it was.  I'm just learning as I go and
> reading as much as I can.
> 
> So, one thing I've found that could really help the compression quality
> is variable block sizes.  Using different block sizes in a single stream
> is not valid "subset" FLAC, but decoding is supported in both libFLAC
> and FFmpeg, so I see no reason to ignore this potentially very
> beneficial feature.  When I manually adjust the static block size I can
> get drastically better results.  I can only imagine what being able to
> adapt it during encoding could do.

ok, but genrating such variable block streams should be optional as some
decoders might not support it


> 
> A large block size seems to be ideal for most general audio sources, but
> the more difficult samples encode better with smaller block sizes.  From
> what I gather, doing some sort of transient detection on the block and
> then splitting it into 2 smaller blocks if warranted (and doing this
> recursively) could greatly increase the compression.  Does anyone know
> what might be some good algorithms to employ?  Or at least a good place
> where I could start my research?

one obvious choice if you want optimal blocksizes is to use the
viterbi algorithm, this can be very slow depending on the block sizes
and other parameters you consider, with a limited set of block sizes though
it should be fine


another non optimal choice which is simpler is a recursive split algo, that 
obviously will be limited to blocksizes which are a power of 2 (2,4,8,16,32,...)
and also limited to some specific recursive split pattern
the split decission can either be done based on a heuristic (like the
transient detection you suggest) or by bruteforce, obviosuly bruteforce will
be slower ...

the bruteforce could be done by some code like:

int build_split_tree(pos, blockorder){
    int noplit_bits, split_bits;
    int blocksize= 1<<blockorder;

    if(blocksize < min_blocksize)
        return 99999999;

    noplit_bits= estimate_the_number_of_bits(pos * blocksize, blocksize);
    split_bits = build_split_tree(2*pos  , blockorder-1);
    split_bits+= build_split_tree(2*pos+1, blockorder-1);

    if(split_bits < no_split_bits){
        split_tree[blockorder][pos]=1;
        return split_bits;
    }else{
        split_tree[blockorder][pos]=0;
        return nosplit_bits;
    }
}

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

In the past you could go to a library and read, borrow or copy any book
Today you'd get arrested for mere telling someone where the library is




More information about the ffmpeg-devel mailing list