[FFmpeg-devel] [PATCH] RealAudio 14.4K encoder

Michael Niedermayer michaelni
Mon May 24 02:53:53 CEST 2010


On Sun, May 23, 2010 at 08:52:30PM +0200, Francesco Lavra wrote:
> On Sun, 2010-05-23 at 00:51 +0200, Michael Niedermayer wrote:
> > On Sat, May 22, 2010 at 07:33:13PM +0200, Francesco Lavra wrote:
> > > > > Floating point, with orthogonalization, with gain quantization done the
> > > > > fast way
> > > > > stddev:  818.14 PSNR: 38.07 bytes:   200320/   200334
> > > > > stddev:  986.48 PSNR: 36.45 bytes:   144000/   144014
> > > > > stddev:  811.68 PSNR: 38.14 bytes:   745280/   745294
> > > > > stddev: 3762.86 PSNR: 24.82 bytes:  5370880/  5370880
> > > > > stddev: 2635.10 PSNR: 27.91 bytes:   814400/   814400
> > > > > stddev: 3647.02 PSNR: 25.09 bytes:   432640/   432640
> > > > > stddev: 2862.79 PSNR: 27.19 bytes:  1741440/  1741440
> > > > 
> > > > some files lose quality by enabling orthogonalization, that's odd but
> > > > possible.
> > > > assuming there is no bug in the orthogonalization then you could try to
> > > > run the quantization with both codebooks found with and without
> > > > orthogonalization and keep the better one, this should never be worse.
> > > > And/or avoid codebook choices that would need quantization factors that
> > > > are far away from available values
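
The "try both and keep the better one" idea above can be sketched as follows. This is only an illustration: the `Candidate` struct, its fields and `pick_best` are hypothetical names, not part of the patch, and the sketch assumes each candidate's error has already been measured after its gains were quantized.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical candidate: one pair of codebook indices plus the squared
 * error measured AFTER its gains were quantized. These names are
 * illustrative and do not come from the patch. */
typedef struct {
    int   cb1_idx;
    int   cb2_idx;
    float quantized_error;
} Candidate;

/* Return the candidate with the lowest post-quantization error. */
static const Candidate *pick_best(const Candidate *cands, size_t n)
{
    const Candidate *best;
    size_t i;

    assert(cands != NULL && n > 0);
    best = &cands[0];
    for (i = 1; i < n; i++)
        if (cands[i].quantized_error < best->quantized_error)
            best = &cands[i];
    return best;
}
```

Because the candidate found without orthogonalization is always in the set, the selected result cannot have a higher error than the non-orthogonalized search alone, at the cost of quantizing the gains once per candidate.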
> > > 
> > > The first 3 files are uncompressed recordings, while the last 4 files
> > > are RealAudio decoded samples, so statistics for the latter probably are
> > > not that meaningful.
> > > If you are wondering why PSNR values are so low for the last 4 files
> > > (ideally, they should approach infinity), the problem is that I couldn't
> > > come up with an exact method of calculating the frame energy (assuming
> > > one exists, because from the current decoder output I'm not sure we can
> > > reconstruct the encoded stream exactly as it was), so having an energy
> > > value different from what it ought to be negatively influences the
> > > codebook searches.
> > 
> > how far away is the correct value from what you choose?
> > (if it's just +-1 maybe a brute force search might be an option)
> 
> I chose the formula to calculate the energy such that in most cases it
> is either the correct value or +-1. But a brute force approach on the
> energy value would be extremely slow: you have to re-encode the whole
> frame as many times as the number of energy values you want to try.
> Also, there are the LPC coefficients, whose values don't correspond
> exactly to those of the original encoded stream, so I don't know how
> much improvement a brute force approach on the energy value could bring.
> Last but not least, yesterday I made some mistakes getting the PSNR
> values, messing up the shift and skip arguments to tiny_psnr: now
> the results are far better :) see below.
> 
> > orthogonalization is a win and should be done of course.
> > the 5 entry quantization needs work, there should be no quality
> > loss. What about 10 or 20 entries?
> 
> Below are the correct results (a bug in the floating point code has been
> fixed too, and PSNR has benefited from that). As you can see, the fast
> gain quantization is as good as the brute force one, so there is no need
> to worry about a mixed approach.
> 
> Fixed point, without orthogonalization, with brute force gain
> quantization
> stddev:  424.27 PSNR: 43.78 bytes:   200000/   200320
> stddev:  263.80 PSNR: 47.90 bytes:   143680/   144000
> stddev:  380.05 PSNR: 44.73 bytes:   744960/   745280
> stddev:  854.26 PSNR: 37.70 bytes:  5370560/  5370880
> stddev:  472.50 PSNR: 42.84 bytes:   814080/   814400
> stddev:  548.55 PSNR: 41.54 bytes:   432320/   432640
> stddev:  428.05 PSNR: 43.70 bytes:  1741120/  1741440
> 
> Floating point, without orthogonalization, with brute force gain
> quantization
> stddev:  422.45 PSNR: 43.81 bytes:   200000/   200320
> stddev:  268.66 PSNR: 47.75 bytes:   143680/   144000
> stddev:  381.76 PSNR: 44.69 bytes:   744960/   745280
> stddev:  851.79 PSNR: 37.72 bytes:  5370560/  5370880
> stddev:  486.95 PSNR: 42.58 bytes:   814080/   814400
> stddev:  568.53 PSNR: 41.23 bytes:   432320/   432640
> stddev:  436.89 PSNR: 43.52 bytes:  1741120/  1741440
> 
> Floating point, with orthogonalization, with brute force gain
> quantization
> stddev:  210.49 PSNR: 49.86 bytes:   200000/   200320
> stddev:  201.69 PSNR: 50.24 bytes:   143680/   144000
> stddev:  200.49 PSNR: 50.29 bytes:   744960/   745280
> stddev:  784.77 PSNR: 38.43 bytes:  5370560/  5370880
> stddev:  422.10 PSNR: 43.82 bytes:   814080/   814400
> stddev:  484.69 PSNR: 42.62 bytes:   432320/   432640
> stddev:  392.32 PSNR: 44.46 bytes:  1741120/  1741440
> 
> Floating point, with orthogonalization, with gain quantization done the
> fast way
> stddev:  210.14 PSNR: 49.88 bytes:   200000/   200320
> stddev:  202.50 PSNR: 50.20 bytes:   143680/   144000
> stddev:  196.30 PSNR: 50.47 bytes:   744960/   745280
> stddev:  786.06 PSNR: 38.42 bytes:  5370560/  5370880
> stddev:  422.29 PSNR: 43.82 bytes:   814080/   814400
> stddev:  495.53 PSNR: 42.43 bytes:   432320/   432640
> stddev:  396.24 PSNR: 44.37 bytes:  1741120/  1741440
> 
> Floating point, with orthogonalization, with gain quantization done
> taking into account the rounding error of the 5 best entries
> stddev:  210.49 PSNR: 49.86 bytes:   200000/   200320
> stddev:  201.69 PSNR: 50.24 bytes:   143680/   144000
> stddev:  200.05 PSNR: 50.31 bytes:   744960/   745280
> stddev:  786.22 PSNR: 38.42 bytes:  5370560/  5370880
> stddev:  419.41 PSNR: 43.88 bytes:   814080/   814400
> stddev:  497.65 PSNR: 42.39 bytes:   432320/   432640
> stddev:  395.23 PSNR: 44.39 bytes:  1741120/  1741440
> 
> I'd say we should go for the fast gain quantization, and attached is
> a cleaned up patch for it, with code duplication removed.

the attached code looks like the float brute force variant


> I still have to try the iterative method, will do that in a few days I
> think.

great


[...]
> +    best_error = FLT_MAX;
> +    gain = 0;
> +    for (n = 0; n < 256; n++) {
> +        g[1] = ((ff_gain_val_tab[n][1] * m[1]) >> ff_gain_exp_tab[n]) *
> +               (1/4096.0);
> +        g[2] = ((ff_gain_val_tab[n][2] * m[2]) >> ff_gain_exp_tab[n]) *
> +               (1/4096.0);
> +        error = 0;
> +        if (cba_idx) {
> +            g[0] = ((ff_gain_val_tab[n][0] * m[0]) >> ff_gain_exp_tab[n]) *
> +                   (1/4096.0);
> +            for (i = 0; i < BLOCKSIZE; i++) {
> +                data[i] = zero[i] + g[0] * cba[i] + g[1] * cb1[i] +
> +                          g[2] * cb2[i];
> +                error += (data[i] - sblock_data[i]) *
> +                         (data[i] - sblock_data[i]);
> +            }
> +        } else {
> +            for (i = 0; i < BLOCKSIZE; i++) {
> +                data[i] = zero[i] + g[1] * cb1[i] + g[2] * cb2[i];
> +                error += (data[i] - sblock_data[i]) *
> +                         (data[i] - sblock_data[i]);
> +            }
> +        }
> +        if (error < best_error) {
> +            best_error = error;
> +            gain = n;
> +        }
> +    }
[...]
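
The quoted loop is an exhaustive search: for each of the 256 gain table entries it rebuilds the block and keeps the entry with the smallest squared error. Below is a minimal self-contained sketch of the same idea. It assumes a small hypothetical table of float gain triples instead of `ff_gain_val_tab` / `ff_gain_exp_tab`, uses an illustrative block size, and treats a NULL vector as "no adaptive codebook" so that both branches share one loop. None of these names come from the patch.

```c
#include <assert.h>
#include <float.h>
#include <stddef.h>

#define SKETCH_BLOCKSIZE 4   /* illustrative; not the codec's BLOCKSIZE */

/* Squared error between target and zero + sum of g[k] * cb[k].
 * A NULL cb[k] means that vector is unused and is skipped. */
static float recon_error(const float *target, const float *zero,
                         const float *const cb[3], const float g[3])
{
    float error = 0.0f;
    int i, k;

    for (i = 0; i < SKETCH_BLOCKSIZE; i++) {
        float v = zero[i];
        for (k = 0; k < 3; k++)
            if (cb[k])
                v += g[k] * cb[k][i];
        error += (v - target[i]) * (v - target[i]);
    }
    return error;
}

/* Try every row of a hypothetical gain table; return the best row index. */
static int search_gain(const float *target, const float *zero,
                       const float *const cb[3],
                       const float (*gain_tab)[3], int n_entries)
{
    float best_error = FLT_MAX;
    int best = 0, n;

    assert(n_entries > 0);
    for (n = 0; n < n_entries; n++) {
        float error = recon_error(target, zero, cb, gain_tab[n]);
        if (error < best_error) {
            best_error = error;
            best = n;
        }
    }
    return best;
}
```

The cost is one full block reconstruction per table entry, which is the work a faster gain quantization would try to avoid.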

-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Republics decline into democracies and democracies degenerate into
despotisms. -- Aristotle