[FFmpeg-devel] [PATCH] RealAudio 14.4K encoder

Francesco Lavra francescolavra
Sun May 16 16:46:37 CEST 2010


On Mon, 2010-05-10 at 12:16 +0200, Michael Niedermayer wrote:
> > > > +/**
> > > > + * Calculates match score and gain of an LPC-filtered vector with respect to
> > > > +   input data
> > > > + * @param block array used to calculate the filtered vector
> > > > + * @param coefs coefficients of the LPC filter
> > > > + * @param vect original vector
> > > > + * @param data input data
> > > > + * @param score pointer to variable where match score is returned
> > > > + * @param gain pointer to variable where gain is returned
> > > > + */
> > > > +static void get_match_score(int16_t *block, const int16_t *coefs,
> > > > +                            const int16_t *vect, const int16_t *data,
> > > > +                            float *score, int *gain)
> > > > +{
> > > > +    float c, g;
> > > > +    int i;
> > > > +
> > > > +    if (ff_celp_lp_synthesis_filter(block, coefs, vect, BLOCKSIZE, LPC_ORDER, 1,
> > > > +                                    0x800)) {
> > > > +        *score = 0;
> > > > +        return;
> > > > +    }
> > > > +    c = g = 0;
> > > > +    for (i = 0; i < BLOCKSIZE; i++) {
> > > > +        g += block[i] * block[i];
> > > > +        c += data[i] * block[i];
> > > > +    }
> > > > +    if (!g || (c <= 0)) {
> > > 
> > > the !g check is redundant
> > 
> > Why? If a codebook vector gets zeroed by the LPC filter, g will be zero,
> > and we don't want the match score to be NaN.
> 
> g can only be 0 if all block[i] are 0
> if all block[i] are 0 then c is 0

Right, fixed.
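
To spell out the argument, here is a minimal standalone sketch, not the code from the patch. The helper name match_score and the score formula c * c / g are assumptions made for illustration. The point is that g == 0 forces every block[i] to be 0, which forces c == 0, so the single test c <= 0 already guards the division.

```c
#include <stdint.h>

/* Hypothetical helper (not from the patch): score of a filtered vector
 * `block` against the input `data`, both of length n. */
static float match_score(const int16_t *block, const int16_t *data, int n)
{
    float c = 0, g = 0;

    for (int i = 0; i < n; i++) {
        g += block[i] * block[i]; /* energy of the filtered vector */
        c += data[i] * block[i];  /* correlation with the input */
    }
    /* g == 0 implies all block[i] == 0, hence c == 0, so this one test
     * also prevents the division by zero below. */
    if (c <= 0)
        return 0;
    return c * c / g; /* assumed score formula, for illustration only */
}
```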

> > > > +
> > > > +
> > > > +/**
> > > > + * Performs gain quantization
> > > > + * @param block array used to calculate filtered vectors
> > > > + * @param lpc_coefs coefficients of the LPC filter
> > > > + * @param cba_vect vector containing the best entry from the adaptive codebook,
> > > > + *                 or NULL if the adaptive codebook is not used
> > > > + * @param cb1_idx index of the best entry of the first fixed codebook
> > > > + * @param cb2_idx index of the best entry of the second fixed codebook
> > > > + * @param rms RMS of the reflection coefficients
> > > > + * @param data input data
> > > > + * @return index of the best entry of the gain table
> > > > + */
> > > > +static int quantize_gains(int16_t *block, const int16_t *lpc_coefs,
> > > > +                          const int16_t *cba_vect, int cb1_idx, int cb2_idx,
> > > > +                          unsigned int rms, const int16_t* data)
> > > > +{
> > > > +    float distance, best_distance;
> > > > +    int i, n, index;
> > > > +    unsigned int m[3];
> > > > +    int16_t exc[BLOCKSIZE]; /**< excitation vector */
> > > > +
> > > > +    if (cba_vect)
> > > > +        m[0] = (irms(cba_vect) * rms) >> 12;
> > > > +    m[1] = (cb1_base[cb1_idx] * rms) >> 8;
> > > > +    m[2] = (cb2_base[cb2_idx] * rms) >> 8;
> > > > +    best_distance = -1;
> > > 
> > > FLOAT_MAX
> > 
> > If you meant MAXFLOAT, fixed.
> 
> i meant FLT_MAX as in the C spec

Fixed.
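
For reference, a minimal sketch of a minimum search seeded with FLT_MAX from <float.h>. The helper best_entry is hypothetical and not part of the patch. Seeding with FLT_MAX removes the need for the extra (best_distance < 0) test, and giving index an initial value keeps the result defined even when no entry is accepted.

```c
#include <float.h>

/* Hypothetical helper (not from the patch): index of the smallest
 * distance among n candidates. */
static int best_entry(const float *distance, int n)
{
    float best_distance = FLT_MAX; /* largest finite float value */
    int index = 0;                 /* defined even if nothing beats FLT_MAX */

    for (int i = 0; i < n; i++) {
        if (distance[i] < best_distance) {
            best_distance = distance[i];
            index = i;
        }
    }
    return index;
}
```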
 
> > > > +    for (n = 0; n < 256; n++) {
> > > > +        distance = 0;
> > > > +        add_wav(exc, n, (int)cba_vect, m, cba_vect, cb1_vects[cb1_idx],
> > > > +                cb2_vects[cb2_idx]);
> > > > +        if (ff_celp_lp_synthesis_filter(block, lpc_coefs, exc, BLOCKSIZE,
> > > > +                                        LPC_ORDER, 1, 0xfff))
> > > > +            continue;
> > > > +        for (i = 0; i < BLOCKSIZE; i++)
> > > > +            distance += (block[i] - data[i]) * (block[i] - data[i]);
> > > > +        if ((distance < best_distance) || (best_distance < 0)) {
> > > > +            best_distance = distance;
> > > > +            index = n;
> > > > +        }
> > > > +    }
> > > 
> > > id guess this could be done faster than by brute force
> > 
> > I can't think of any algorithm which avoids searching the entire table
> > without risking to miss the optimal entry;
> 
> if you ignore rounding then add_wav does a weighted sum of 3 vectors
> x*A + y*B + z*C
> ff_celp_lp_synthesis_filter performs a linear transformation on the result
> (that could be viewed as a matrix multiplication)
> M*(x*A + y*B + z*C)
> and it adds the past samples padded with zero and filtered similarly
> M*(x*A + y*B + z*C) + P
> 
> this can also be written as: (it's just the distributive law)
> x*M*A + y*M*B + z*M*C + P
> 
> that is to do something like ff_celp_lp_synthesis_filter() 
> 1. on the past samples padded with zeros
> 2. on all the 3 code book vectors
> but note this ff_celp_lp_synthesis_filter() must not clip values, so maybe
> the float counterpart (ff_celp_lp_synthesis_filterf) would be easiest here
> 
> after that you only have to find the best x,y,z from the table
> that minimize that (this can be done fast as well but lets look into this
> once this optimization succeeded (or failed))

Done as you suggested, using ff_celp_lp_synthesis_filterf().
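
The idea can be sketched as follows. This is a simplified illustration, not the attached code: synth_filter is a stand-in for an unclipped float synthesis filter with zero initial state (it is not ff_celp_lp_synthesis_filterf(), whose calling convention differs), and N, ORDER and gain_error are names made up for the example. Each codebook vector is filtered once, and the error for any gain triple (x, y, z) is then computed from the prefiltered vectors fa, fb, fc plus the contribution p of the past samples, without running the filter again.

```c
#define N     8 /* illustrative block length, not the codec's BLOCKSIZE */
#define ORDER 2 /* illustrative filter order, not LPC_ORDER */

/* Stand-in all-pole filter with zero initial state (an assumption):
 * out[i] = in[i] - sum over k of coefs[k-1] * out[i-k] */
static void synth_filter(float *out, const float *coefs, const float *in)
{
    for (int i = 0; i < N; i++) {
        float v = in[i];
        for (int k = 1; k <= ORDER && k <= i; k++)
            v -= coefs[k - 1] * out[i - k];
        out[i] = v;
    }
}

/* Squared error of x*fa + y*fb + z*fc + p against the target, where
 * fa, fb, fc are the already filtered codebook vectors and p is the
 * filtered contribution of the past samples. */
static float gain_error(const float *fa, const float *fb, const float *fc,
                        const float *p, const float *target,
                        float x, float y, float z)
{
    float err = 0;

    for (int i = 0; i < N; i++) {
        float d = x * fa[i] + y * fb[i] + z * fc[i] + p[i] - target[i];
        err += d * d;
    }
    return err;
}
```

With this arrangement the filter runs four times per block (three vectors plus the past samples) instead of once per gain table entry.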

> Also once you found the best x,y,z with unclipped floats, it makes
> sense to run something like the current brute force loop on all
> entries that scored well in the optimized case. So we do not skip
> considering rounding

In the attached implementation, the five best gain entries are found
with floating point data, and for each of these entries the brute force
method calculates the actual encoding error. Five is a tweakable
parameter.
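
The candidate selection can be sketched roughly like this. It is an illustration under assumptions, not the attached code: NUM_BEST and select_best are hypothetical names, and err[] stands for the floating point errors computed for each gain table entry. The indices returned would then be fed to the exact, clipping-aware brute force loop.

```c
#define NUM_BEST 5 /* the tweakable parameter; the name is hypothetical */

/* Hypothetical helper: store in best_idx the indices of the NUM_BEST
 * smallest values of err[0..n-1], in ascending order of error.
 * Returns the number of indices stored (at most NUM_BEST). */
static int select_best(const float *err, int n, int *best_idx)
{
    float best_err[NUM_BEST];
    int count = 0;

    for (int i = 0; i < n; i++) {
        int pos = count;

        /* find where err[i] belongs in the sorted list */
        while (pos > 0 && err[i] < best_err[pos - 1])
            pos--;
        if (pos >= NUM_BEST)
            continue; /* worse than every kept candidate */
        if (count < NUM_BEST)
            count++;
        /* shift worse candidates down, dropping the last if full */
        for (int j = count - 1; j > pos; j--) {
            best_err[j] = best_err[j - 1];
            best_idx[j] = best_idx[j - 1];
        }
        best_err[pos] = err[i];
        best_idx[pos] = i;
    }
    return count;
}
```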

> > > > +    /**
> > > > +     * TODO: orthogonalize the best entry of the adaptive codebook with the
> > > > +     * basis vectors of the first fixed codebook, and the best entry of the
> > > > +     * first fixed codebook with the basis vectors of the second fixed codebook.
> > > > +     */
> > > 
> > > yes, also shouldnt the search be iterative instead of just one pass?
> > 
> > I tried inserting several iteration runs to find the optimal entries of
> > the fixed codebooks, but the entries found on the second and subsequent
> > iterations rarely differ from the first choices, and in any case I
> > couldn't hear any improvement in quality, so the iterative method
> > doesn't seem to bring any added value.
> 
> did you orthogonalize the entries?

I didn't, but now I have. No iterative search is performed, since the
algorithm described at http://focus.ti.com/lit/an/spra136/spra136.pdf
does not mention multiple iterations.
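
For readers unfamiliar with the term, orthogonalizing one vector against another is a single Gram-Schmidt step, sketched below. This is a generic illustration with a hypothetical helper name and float data. It is not taken from the attached patch, which may differ in data types and in which vectors are involved.

```c
/* Hypothetical helper: remove from v its projection onto u, so that
 * afterwards the dot product of v and u is zero (up to rounding). */
static void orthogonalize(float *v, const float *u, int n)
{
    float num = 0, den = 0;

    for (int i = 0; i < n; i++) {
        num += v[i] * u[i]; /* <v, u> */
        den += u[i] * u[i]; /* <u, u> */
    }
    if (den == 0)
        return; /* u is all zeros: nothing to remove */
    for (int i = 0; i < n; i++)
        v[i] -= num / den * u[i];
}
```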

-------------- next part --------------
A non-text attachment was scrubbed...
Name: 05_ra144enc.patch
Type: text/x-patch
Size: 23208 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20100516/bea22c8b/attachment.bin>


