[FFmpeg-devel] [PATCH] Common ACELP code & G.729 [6/7] - G.729 postfilter

Mon May 12 00:55:49 CEST 2008

On Sun, May 11, 2008 at 10:01:41PM +0700, Vladimir Voroshilov wrote:
> 2008/5/8 Michael Niedermayer <michaelni at gmx.at>:
> > On Fri, May 02, 2008 at 06:49:43PM +0700, Vladimir Voroshilov wrote:
> >> This patch contains the G.729 postfilter.
> >> It was split out because of its size, to make reviewing easier.
> >> G.729 produces audible speech even without this postfilter.
> >>
> >> --
> >> Regards,
> >> Vladimir Voroshilov mailto:voroshil at gmail.com
> >> JID: voroshil at gmail.com, voroshil at jabber.ru
> >> ICQ: 95587719
> >
> >> diff --git a/libavcodec/g729postfilter.c b/libavcodec/g729postfilter.c
> >> new file mode 100644
> >> index 0000000..b09d463
> >> --- /dev/null
> >> +++ b/libavcodec/g729postfilter.c
> >> @@ -0,0 +1,704 @@
> >> +#include <inttypes.h>
> >> +#include <limits.h>
> >> +
> >> +#include "avcodec.h"
> >> +#include "g729.h"
> >> +#include "acelp_pitch_lag.h"
> >> +#include "g729postfilter.h"
> >> +#include "acelp_math.h"
> >> +#include "acelp_filters.h"
> >> +
> >> +#define FRAC_BITS 15
> >> +#include "mathops.h"
> >> +
> >

> >> +/**
> >> + * formant_pp_factor_num_pow[i] = FORMANT_PP_FACTOR_NUM^i
> >> + */
> >> +static const int16_t formant_pp_factor_num_pow[11]=
> >> +{
> >> +  /* (0.15) */
> >> +  32768, 18022, 9912, 5451, 2998, 1649, 907, 499, 274, 151, 83
> >     ^^^^^
> > doesn't fit in int16_t, it does fit in uint16_t though
> 
> Reduced by 1

hmm, and which is correct? or is it unused?
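
For reference, a small range check. This is only a sketch: the helper names are made up and nothing here is from the patch. 1.0 in (0.15) fixed point is 1 << 15 = 32768, one above INT16_MAX, so the table can hold either 32767 (slightly less than 1.0) in int16_t or the exact value in uint16_t.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers, for illustration only. */
static int fits_int16(int v)
{
    return v >= INT16_MIN && v <= INT16_MAX;   /* -32768 .. 32767 */
}

static int fits_uint16(int v)
{
    return v >= 0 && v <= UINT16_MAX;          /* 0 .. 65535 */
}

/* 1.0 expressed with 15 fractional bits. */
static int q15_one(void)
{
    return 1 << 15;
}
```

Which of the two values the filter code actually expects is the open question above; the sketch only shows the type limits.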

[...]
> >
> >> +    /* Now lp_gn (starting with 10) contains impulse response of A(z/FORMANT_PP_FACTOR_NUM)/A(z/FORMANT_PP_FACTOR_DEN) filter */
> >> +
> >
> >> +    /* 4.2.3, Equation 87, calculate rh(0)  */
> >> +    rh0 = sum_of_squares(lp_gn + 10, 20, 0, 0) << 1;   // Q24 -> Q9
> >
> >> +    temp = 30 - av_log2(rh0);
> >> +    if(temp > 16)
> >> +        rh0 <<= temp - 16;
> >> +    else
> >> +        rh0 >>= 16 - temp;
> >> +
> > [...]
> >> +    if(temp > 16)
> >> +        rh1 <<= temp - 16;
> >> +    else
> >> +        rh1 >>= 16 - temp;
> >
> > looks useless, this just reduces precision
> 
> This shift guarantees that later "(rh1<<15)/rh0" will
> not cause int overflow.

only the >> case can have that effect, the << is still useless
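
A minimal sketch of the point, not code from the patch: ilog2() stands in for av_log2(), both function names are made up, and rh0 > 0 with 0 <= rh1 <= rh0 is assumed. Scaling numerator and denominator up by the same power of two does not change the integer quotient, so only the right shift is needed to keep rh1<<15 inside an int.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for av_log2(): index of the highest set bit, 0 for v <= 1. */
static int ilog2(uint32_t v)
{
    int n = 0;
    while (v >>= 1)
        n++;
    return n;
}

/* As in the patch: shift in both directions so that log2(rh0) becomes 14. */
static int ratio_both_shifts(int rh0, int rh1)
{
    int temp = 30 - ilog2(rh0);
    if (temp > 16) {
        rh0 <<= temp - 16;
        rh1 <<= temp - 16;
    } else {
        rh0 >>= 16 - temp;
        rh1 >>= 16 - temp;
    }
    return (rh1 << 15) / rh0;
}

/* Variant: only scale down large values, leave small ones untouched. */
static int ratio_right_shift_only(int rh0, int rh1)
{
    int temp = 30 - ilog2(rh0);
    if (temp < 16) {
        rh0 >>= 16 - temp;
        rh1 >>= 16 - temp;
    }
    return (rh1 << 15) / rh0;
}
```

For small inputs such as rh0 = 1000, rh1 = 300 both variants return the same value, and for large inputs they run the identical right-shift branch.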

> 
> >
> >
> > [...]
> >> +            if(exp_after - exp_before - 1 > 0)
> >> +                gain <<= exp_after - exp_before - 1;
> >> +            else
> >> +                gain >>= exp_before - exp_after + 1;
> >
> > This stuff is duplicated all over the place and should be in a function
> 
> Fixed.
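
Something along these lines is what was meant. This is a sketch only: the name shift_by() is hypothetical, it is not necessarily what the updated patch uses, and it assumes a non-negative value and |offset| < 31.

```c
#include <assert.h>

/*
 * Hypothetical helper: shift left for a positive offset,
 * shift right for a negative (or zero) one.
 */
static int shift_by(int value, int offset)
{
    return offset > 0 ? value << offset : value >> -offset;
}
```

The quoted if/else would then collapse to gain = shift_by(gain, exp_after - exp_before - 1).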
> 
> 
> > Anyway the whole postfilter looks like it needs some serious cleanup.
> 
> Sure.
> The postfilter is the most complicated part of the entire G.729 decoder,
> and it has not had a good cleanup yet.

well, look at ra144.c if you want to see worse ...
I am actually wondering if it shares some code with the ACELP codecs. It
did look like it does, but it's quite obfuscated, so I could be totally
wrong ...

[...]
> +/**
> + * formant_pp_factor_num_pow[i] = FORMANT_PP_FACTOR_NUM^i
> + */
> +static const int16_t formant_pp_factor_num_pow[11]=
> +{
> +  /* (0.15) */
> +  32767, 18022, 9912, 5451, 2998, 1649, 907, 499, 274, 151, 83
> +};
> +
> +/**
> + * formant_pp_factor_den_pow[i] = FORMANT_PP_FACTOR_DEN^i
> + */
> +static const int16_t formant_pp_factor_den_pow[11]=
> +{
> +  /* (0.15) */
> +  32768, 22938, 16057, 11240, 7868, 5508, 3856, 2699, 1889, 1322, 925
> +};

hmm 32768 ...

[...]
> +/**
> + * \brief Residual signal calculation (4.2.1 of G.729)
> + * \param out [out] (Q0) output data filtered through A(z/FORMANT_PP_FACTOR_NUM)
> + * \param filter_coeffs (Q12) A(z/FORMANT_PP_FACTOR_NUM) filter coefficients
> + * \param in (Q0) input speech data to process
> + * \param subframe_size size of one subframe
> + *
> + * \note in buffer must contain 10 items of previous speech data before top of the buffer
> + * \remark It is safe to pass the same buffer for input and output.
> + */
> +static void g729_residual(
> +        int16_t* out,
> +        const int16_t* filter_coeffs,
> +        const int16_t* in,
> +        int subframe_size)
> +{
> +    int i, n;
> +
> +    /*
> +      4.2.1, Equation 79 Residual signal calculation
> +      ( filtering through A(z/FORMANT_PP_FACTOR_NUM) , one half of short-term filter)
> +    */
> +    for(n=subframe_size-1; n>=0; n--)
> +    {
> +        int sum = 0x800;
> +        for(i=0; i<10; i++)
> +            sum += filter_coeffs[i] * in[n-i-1];
> +
> +        out[n] = in[n] + (sum >> 12);
> +    }
> +}

can this be merged with ff_acelp_lp_synthesis_filter() ?
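
To show why the two look mergeable, here is a textbook-style sketch, which is an assumption and not the actual ff_acelp_lp_synthesis_filter(): all names are made up, the filter order is fixed at 10, and separate input and output buffers are used. The inner loop is the same dot product in both filters; only the sign and the buffer that supplies the history differ.

```c
#include <assert.h>
#include <stdint.h>

#define ORDER 10

/* FIR analysis (residual) filter A(z): history comes from past input. */
static void residual_sketch(int16_t *out, const int16_t *coeffs,
                            const int16_t *in, int len)
{
    for (int n = 0; n < len; n++) {
        int sum = 0x800;                    /* rounding for the >> 12 */
        for (int i = 0; i < ORDER; i++)
            sum += coeffs[i] * in[n - i - 1];
        out[n] = in[n] + (sum >> 12);
    }
}

/* IIR synthesis filter 1/A(z): history comes from past output. */
static void synthesis_sketch(int16_t *out, const int16_t *coeffs,
                             const int16_t *in, int len)
{
    for (int n = 0; n < len; n++) {
        int sum = 0x800;
        for (int i = 0; i < ORDER; i++)
            sum += coeffs[i] * out[n - i - 1];
        out[n] = in[n] - (sum >> 12);
    }
}

/* Filters arbitrary data through A(z), then 1/A(z); counts mismatches. */
static int roundtrip_mismatches(void)
{
    static const int16_t coeffs[ORDER] =
        { -2048, 1024, -512, 256, -128, 64, -32, 16, -8, 4 };   /* Q12 */
    int16_t speech[ORDER + 8], res[ORDER + 8], rec[ORDER + 8];
    int bad = 0;

    for (int n = 0; n < ORDER + 8; n++)
        speech[n] = (int16_t)((n * 37) % 200 - 100);

    residual_sketch(res + ORDER, coeffs, speech + ORDER, 8);

    /* The synthesis filter needs the same history the analysis saw. */
    for (int n = 0; n < ORDER; n++)
        rec[n] = speech[n];
    synthesis_sketch(rec + ORDER, coeffs, res + ORDER, 8);

    for (int n = 0; n < ORDER + 8; n++)
        bad += rec[n] != speech[n];
    return bad;
}
```

With identical history and rounding, the synthesis pass reproduces the input exactly, which suggests a shared inner loop might be possible.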

[...]
> +        corr_int_num = INT_MIN;
> +        best_delay_int = 0;
> +        for(i=pitch_delay_int-1; i<=pitch_delay_int+1; i++)
> +        {
> +            sum = FFMAX(sum_of_squares(sig_scaled - i + RES_PREV_DATA_SIZE, subframe_size, i, 0), 0);
> +            if(sum > corr_int_num)
> +            {
> +                corr_int_num = sum;
> +                best_delay_int = i;
> +            }
> +        }
> +        if(!corr_int_num)
> +            break;

I think the FFMAX is unneeded if corr_int_num= 0 and
best_delay_int= pitch_delay_int-1 are used as the initial values
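
A sketch of the equivalence, assuming the three correlations are already computed (the arguments c0..c2 stand in for the sum_of_squares() results, and all function names are made up). A negative sum can never beat a running maximum that starts at 0, so the clamp changes nothing.

```c
#include <assert.h>
#include <limits.h>

#define FFMAX(a, b) ((a) > (b) ? (a) : (b))

/* As in the patch: clamp every correlation to >= 0, start from INT_MIN. */
static void search_clamped(const int corr[3], int first_delay,
                           int *best_corr, int *best_delay)
{
    *best_corr  = INT_MIN;
    *best_delay = 0;
    for (int i = 0; i < 3; i++) {
        int sum = FFMAX(corr[i], 0);
        if (sum > *best_corr) {
            *best_corr  = sum;
            *best_delay = first_delay + i;
        }
    }
}

/* Suggested variant: no clamp, start from 0 and the first candidate. */
static void search_unclamped(const int corr[3], int first_delay,
                             int *best_corr, int *best_delay)
{
    *best_corr  = 0;
    *best_delay = first_delay;
    for (int i = 0; i < 3; i++) {
        if (corr[i] > *best_corr) {
            *best_corr  = corr[i];
            *best_delay = first_delay + i;
        }
    }
}

/* Returns 1 when both searches pick the same maximum and delay. */
static int searches_agree(int c0, int c1, int c2, int first_delay)
{
    const int corr[3] = { c0, c1, c2 };
    int corr_a, delay_a, corr_b, delay_b;

    search_clamped(corr, first_delay, &corr_a, &delay_a);
    search_unclamped(corr, first_delay, &corr_b, &delay_b);
    return corr_a == corr_b && delay_a == delay_b;
}
```

The all-negative case is the interesting one: both variants end with a maximum of 0 and the first candidate delay, so the following !corr_int_num check behaves the same.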

> +
> +        /*
> +          Compute denominator of pseudo-normalized correlation R'(0)
> +        */
> +        corr_int_den = sum_of_squares(sig_scaled - best_delay_int + RES_PREV_DATA_SIZE, subframe_size, 0, 0);
> +        if (!corr_int_den)
> +            break;
> +
> +        /*
> +          Compute signals with non-integer delay k (with 1/8 precision), where k is in [0;6] range.
> +          Entire delay is equal to best_delay+(k+1)/8
> +          This is achieved by applying an interpolation filter of length 33
> +          to source signal.
> +        */
> +        for(k=0; k<ANALYZED_FRAC_DELAYS; k++)
> +        {
> +            ff_acelp_interpolate_pitch_vector(
> +                    &delayed_signal[k][0],
> +                    &sig_scaled[RES_PREV_DATA_SIZE],
> +                    ff_g729_interp_filt_short,
> +                    ANALYZED_FRAC_DELAYS+1,
> +                    best_delay_int+1,
> +                    -k-1,
> +                    SHORT_INT_FILT_LEN,
> +                    subframe_size+1
> +                    );
> +        }
> +
> +        /*
> +          Compute denominator of pseudo-normalized correlation R'(k)
> +          (4.2.1, Equation 81).
> +
> +             corr_den[k][0] is square root of R'(k) denominator, for int(T) == int(T0)
> +             corr_den[k][1] is square root of R'(k) denominator, for int(T) == int(T0)+1
> +
> +          Compute also maximum value of above denominators over all k.
> +        */
> +        tmp = corr_int_den;
> +        for(k=0; k<ANALYZED_FRAC_DELAYS; k++)
> +        {
> +            sum = sum_of_squares(&delayed_signal[k][1], subframe_size - 1, 0, 0);
> +            corr_den[k][0] = sum + delayed_signal[k][0            ] * delayed_signal[k][0            ];
> +            corr_den[k][1] = sum + delayed_signal[k][subframe_size] * delayed_signal[k][subframe_size];
> +
> +            tmp = FFMAX3(tmp, corr_den[k][0], corr_den[k][1]);
> +        }
> +
> +        sh_gain_den = av_log2(tmp) - 14;
> +        if(sh_gain_den < 0)
> +            break;
> +
> +        sh_gain_num =  FFMAX(sh_gain_den, sh_ener);
> +        /*
> +          Loop through all k and find delay that maximizes R'(k) correlation.
> +          Search is done in [int(T0)-1; int(T0)+1] range with 1/8 precision
> +        */
> +        delayed_signal_offset = 1;
> +        best_delay_frac = 0;
> +        gain_den = corr_int_den >> sh_gain_den;
> +        gain_num = corr_int_num >> sh_gain_num;
> +        gain_num_square = gain_num * gain_num;
> +        for(k=0; k<ANALYZED_FRAC_DELAYS; k++)
> +        {
> +            for(i=0; i<2; i++)
> +            {
> +                int16_t gain_num_short, gain_den_short;
> +                int gain_num_short_square;
> +                /*
> +                  Compute numerator of pseudo-normalized correlation R'(k)
> +                  (4.2.1, Equation 81)
> +                */
> +                sum = 0;
> +                for(n=0; n<subframe_size; n++)
> +                    sum += delayed_signal[k][n+i] * sig_scaled[n + RES_PREV_DATA_SIZE];
> +                gain_num_short = FFMAX(sum >> sh_gain_num, 0);
> +
> +                /*
> +
> +                              gain_num_short_square                gain_num_square
> +                   R'(T)^2 = -----------------------, max R'(T)^2= --------------
> +                                   den                                 gain_den
> +
> +                */
> +                gain_num_short_square = gain_num_short * gain_num_short;
> +                gain_den_short = corr_den[k][i] >> sh_gain_den;
> +
> +                tmp = MULL(gain_num_short_square, gain_den);
> +                tmp2 = MULL(gain_num_square, gain_den_short);
> +
> +                // R'(T)^2 > max R'(T)^2
> +                if(tmp > tmp2)
> +                {
> +                    gain_num = gain_num_short;
> +                    gain_den = gain_den_short;
> +                    gain_num_square = gain_num_short_square;
> +                    delayed_signal_offset = i;
> +                    best_delay_frac = k+1;
> +                }
> +            }
> +        }
> +
> +        if(!gain_num || gain_den <= 1)
> +        {
> +            gain_num = 0;
> +            break;
> +        }
> +

> +        L_temp0 = gain_num_square;
> +        L_temp1 = gain_den * ener;
> +        // temp = av_log2(L_temp0) - av_log2(L_temp1) + 1;
> +        tmp = (sh_gain_num << 1) - (sh_gain_den + sh_ener) + 1;
> +        if(tmp<0)
> +            L_temp0 >>= -tmp;
> +        else
> +            L_temp1 >>= tmp;
> +
> +        /*
> +               R'(T)^2
> +          2 * --------- < 1
> +                R(0)
> +        */
> +
> +        if(L_temp0 < L_temp1)
> +        {
> +            gain_num = 0;
> +            break;
> +        }

if the following 2 don't overflow they should be usable instead of L_temp*

(uint64_t)gain_num_square << ((sh_gain_num << 1)+1)
((uint64_t)gain_den * ener) << (sh_gain_den + sh_ener)

or maybe the uint64 isn't needed at all, either way it's simpler

also I suspect that large parts of the gain related code above and below
can similarly be simplified.
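
A minimal sketch of that comparison, under the assumption stated above that neither shift overflows 64 bits. The function name and the parameter list are made up for illustration and do not appear in the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Returns 1 when
 *   2 * num_sq * 2^(2*sh_num)  <  den * ener * 2^(sh_den + sh_ener)
 * i.e. the scaling is undone in a wide type instead of aligning
 * the two operands with lossy right shifts.
 */
static int twice_ratio_below_one(uint32_t num_sq, int sh_num,
                                 uint32_t den, uint32_t ener,
                                 int sh_den, int sh_ener)
{
    uint64_t lhs = (uint64_t)num_sq << ((sh_num << 1) + 1);
    uint64_t rhs = ((uint64_t)den * ener) << (sh_den + sh_ener);

    return lhs < rhs;
}
```

Whether the operands really stay within 64 bits depends on the ranges of the shift counts in the patch, which would have to be checked first.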

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Dictatorship naturally arises out of democracy, and the most aggravated
form of tyranny and slavery out of the most extreme liberty. -- Plato