[FFmpeg-devel] [PATCH] Common ACELP code & G.729 [6/7] - G.729 postfilter

Vladimir Voroshilov voroshil
Tue May 13 19:25:24 CEST 2008


2008/5/12 Michael Niedermayer <michaelni at gmx.at>:
>
> On Sun, May 11, 2008 at 10:01:41PM +0700, Vladimir Voroshilov wrote:
>  > 2008/5/8 Michael Niedermayer <michaelni at gmx.at>:
>  > > On Fri, May 02, 2008 at 06:49:43PM +0700, Vladimir Voroshilov wrote:
>  > >> Patch contains G.729 postfilter.
>  > >> It was separated due to large size to help reviewing.
>  > >> G.729 can produce audible speech even without this postfilter.
>  > >>
>  > >> --
>  > >> Regards,
>  > >> Vladimir Voroshilov mailto:voroshil at gmail.com
>  > >> JID: voroshil at gmail.com, voroshil at jabber.ru
>  > >> ICQ: 95587719
>  > >
>  > >> diff --git a/libavcodec/g729postfilter.c b/libavcodec/g729postfilter.c
>  > >> new file mode 100644
>  > >> index 0000000..b09d463
>  > >> --- /dev/null
>  > >> +++ b/libavcodec/g729postfilter.c
>  > >> @@ -0,0 +1,704 @@
>  > >> +#include <inttypes.h>
>  > >> +#include <limits.h>
>  > >> +
>  > >> +#include "avcodec.h"
>  > >> +#include "g729.h"
>  > >> +#include "acelp_pitch_lag.h"
>  > >> +#include "g729postfilter.h"
>  > >> +#include "acelp_math.h"
>  > >> +#include "acelp_filters.h"
>  > >> +
>  > >> +#define FRAC_BITS 15
>  > >> +#include "mathops.h"
>  > >> +
>  > >
>
>  > >> +/**
>  > >> + * formant_pp_factor_num_pow[i] = FORMANT_PP_FACTOR_NUM^i
>  > >> + */
>  > >> +static const int16_t formant_pp_factor_num_pow[11]=
>  > >> +{
>  > >> +  /* (0.15) */
>  > >> +  32768, 18022, 9912, 5451, 2998, 1649, 907, 499, 274, 151, 83
>  > >     ^^^^^
>  > > doesnt fit in int16_t, it does fir in uint16_t though
>  >
>  > Reduced by 1
>
>  hmm ad which is correct? or is it unused?

:)
It is used in filter,
but result is ignored (due to synthesis and residual filters implementation
it is assumed to always has value 'one').
Perhaps, weighted filter should be fixed, too to leave first item of input data
unchanged (because it should be multiplied by gain^0).

>  > >> +    /* Now lp_gn (starting with 10) contains impulse response of A(z/FORMANT_PP_FACTOR_NUM)/A(z/FORMANT_PP_FACTOR_DEN) filter */
>  > >> +
>  > >
>  > >> +    /* 4.2.3, Equation 87, calcuate rh(0)  */
>  > >> +    rh0 = sum_of_squares(lp_gn + 10, 20, 0, 0) << 1;   // Q24 -> Q9
>  > >
>  > >> +    temp = 30 - av_log2(rh0);
>  > >> +    if(temp > 16)
>  > >> +        rh0 <<= temp - 16;
>  > >> +    else
>  > >> +        rh0 >>= 16 - temp;
>  > >> +
>  > > [...]
>  > >> +    if(temp > 16)
>  > >> +        rh1 <<= temp - 16;
>  > >> +    else
>  > >> +        rh1 >>= 16 - temp;
>  > >
>  > > looks useless, this just reduces precission
>  >
>  > This shift guaranties that later "(rh1<<15)/rh0" will
>  > not cause int overflow.
>
>  only the >> case can have that effect, the << is still useless

Now only >> is doing.

>  > > [...]
>  > >> +            if(exp_after - exp_before - 1 > 0)
>  > >> +                gain <<= exp_after - exp_before - 1;
>  > >> +            else
>  > >> +                gain >>= exp_before - exp_after + 1;
>  > >
>  > > This stuff is duplicated all over the place and should be in a fuction
>  >
>  > Fixed.
>  >
>  >
>  > > Anyway the whole postfilter looks like it needs some serious cleanup.
>  >
>  > Sure.
>  > Postfilter is most complicated part of entire G.729 decoder.
>  > And good cleanup is not made for it yet.
>
>  well, look at ra144.c if you want to see worse ...

Look into RCA VOC patch :P

>  Iam actually wondering if it has some common code to ACELPs. Somehow
>  it did look like it has but its quite obfuscated so i could be totally
>  wrong ...

I don't see any similar to ACELP there yet.

> > +/**
>  > + * formant_pp_factor_num_pow[i] = FORMANT_PP_FACTOR_NUM^i
>  > + */
>  > +static const int16_t formant_pp_factor_num_pow[11]=
>  > +{
>  > +  /* (0.15) */
>  > +  32767, 18022, 9912, 5451, 2998, 1649, 907, 499, 274, 151, 83
>  > +};
>  > +
>  > +/**
>  > + * formant_pp_factor_den_pow[i] = FORMANT_PP_FACTOR_DEN^i
>  > + */
>  > +static const int16_t formant_pp_factor_den_pow[11]=
>  > +{
>  > +  /* (0.15) */
>  > +  32768, 22938, 16057, 11240, 7868, 5508, 3856, 2699, 1889, 1322, 925
>  > +};
>
>  hmm 32768 ...

same as above.

>  > +/**
>  > + * \brief Residual signal calculation (4.2.1 if G.729)
>  > + * \param out [out] (Q0) output data filtered through A(z/FORMANT_PP_FACTOR_NUM)
>  > + * \param filter_coeffs (Q12) A(z/FORMANT_PP_FACTOR_NUM) filter coefficients
>  > + * \param in (Q0) input speech data to process
>  > + * \param subframe_size size of one subframe
>  > + *
>  > + * \note in buffer must contain 10 items of previous speech data before top of the buffer
>  > + * \remark It is safe to pass the same buffer for input and output.
>  > + */
>  > +static void g729_residual(
>  > +        int16_t* out,
>  > +        const int16_t* filter_coeffs,
>  > +        const int16_t* in,
>  > +        int subframe_size)
>  > +{
>  > +    int i, n;
>
> > +
>  > +    /*
>  > +      4.2.1, Equation 79 Residual signal calculation
>  > +      ( filtering through A(z/FORMANT_PP_FACTOR_NUM) , one half of short-term filter)
>  > +    */
>  > +    for(n=subframe_size-1; n>=0; n--)
>  > +    {
>  > +        int sum = 0x800;
>
> > +        for(i=0; i<10; i++)
>  > +            sum += filter_coeffs[i] * in[n-i-1];
>  > +
>  > +        out[n] = in[n] + (sum >> 12);
>  > +    }
>  > +}
>
>  can this be merged with ff_acelp_lp_synthesis_filter() ?

Just compare them.
Synthesis, implements 1/A(z) filter:
out[n] = in[n] - sum(i,1,order) { out[n-i] }

Residual, implements A(z) filter:
out[n] = in[n] + sum(i,1,order) { in[n-i] }

I thinks they should not be merged.

>  > +        corr_int_num = INT_MIN;
>  > +        best_delay_int = 0;
>  > +        for(i=pitch_delay_int-1; i<=pitch_delay_int+1; i++)
>  > +        {
>  > +            sum = FFMAX(sum_of_squares(sig_scaled - i + RES_PREV_DATA_SIZE, subframe_size, i, 0), 0);
>  > +            if(sum > corr_int_num)
>  > +            {
>  > +                corr_int_num = sum;
>  > +                best_delay_int = i;
>  > +            }
>  > +        }
>  > +        if(!corr_int_num)
>  > +            break;
>
>  i think the FFMAX is unneeded
>  with corr_int_num= 0 and best_delay_int= pitch_delay_int-1 as initial values

Fixed.

[...]

>  > +        L_temp0 = gain_num_square;
>  > +        L_temp1 = gain_den * ener;
>  > +        // temp = av_log2(L_temp0) - av_log2(L_temp1) + 1;
>  > +        tmp = (sh_gain_num << 1) - (sh_gain_den + sh_ener) + 1;
>  > +        if(tmp<0)
>  > +            L_temp0 >>= -tmp;
>  > +        else
>  > +            L_temp1 >>= tmp;
>  > +
>  > +        /*
>  > +               R'(T)^2
>  > +          2 * --------- < 1
>  > +                R(0)
>  > +        */
>  > +
>  > +        if(L_temp0 < L_temp1)
>  > +        {
>  > +            gain_num = 0;
>  > +            break;
>  > +        }
>
>  if the following 2 dont overflow they should be useable instead of L_temp*
>
>  (uint64_t)gain_num_square << ((sh_gain_num << 1)+1)
>  ((uint64_t)gain_den * ener) << (sh_gain_den + sh_ener)
>
>  or maybe the uint64 isnt needed at all, either way its simpler

even int64_t is sufficient, but uint32_t is not. Implemented with int64_t

>  also i suspect that large parts of the gain related code above and below
>  can similarly be simplified.

not so simple as above (after first try, though), but i'll try to do something.


-- 
Regards,
Vladimir Voroshilov mailto:voroshil at gmail.com
JID: voroshil at gmail.com, voroshil at jabber.ru
ICQ: 95587719
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 06_g729post_50.diff
Type: text/x-diff
Size: 27126 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20080514/c3f1fdd8/attachment.diff>



More information about the ffmpeg-devel mailing list