[FFmpeg-devel] [PATCH] G.729A (floating-point) decoder and ACT demuxer

Sat Feb 23 19:39:43 CET 2008

> My implementation of G729A does not provide binary identical (with
> reference decoder) files on
> ITU's test vectors but i can't hear any differences between my
> and reference floating point decoder.

That's what you get for using floating-point. Exact output certainly is
not part of the bargain.

> +/**
> + * L1 codebook (10-dimensional, with 128 entries (3.2.4)
> + */
> +static const float cb_L1[128][10] = {

These constants btw. look very much like they were designed with a
16-bit fixed-point implementation in mind...

> +/**
> + * \brief rouding function from reference code

rouNding

> + * FIXME: found system replacement for it

You mean "find" instead of "found"?

> +static inline int g729_round(float f)
> +{
> +    int i=ceil(f*2);
> +    return i>>1;
> +}

Uh, is that just ordinary "round to nearest", which probably is the
default for most FPUs anyway?

> +/**
> + * \brief pseudo random number generator
> + */
> +static inline uint16_t g729_random(G729A_Context* ctx)
> +{
> +    return ctx->rand_seed = (uint16_t)(31821 * (uint32_t)ctx->rand_seed + 13849 + ctx->rand_seed);

AFAICT rand_seed can not be negative, so why not just make rand_seed
unsigned instead of int and:
return ctx->rand_seed = 31822*ctx->rand_seed + 13849;

> +/**
> + * \brief Check parity bit (3.7.2)
> + * \param P1 Pitch delay first subframe
> + * \param P0 Parity bit for Pitch delay
> + *
> + * \return 1 if parity check is ok, 0 - otherwise
> + */
> +int g729_parity_check(int P1, int P0)
> +{
> +    int P=P1>>2;
> +    int S=P0&1;
> +    int i;
> +
> +    for(i=0; i<6; i++)
> +    {
> +        S ^= P&1;
> +        P >>= 1;
> +    }
> +    S ^= 1;
> +    return (!S);
> +}

I am fairly certain that we already have a function for that, since I
have seen a much less naive implementation...
That may have been in MPlayer though, so the idea is:
P1 >>= 1;
P1 = (P1 & ~1) | (P0 & 1);
P1 ^= P1 >> 4;
P1 ^= P1 >> 2;
P1 ^= P1 >> 1;
return P1 & 1;

> +/**
> + * \brief Decoding of the adaptive-codebook vector delay for first subframe (4.1.3)
> + * \param ctx private data structure
> + * \param P1 Pitch delay first subframe
> + * \param intT [out] integer part of delay
> + * \param frac [out] fractional part of delay [-1, 0, 1]
> + */
> +static void g729_decode_ac_delay_subframe1(G729A_Context* ctx, int P1, int* intT, int* frac)
> +{
> +    /* if no parity error */
> +    if(!ctx->bad_pitch)
> +    {
> +        if(P1<197)
> +        {
> +            *intT=1.0*(P1+2)/3+19;
> +            *frac=P1-3*(*intT)+58;
> +        }
> +        else
> +        {
> +            *intT=P1-112;
> +            *frac=0;
> +        }
> +    }
> +    else{
> +        *intT=ctx->intT2_prev;
> +        *frac=0;
> +    }
> +    ctx->intT1=*intT;

Why returning the value both via ctx and intT?

> +        //overflow can occure in 4.4k case

"occur", without "e"

> +/**
> + * \brief Decoding of the adaptive and fixed codebook gains from previous subframe (4.4.2)
> + * \param ctx private data structure
> + * \param gp pointer to variable receiving quantized fixed-codebook gain (gain pitch)
> + * \param gc pointer to variable receiving quantized adaptive-codebook gain (gain code)
> + */
> +static void g729_get_gain_from_previous(G729A_Context *ctx, float* gp, float* gc)
> +{
> +    /* 4.4.2, Equation 93 */
> +    *gc=0.98*ctx->gain_code;
> +    ctx->gain_code=*gc;
> +
> +    /* 4.4.2, Equation 94 */
> +    *gp=FFMIN(0.9*ctx->gain_pitch, 0.9);
> +    ctx->gain_pitch = *gp;
> +}

Again, why returning the value both ways? I performance is an issue I
have the feeling there are better optimization-opportunities (not to
mention that the compiler should inline this anyway).

> +static void g729_update_gain(G729A_Context *ctx)
> +{
> +    float avg_gain=0;
> +    int i;
> +
> +    /* 4.4.3. Equation 95 */
> +    for(i=0; i<4; i++)
> +        avg_gain+=ctx->pred_vect_q[i];
> +
> +    avg_gain = FFMAX(avg_gain * 0.25 - 4.0, -14);
> +
> +    for(i=3; i>0; i--)
> +        ctx->pred_vect_q[i]=ctx->pred_vect_q[i-1];

Maybe use memmove? Not so sure if it's a good idea.

> +static void g729_lp_synthesis_filter(G729A_Context *ctx, float* lp, float *in, float *out, float *filter_data)
> +{
> +    float* tmp_buf=av_mallocz((10+ctx->subframe_size)*sizeof(float));

Not exactly an inner loop, but malloc here does not look too nice to me,
maybe a fixed-sized buffer in the context would do (or does it even fit
on the stack)?

> +static float g729_get_signal_gain(G729A_Context *ctx, float *speech)
> +{
> +    int n;
> +    float gain;
> +
> +    gain=0;
> +    for(n=0; n<ctx->subframe_size; n++)
> +       gain+=speech[n]*speech[n];

> float gain = speech[0] * speech[0];
> for (n = 1;...

unless ctx->subframe_size can be 0.

> +static void g729a_weighted_filter(G729A_Context *ctx, float* Az, float gamma, float *Azg)
> +{
> +    float gamma_tmp;
> +    int n;
> +
> +    gamma_tmp=gamma;
> +    for(n=0; n<10; n++)
> +    {
> +        Azg[n]=Az[n]*gamma_tmp;
> +        gamma_tmp*=gamma;
> +    }

gamma_pow or so would be more descriptive than gamma_tmp.

> +    float corellation, corr_max;

correlation, 2r, 1l.

> +#define CHUNK_SIZE 512
> +#define RIFF_TAG MKTAG('R','I','F','F')
> +#define WAVE_TAG MKTAG('W','A','V','E')

If this is a RIFF/WAVE format, maybe the demuxer for that should be
extended instead?

> +typedef  struct{
> +    uint8_t tag;        //??? 0x84
> +    uint16_t msec;      ///< duration msec
> +    uint8_t  sec;       ///< duration sec
> +    uint32_t minutes;   ///< Duration min
> +} ACTHeader;

> +    ACTHeader* hdr=(ACTHeader*)&p->buf[256];

No!!! You can not read binary data into a struct with a simple case,
esp. due to endianness and alignment issues.

> +    if(hdr->tag!=0x84)

Not to mention that it's overkill for a single byte to compare.

> +    if (get_buffer(pb, (char*)&ctx->hdr, sizeof(ctx->hdr))!=sizeof(ctx->hdr))

Same problem.

> +    if(st->codec->sample_rate!=8000 && st->codec->sample_rate!=4400)
> +    {
> +        av_log(s, AV_LOG_ERROR, "Sample rate %d is not supported\n", st->codec->sample_rate);
> +        return -1;
> +    }

Why do you think this check belongs in the demuxer and not only in the
decoder?

> +static int act_read_packet(AVFormatContext *s,
> +                          AVPacket *pkt)
> +{
> +    ACTContext* ctx = s->priv_data;
> +    ByteIOContext *pb = s->pb;
> +    uint8_t bFrame[22];
> +    uint8_t *pkt_buf;
> +    int i, bytes_read;
> +    int frame_size=s->streams[0]->codec->frame_size;
> +    
> +    bytes_read = get_buffer(pb, bFrame, frame_size);
> +
> +    if(bytes_read != frame_size || av_new_packet(pkt, frame_size))
> +        return AVERROR(EIO);
> +
> +    pkt_buf=(uint16_t*)pkt->data;
> +
> +    pkt_buf[1]=bFrame[0];
> +    pkt_buf[3]=bFrame[1];
> +    pkt_buf[5]=bFrame[2];
> +    pkt_buf[7]=bFrame[3];
> +    pkt_buf[9]=bFrame[4];
> +    pkt_buf[0]=bFrame[5];
> +    pkt_buf[2]=bFrame[6];
> +    pkt_buf[4]=bFrame[7];
> +    pkt_buf[6]=bFrame[8];
> +    pkt_buf[8]=bFrame[9];

Why do you think the demuxer should do this conversion and not the
decoder?
It also has endianness issues, files created with -ac copy will only
work on a machine with the same endianness or something evil like that...

Greetings,
Reimar D?ffinger