[FFmpeg-devel] [PATCH] G.729A (floating-point) decoder and ACT demuxer
Reimar Döffinger
Reimar.Doeffinger
Sat Feb 23 19:39:43 CET 2008
> My implementation of G729A does not provide binary identical (with
> reference decoder) files on
> ITU's test vectors but i can't hear any differences between my
> and reference floating point decoder.
That's what you get for using floating-point. Exact output certainly is
not part of the bargain.
> +/**
> + * L1 codebook (10-dimensional, with 128 entries (3.2.4)
> + */
> +static const float cb_L1[128][10] = {
These constants btw. look very much like they were designed with a
16-bit fixed-point implementation in mind...
> +/**
> + * \brief rouding function from reference code
rouNding
> + * FIXME: found system replacement for it
You mean "find" instead of "found"?
> +static inline int g729_round(float f)
> +{
> + int i=ceil(f*2);
> + return i>>1;
> +}
Uh, is that just ordinary "round to nearest", which probably is the
default for most FPUs anyway?
> +/**
> + * \brief pseudo random number generator
> + */
> +static inline uint16_t g729_random(G729A_Context* ctx)
> +{
> + return ctx->rand_seed = (uint16_t)(31821 * (uint32_t)ctx->rand_seed + 13849 + ctx->rand_seed);
AFAICT rand_seed can not be negative, so why not just make rand_seed
unsigned instead of int and:
return ctx->rand_seed = 31822*ctx->rand_seed + 13849;
> +/**
> + * \brief Check parity bit (3.7.2)
> + * \param P1 Pitch delay first subframe
> + * \param P0 Parity bit for Pitch delay
> + *
> + * \return 1 if parity check is ok, 0 - otherwise
> + */
> +int g729_parity_check(int P1, int P0)
> +{
> + int P=P1>>2;
> + int S=P0&1;
> + int i;
> +
> + for(i=0; i<6; i++)
> + {
> + S ^= P&1;
> + P >>= 1;
> + }
> + S ^= 1;
> + return (!S);
> +}
I am fairly certain that we already have a function for that, since I
have seen a much less naive implementation...
That may have been in MPlayer though, so the idea is:
P1 >>= 1;
P1 = (P1 & ~1) | (P0 & 1);
P1 ^= P1 >> 4;
P1 ^= P1 >> 2;
P1 ^= P1 >> 1;
return P1 & 1;
> +/**
> + * \brief Decoding of the adaptive-codebook vector delay for first subframe (4.1.3)
> + * \param ctx private data structure
> + * \param P1 Pitch delay first subframe
> + * \param intT [out] integer part of delay
> + * \param frac [out] fractional part of delay [-1, 0, 1]
> + */
> +static void g729_decode_ac_delay_subframe1(G729A_Context* ctx, int P1, int* intT, int* frac)
> +{
> + /* if no parity error */
> + if(!ctx->bad_pitch)
> + {
> + if(P1<197)
> + {
> + *intT=1.0*(P1+2)/3+19;
> + *frac=P1-3*(*intT)+58;
> + }
> + else
> + {
> + *intT=P1-112;
> + *frac=0;
> + }
> + }
> + else{
> + *intT=ctx->intT2_prev;
> + *frac=0;
> + }
> + ctx->intT1=*intT;
Why returning the value both via ctx and intT?
> + //overflow can occure in 4.4k case
"occur", without "e"
> +/**
> + * \brief Decoding of the adaptive and fixed codebook gains from previous subframe (4.4.2)
> + * \param ctx private data structure
> + * \param gp pointer to variable receiving quantized fixed-codebook gain (gain pitch)
> + * \param gc pointer to variable receiving quantized adaptive-codebook gain (gain code)
> + */
> +static void g729_get_gain_from_previous(G729A_Context *ctx, float* gp, float* gc)
> +{
> + /* 4.4.2, Equation 93 */
> + *gc=0.98*ctx->gain_code;
> + ctx->gain_code=*gc;
> +
> + /* 4.4.2, Equation 94 */
> + *gp=FFMIN(0.9*ctx->gain_pitch, 0.9);
> + ctx->gain_pitch = *gp;
> +}
Again, why returning the value both ways? I performance is an issue I
have the feeling there are better optimization-opportunities (not to
mention that the compiler should inline this anyway).
> +static void g729_update_gain(G729A_Context *ctx)
> +{
> + float avg_gain=0;
> + int i;
> +
> + /* 4.4.3. Equation 95 */
> + for(i=0; i<4; i++)
> + avg_gain+=ctx->pred_vect_q[i];
> +
> + avg_gain = FFMAX(avg_gain * 0.25 - 4.0, -14);
> +
> + for(i=3; i>0; i--)
> + ctx->pred_vect_q[i]=ctx->pred_vect_q[i-1];
Maybe use memmove? Not so sure if it's a good idea.
> +static void g729_lp_synthesis_filter(G729A_Context *ctx, float* lp, float *in, float *out, float *filter_data)
> +{
> + float* tmp_buf=av_mallocz((10+ctx->subframe_size)*sizeof(float));
Not exactly an inner loop, but malloc here does not look too nice to me,
maybe a fixed-sized buffer in the context would do (or does it even fit
on the stack)?
> +static float g729_get_signal_gain(G729A_Context *ctx, float *speech)
> +{
> + int n;
> + float gain;
> +
> + gain=0;
> + for(n=0; n<ctx->subframe_size; n++)
> + gain+=speech[n]*speech[n];
> float gain = speech[0] * speech[0];
> for (n = 1;...
unless ctx->subframe_size can be 0.
> +static void g729a_weighted_filter(G729A_Context *ctx, float* Az, float gamma, float *Azg)
> +{
> + float gamma_tmp;
> + int n;
> +
> + gamma_tmp=gamma;
> + for(n=0; n<10; n++)
> + {
> + Azg[n]=Az[n]*gamma_tmp;
> + gamma_tmp*=gamma;
> + }
gamma_pow or so would be more descriptive than gamma_tmp.
> + float corellation, corr_max;
correlation, 2r, 1l.
> +#define CHUNK_SIZE 512
> +#define RIFF_TAG MKTAG('R','I','F','F')
> +#define WAVE_TAG MKTAG('W','A','V','E')
If this is a RIFF/WAVE format, maybe the demuxer for that should be
extended instead?
> +typedef struct{
> + uint8_t tag; //??? 0x84
> + uint16_t msec; ///< duration msec
> + uint8_t sec; ///< duration sec
> + uint32_t minutes; ///< Duration min
> +} ACTHeader;
> + ACTHeader* hdr=(ACTHeader*)&p->buf[256];
No!!! You can not read binary data into a struct with a simple case,
esp. due to endianness and alignment issues.
> + if(hdr->tag!=0x84)
Not to mention that it's overkill for a single byte to compare.
> + if (get_buffer(pb, (char*)&ctx->hdr, sizeof(ctx->hdr))!=sizeof(ctx->hdr))
Same problem.
> + if(st->codec->sample_rate!=8000 && st->codec->sample_rate!=4400)
> + {
> + av_log(s, AV_LOG_ERROR, "Sample rate %d is not supported\n", st->codec->sample_rate);
> + return -1;
> + }
Why do you think this check belongs in the demuxer and not only in the
decoder?
> +static int act_read_packet(AVFormatContext *s,
> + AVPacket *pkt)
> +{
> + ACTContext* ctx = s->priv_data;
> + ByteIOContext *pb = s->pb;
> + uint8_t bFrame[22];
> + uint8_t *pkt_buf;
> + int i, bytes_read;
> + int frame_size=s->streams[0]->codec->frame_size;
> +
> + bytes_read = get_buffer(pb, bFrame, frame_size);
> +
> + if(bytes_read != frame_size || av_new_packet(pkt, frame_size))
> + return AVERROR(EIO);
> +
> + pkt_buf=(uint16_t*)pkt->data;
> +
> + pkt_buf[1]=bFrame[0];
> + pkt_buf[3]=bFrame[1];
> + pkt_buf[5]=bFrame[2];
> + pkt_buf[7]=bFrame[3];
> + pkt_buf[9]=bFrame[4];
> + pkt_buf[0]=bFrame[5];
> + pkt_buf[2]=bFrame[6];
> + pkt_buf[4]=bFrame[7];
> + pkt_buf[6]=bFrame[8];
> + pkt_buf[8]=bFrame[9];
Why do you think the demuxer should do this conversion and not the
decoder?
It also has endianness issues, files created with -ac copy will only
work on a machine with the same endianness or something evil like that...
Greetings,
Reimar D?ffinger
More information about the ffmpeg-devel
mailing list