[FFmpeg-devel] [PATCH] G.723.1 Decoder

Sat Oct 9 20:25:14 CEST 2010

Hi,

On Sat, Oct 9, 2010 at 1:56 PM, Mohamed Naufal <naufal11 at gmail.com> wrote:
> On 1 October 2010 01:24, Mohamed Naufal <naufal11 at gmail.com> wrote:
[..]
> +int ff_dot_product(const int16_t *a, const int16_t *b, int length, int shift)
> +{
> +    int i, sum = 0;
> +
> +    for (i = 0; i < length; i++) {
> +        int64_t prod = av_clipl_int32(MUL64(a[i], b[i]) << shift);
> +        sum = av_clipl_int32(sum + prod);
> +    }
> +    return sum;
> +}

Should this be clipped after every iteration? That is really very
implementation-specific. Assuming this is correct, at the very least
we want to document this in the doxy of the function ("the
intermediate sum value is clipped to int32_t after addition of the
product of a value of both arrays"), because this isn't quite what
you'd expect from a product function (I'd expect the intermediate to
be int64_t and it to clip at the end), and you probably want to
document this in the function name itself as well (e.g. rename it to
ff_dot_product32).

Other than that, patch #1/#2 are OK and can be applied once this is
fixed. Doc patch is of course also OK once the rest is in.

Data tables:
> +typedef struct {
> +    int ad_cb_lag;     ///< adaptive codebook lag
> +    int ad_cb_gain;
> +    int dirac_train;
> +    int pulse_sign;
> +    int grid_index;
> +    int amp_index;
> +    int pulse_pos;
> +} G723_1_Subframe;

Please document each value here like you do for the first one, a few
words is enough. I don't mind undocumented tables but structs have to
be documented.

Decoder:

> +typedef struct g723_1_context {
> +    G723_1_Subframe subframe[4];
> +    FrameType cur_frame_type;
> +    FrameType past_frame_type;
> +    Rate cur_rate;
> +    uint8_t lsp_index[LSP_BANDS];
> +    int pitch_lag[2];
> +    int erased_frames;
> +
> +    int16_t prev_lsp[LPC_ORDER];
> +    int16_t prev_excitation[PITCH_MAX];
> +    int16_t excitation[PITCH_MAX + FRAME_LEN];
> +    int16_t synth_mem[LPC_ORDER];
> +    int16_t fir_mem[LPC_ORDER];
> +    int     iir_mem[LPC_ORDER];
> +
> +    int random_seed;
> +    int interp_index;
> +    int interp_gain;
> +    int sid_gain;
> +    int cur_gain;
> +    int reflection_coef;
> +    int pf_gain;                 ///< formant postfilter
> +                                 ///< gain scaling unit memory
> +} G723_1_Context;

Same here, please document the ones that have no doxy, a few words is
enough for most of them.

> +static av_cold int g723_1_decode_init(AVCodecContext *avctx)
> +{
> +    G723_1_Context *p  = avctx->priv_data;
> +
> +    avctx->sample_fmt  = SAMPLE_FMT_S16;
> +    p->pf_gain         = 1 << 12;
> +    memcpy(p->prev_lsp, dc_lsp, LPC_ORDER * sizeof(int16_t));

Please add a .flush function (see wmavoice.c for an example) to reset
these values after a seek, otherwise you'll get some weird transition
after each seek.

> +        memset(out, 0, *data_size);
> +        av_log(avctx, AV_LOG_WARNING,
> +               "G.723.1: Comfort noise generation not supported yet\n");
> +        return frame_size[dec_mode];

av_log_missing_feature(). Will you implement this in the future, is
there a reason this wasn't implemented yet? (I'm not asking you to
implement it, just wondering...)

> +    formant_postfilter(p, lpc, out);
> +
> +    memmove(out, out + LPC_ORDER, *data_size);

By outputting into an intermediate buffer before the PF, you can
probably integrate the memmove() inside the postfilter (e.g. as part
of the "compensation filter" step), which then means you need a little
more memory, but don't need the memmove() anymore.

I had already reviewed most of the decoder itself before, so it looks
quite good to me. Good work!

Ronald