[FFmpeg-devel] [PATCH] HE-AACv1 second revision

Måns Rullgård mans
Sun Jan 31 00:52:06 CET 2010


Alex Converse <alex.converse at gmail.com> writes:

> Notes:
> *There are still several lroundf() calls that take all integer inputs.
> If anyone has advice on how to do them in an integer-only fashion, I
> would love to hear it.
> *There is a brand new filterbank. The analysis filterbank is based on
> the FFT. The synthesis filterbank is based on the MDCT. I also have a
> synthesis filterbank based on an FFT, but it is slower than the MDCT
> filterbank. I think if we had a generic twiddle and permute DSP
> function it could be faster. The new filterbanks still do not support
> ff_float_to_int16_interleave_c.

Most of my comments on the first version still stand.

> +/**
> + * Analysis QMF Bank (14496-3 sp04 p206)
> + *
> + * @param   x       pointer to the beginning of the first sample window
> + * @param   W       array of complex-valued samples split into subbands
> + */
> +static void sbr_qmf_analysis(DSPContext *dsp, FFTContext *fft, const float *in, float *x,
> +                             FFTComplex u[64], float W[2][32][32][2])
> +{
> +    int i, k, l;
> +    const uint16_t *revtab = fft->revtab;
> +    memcpy(W[0], W[1], sizeof(W[0]));
> +    memcpy(x    , x+1024, (320-32)*sizeof(x[0]));
> +    memcpy(x+288, in    ,     1024*sizeof(x[0]));
> +    x += 319;
> +    for (l = 0; l < 32; l++) { // numTimeSlots*RATE = 16*2 as 960 sample frames
> +                               // are not supported
> +        float z[320];
> +        for (i = 0; i < 320; i++)
> +            z[i] = x[-i] * sbr_qmf_window_ds[i];

vector_fmul_reverse()
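
A minimal sketch of how the z[] loop maps onto that call. It assumes the
semantics of the C fallback in dsputil, dst[i] = src0[i] * src1[len-1-i].
fmul_reverse_ref, window_direct and window_reverse are stand-in names for
this illustration, not part of the patch.

```c
/* Stand-in modeled on the C fallback of DSPContext.vector_fmul_reverse
 * (assumed semantics): dst[i] = src0[i] * src1[len - 1 - i]. */
static void fmul_reverse_ref(float *dst, const float *src0,
                             const float *src1, int len)
{
    int i;
    src1 += len - 1;
    for (i = 0; i < len; i++)
        dst[i] = src0[i] * src1[-i];
}

/* The loop from the patch: x points at the newest sample of the window. */
static void window_direct(float *z, const float *x, const float *win)
{
    int i;
    for (i = 0; i < 320; i++)
        z[i] = x[-i] * win[i];
}

/* Same result through the reversed multiply: the window is read forwards
 * and the 320 samples ending at x are read backwards. */
static void window_reverse(float *z, const float *x, const float *win)
{
    fmul_reverse_ref(z, win, x - 319, 320);
}
```

In the patch this would presumably become something like
dsp->vector_fmul_reverse(z, sbr_qmf_window_ds, x - 319, 320), provided the
alignment requirements of the SIMD versions are met.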

> +        for (i = 0; i < 64; i++) {
> +            float f = z[i] + z[i + 64] + z[i + 128] + z[i + 192] + z[i + 256];
> +            u[revtab[i]].re = f * analysis_cos_pre[i];
> +            u[revtab[i]].im = f * analysis_sin_pre[i];
> +        }

SIMD should help here despite the permutation.  It can probably be
combined with the z[] calculation too.
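
One way the two steps could be combined, as a scalar sketch only: the
five-tap fold is accumulated straight from the windowed products, so the
z[320] temporary goes away and the inner loops run over contiguous data.
The permuted store through revtab is left as a short scalar pass. Cpx,
pre_twiddle_ref and pre_twiddle_fused are hypothetical names, and the
tables are passed in as parameters rather than taken from the patch.

```c
#include <stdint.h>

typedef struct { float re, im; } Cpx;   /* stand-in for FFTComplex */

/* Reference: the two loops as written in the patch. */
static void pre_twiddle_ref(Cpx *u, const uint16_t *revtab, const float *x,
                            const float *win, const float *cos_pre,
                            const float *sin_pre)
{
    float z[320];
    int i;
    for (i = 0; i < 320; i++)
        z[i] = x[-i] * win[i];
    for (i = 0; i < 64; i++) {
        float f = z[i] + z[i + 64] + z[i + 128] + z[i + 192] + z[i + 256];
        u[revtab[i]].re = f * cos_pre[i];
        u[revtab[i]].im = f * sin_pre[i];
    }
}

/* Fused: window and fold in one pass (the part that should vectorize),
 * then the twiddle with the permuted store. */
static void pre_twiddle_fused(Cpx *u, const uint16_t *revtab, const float *x,
                              const float *win, const float *cos_pre,
                              const float *sin_pre)
{
    float f[64];
    int i, j;
    for (i = 0; i < 64; i++)
        f[i] = x[-i] * win[i];
    for (j = 64; j < 320; j += 64)          /* remaining four taps */
        for (i = 0; i < 64; i++)
            f[i] += x[-i - j] * win[i + j];
    for (i = 0; i < 64; i++) {
        u[revtab[i]].re = f[i] * cos_pre[i];
        u[revtab[i]].im = f[i] * sin_pre[i];
    }
}
```

The additions happen in the same order as in the original, so the results
should agree up to whatever the compiler does with fused multiply-add.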

> +        ff_fft_calc(fft, u);
> +        for (k = 0; k < 32; k++) {
> +            W[1][k][l][0] = u[k].re * analysis_cos_post[k] - u[k].im * analysis_sin_post[k];
> +            W[1][k][l][1] = u[k].re * analysis_sin_post[k] + u[k].im * analysis_cos_post[k];
> +        }

SIMD

> +        x += 32;
> +    }
> +}
> +
> +/**
> + * Synthesis QMF Bank (14496-3 sp04 p206) and Downsampled Synthesis QMF Bank
> + * (14496-3 sp04 p206)
> + */
> +static void sbr_qmf_synthesis(DSPContext *dsp, FFTContext *mdct,
> +                              float *out, float X[2][32][64],
> +                              float mdct_buf[2][64],
> +                              float *v, const unsigned int div)
> +{
> +    int l, n;
> +    const float *sbr_qmf_window = div ? sbr_qmf_window_ds : sbr_qmf_window_us;
> +    for (l = 0; l < 32; l++) {
> +        memmove(&v[128 >> div], v, ((1280 - 128) >> div) * sizeof(float));
> +        for (n = 1; n < 64 >> div; n+=2) {
> +            X[1][l][n] = -X[1][l][n];
> +        }

SIMD
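
The sign flip on the odd-indexed coefficients reduces to an XOR with a
constant mask, which is how a SIMD version would typically do it. A
portable scalar sketch of that idea, assuming IEEE 754 single precision;
negate_odd_xor is a hypothetical name.

```c
#include <stdint.h>
#include <string.h>

/* Negate the odd-indexed floats in buf[0..len) by flipping the sign bit.
 * Assumes IEEE 754 binary32 floats. A SIMD version would apply the same
 * mask pattern {0, 0x80000000, 0, 0x80000000, ...} to a whole register. */
static void negate_odd_xor(float *buf, int len)
{
    static const uint32_t mask[2] = { 0, UINT32_C(0x80000000) };
    int n;
    for (n = 0; n < len; n++) {
        uint32_t bits;
        memcpy(&bits, &buf[n], sizeof(bits));   /* type-pun without UB */
        bits ^= mask[n & 1];                    /* flips sign only for odd n */
        memcpy(&buf[n], &bits, sizeof(bits));
    }
}
```

For the loop in the patch the call would be negate_odd_xor(X[1][l], 64 >> div).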

> +        if (div) {
> +            memset(X[0][l]+32, 0, 32*sizeof(float));
> +            memset(X[1][l]+32, 0, 32*sizeof(float));
> +        }
> +        ff_imdct_half(mdct, mdct_buf[0], X[0][l]);
> +        ff_imdct_half(mdct, mdct_buf[1], X[1][l]);
> +        for (n = 0; n < 64 >> div; n++) {
> +            v[               n] = -mdct_buf[0][64-1-(n<<div)    ] + mdct_buf[1][ n<<div     ];
> +            v[(128>>div)-1 - n] =  mdct_buf[0][64-1-(n<<div)-div] + mdct_buf[1][(n<<div)+div];
> +        }

SIMD
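
For the div == 0 case, the loop can be split so that each pass writes
forwards through v, with one forward and one reversed input. That shape
may be easier to vectorize than the mixed forward/backward stores. This is
only a sketch under that assumption; butterfly_ref and butterfly_split are
hypothetical names, and b0/b1 stand for mdct_buf[0] and mdct_buf[1].

```c
/* Loop from the patch, specialized by hand to div == 0. */
static void butterfly_ref(float v[128], const float b0[64],
                          const float b1[64])
{
    int n;
    for (n = 0; n < 64; n++) {
        v[      n] = -b0[63 - n] + b1[n];
        v[127 - n] =  b0[63 - n] + b1[n];
    }
}

/* Same values, but each pass writes forwards through v. */
static void butterfly_split(float v[128], const float b0[64],
                            const float b1[64])
{
    int n;
    for (n = 0; n < 64; n++)            /* lower half: b1 minus reversed b0 */
        v[n]      = b1[n]      - b0[63 - n];
    for (n = 0; n < 64; n++)            /* upper half: b0 plus reversed b1 */
        v[64 + n] = b1[63 - n] + b0[n];
}
```

The downsampled case (div == 1) reads every second element and would need
its own variant.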

> +        dsp->vector_fmul_add(out, v                , sbr_qmf_window               , zero64, 64 >> div);
> +        dsp->vector_fmul_add(out, v + ( 192 >> div), sbr_qmf_window + ( 64 >> div), out   , 64 >> div);
> +        dsp->vector_fmul_add(out, v + ( 256 >> div), sbr_qmf_window + (128 >> div), out   , 64 >> div);
> +        dsp->vector_fmul_add(out, v + ( 448 >> div), sbr_qmf_window + (192 >> div), out   , 64 >> div);
> +        dsp->vector_fmul_add(out, v + ( 512 >> div), sbr_qmf_window + (256 >> div), out   , 64 >> div);
> +        dsp->vector_fmul_add(out, v + ( 704 >> div), sbr_qmf_window + (320 >> div), out   , 64 >> div);
> +        dsp->vector_fmul_add(out, v + ( 768 >> div), sbr_qmf_window + (384 >> div), out   , 64 >> div);
> +        dsp->vector_fmul_add(out, v + ( 960 >> div), sbr_qmf_window + (448 >> div), out   , 64 >> div);
> +        dsp->vector_fmul_add(out, v + (1024 >> div), sbr_qmf_window + (512 >> div), out   , 64 >> div);
> +        dsp->vector_fmul_add(out, v + (1216 >> div), sbr_qmf_window + (576 >> div), out   , 64 >> div);
> +        out += 64 >> div;
> +    }
> +}

-- 
Måns Rullgård
mans at mansr.com


