[FFmpeg-devel] [PATCH] E-AC-3 spectral extension

Michael Niedermayer michaelni
Thu Jun 4 03:47:38 CEST 2009


On Tue, Jun 02, 2009 at 09:19:23PM -0400, Justin Ruggles wrote:
> Michael Niedermayer wrote:
> > On Sun, May 17, 2009 at 02:23:34PM -0400, Justin Ruggles wrote:
> >> Hi,
> >>
> >> I was recently made aware that some French TV station(s) will soon (if
> >> not already) start using E-AC-3 streams in their broadcasts which
> >> utilize spectral extension.  I was also given some samples (thanks j-b
> >> and Anthony), which I uploaded to mphq:
> >> http://samples.mplayerhq.hu/A-codecs/AC3/eac3/csi_miami_*
> >>
> >> So I decided to revisit my SPX patch.  The previous version was done
> >> with all integer arithmetic, but it turns out that it's really not
> >> accurate enough for spectral extension processing.  The resulting decoded
> >> output had a max bandwidth of about 2kHz less when using 24-bit fixed
> >> point vs. floating point, and was only slightly higher than without any
> >> SPX processing at all.  Making just the square roots floating point
> >> raised the bandwidth about 1kHz, and making the rest (noise/signal
> >> scaling, spx coords, and notch filters) floating point added about
> >> another 1kHz.
> >> [...]
> >> +            decode_band_structure(gbc, blk, s->eac3, 0,
> >> +                                  s->spx_start_subband, spx_end_subband,
> >> +                                  ff_eac3_default_spx_band_struct,
> >> +                                  s->spx_band_struct, &s->num_spx_bands,
> >> +                                  s->spx_band_sizes);
> >> +        } else {
> >> +            for (ch = 1; ch <= fbw_channels; ch++) {
> >> +                s->channel_in_spx[ch] = 0;
> >> +                s->first_spx_coords[ch] = 1;
> >> +            }
> >>          }
> >> -        /* TODO: parse spectral extension strategy info */
> >>      }
> >>  
> >> -    /* TODO: spectral extension coordinates */
> >> +    /* spectral extension coordinates */
> >> +    if (s->spx_in_use) {
> >> +        for (ch = 1; ch <= fbw_channels; ch++) {
> >> +            if (s->channel_in_spx[ch]) {
> >> +                if (s->first_spx_coords[ch] || get_bits1(gbc)) {
> >> +                    int bin;
> >> +                    float spx_blend;
> >> +                    int master_spx_coord;
> >> +                    s->first_spx_coords[ch] = 0;
> >> +                    spx_blend = get_bits(gbc, 5) / 32.0f;
> >> +                    master_spx_coord = get_bits(gbc, 2) * 3;
> >> +                    bin = s->spx_start_freq;
> >> +                    for (bnd = 0; bnd < s->num_spx_bands; bnd++) {
> >> +                        int bandsize;
> >> +                        int spx_coord_exp, spx_coord_mant;
> >> +                        float nratio, sblend, nblend, spx_coord;
> >> +
> >> +                        /* calculate blending factors */
> >> +                        bandsize = s->spx_band_sizes[bnd];
> >> +                        nratio = ((float)((bin + (bandsize >> 1))) / s->spx_end_freq) - spx_blend;
> >> +                        nratio = av_clipf(nratio, 0.0f, 1.0f);
> > 
> >> +                        nblend = sqrt(       nratio);
> >> +                        sblend = sqrt(1.0f - nratio);
> >> +                        nblend *= 1.73205077648f; // scale noise to give unity variance
> > 
> > nblend = sqrt( 3*nratio);
> 
> fixed.  also used *0.03125f instead of /32.0f for spx_blend.
> 
> > 
> >> +                        bin += bandsize;
> >> +
> >> +                        /* decode spx coordinates */
> >> +                        spx_coord_exp  = get_bits(gbc, 4);
> >> +                        spx_coord_mant = get_bits(gbc, 2);
> > 
> >> +                        if (spx_coord_exp == 15)
> >> +                            spx_coord = spx_coord_mant / 4.0f;
> >> +                        else
> >> +                            spx_coord = (spx_coord_mant + 4) / 8.0f;
> > 
> > multiply is faster than divide
> 
> fixed and also merged the *32.0 with the /4.0 and /8.0.
> 
> > 
> >> +                        spx_coord /= (float)(1 << (spx_coord_exp + master_spx_coord));
> > 
> > the float cast looks useless
> 
> fixed here and a couple other places.
> 
> > 
> > [...]
> >> +        /* Copy coeffs from normal bands to extension bands */
> >> +        bin = s->spx_start_freq;
> >> +        for (i = 0; i < num_copy_sections; i++) {
> >> +            memcpy(&s->transform_coeffs[ch][bin],
> >> +                   &s->transform_coeffs[ch][s->spx_copy_start_freq],
> >> +                   copy_sizes[i]*sizeof(float));
> >> +            bin += copy_sizes[i];
> >> +        }
> > 
> > can't that memcpy be merged with some of the other processing?
> 
> I thought I might be able to, but no.  Because of the rules of how the
> copying is done, it makes it more efficient to do it this way.  The copy
> band is a multiple of 12 like the spx bands, but not necessarily the
> same size, and the way that it wraps around would make it awkward to mix
> with the other purely per-band calculations.  This way the wrapping
> boundaries are calculated once for all channels and are separate from
> the spx band structure.
> 
> I did try merging the copying and energy calculation without memcpy but
> it was slower and much uglier.
> 
> > 
> >> +
> >> +        /* Calculate RMS energy for each SPX band. */
> >> +        bin = s->spx_start_freq;
> >> +        for (bnd = 0; bnd < s->num_spx_bands; bnd++) {
> >> +            int bandsize = s->spx_band_sizes[bnd];
> >> +            float accum = 0.0f;
> >> +            for (i = 0; i < bandsize; i++) {
> >> +                float coeff = s->transform_coeffs[ch][bin++];
> >> +                accum += coeff * coeff;
> >> +            }
> >> +            rms_energy[bnd] = sqrt(accum / (float)bandsize);
> >> +        }
> >> +
> >> +        /* Apply a notch filter at transitions between normal and extension
> >> +           bands and at all wrap points. */
> >> +        if (s->spx_atten_code[ch] >= 0) {
> >> +            const float *atten_tab = ff_eac3_spx_atten_tab[s->spx_atten_code[ch]];
> >> +            bin = s->spx_start_freq - 2;
> >> +            for (bnd = 0; bnd < s->num_spx_bands; bnd++) {
> >> +                if (wrapflag[bnd]) {
> >> +                    float *coeffs = &s->transform_coeffs[ch][bin];
> >> +                    coeffs[0] *= atten_tab[0];
> >> +                    coeffs[1] *= atten_tab[1];
> >> +                    coeffs[2] *= atten_tab[2];
> >> +                    coeffs[3] *= atten_tab[1];
> >> +                    coeffs[4] *= atten_tab[0];
> >> +                }
> >> +                bin += s->spx_band_sizes[bnd];
> >> +            }
> >> +        }
> >> +
> >> +        /* Apply noise-blended coefficient scaling based on previously
> >> +           calculated RMS energy, blending factors, and SPX coordinates for
> >> +           each band. */
> >> +        bin = s->spx_start_freq;
> >> +        for (bnd = 0; bnd < s->num_spx_bands; bnd++) {
> >> +            float nscale = s->spx_noise_blend[ch][bnd] * rms_energy[bnd];
> >> +            float sscale = s->spx_signal_blend[ch][bnd];
> >> +            for (i = 0; i < s->spx_band_sizes[bnd]; i++) {
> >> +                float noise  = nscale * (((int)av_lfg_get(&s->dith_state))/(float)(1<<31));
> > 
> > the 1<<31 factor can be merged into nscale
> 
> fixed. and the function is now 10% faster.
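
A rough sketch of that kind of change, using a stand-in random source because
av_lfg_get() needs libavutil. next_rand() and both fill functions are
hypothetical names for illustration only. The sketch also writes the constant
as 1u << 31, since 1 << 31 does not fit in a signed 32-bit int.

```c
#include <stdint.h>

/* Hypothetical stand-in for av_lfg_get(): a small LCG producing 32 bits. */
static uint32_t next_rand(uint32_t *state)
{
    *state = *state * 1664525u + 1013904223u;
    return *state;
}

/* Per-sample form: every coefficient pays for the 2^31 normalization. */
static void fill_noise_per_sample(float *dst, int n, float nscale, uint32_t seed)
{
    for (int i = 0; i < n; i++)
        dst[i] = nscale * ((int32_t)next_rand(&seed) / 2147483648.0f);
}

/* Merged form: the 2^-31 factor is folded into the band scale once,
 * leaving a single multiply per coefficient in the inner loop. */
static void fill_noise_merged(float *dst, int n, float nscale, uint32_t seed)
{
    const float scale = nscale * (1.0f / (1u << 31));
    for (int i = 0; i < n; i++)
        dst[i] = scale * (int32_t)next_rand(&seed);
}
```

Both forms should produce the same samples for the same seed, up to float
rounding.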
> 
> 
> New patch attached.
[...]
> -    /* TODO: spectral extension coordinates */
> +    /* spectral extension coordinates */
> +    if (s->spx_in_use) {
> +        for (ch = 1; ch <= fbw_channels; ch++) {
> +            if (s->channel_in_spx[ch]) {
> +                if (s->first_spx_coords[ch] || get_bits1(gbc)) {
> +                    int bin;
> +                    float spx_blend;
> +                    int master_spx_coord;
> +                    s->first_spx_coords[ch] = 0;
> +                    spx_blend = get_bits(gbc, 5) * 0.03125f;
> +                    master_spx_coord = get_bits(gbc, 2) * 3;
> +                    bin = s->spx_start_freq;

an empty line in there somewhere would improve readability IMHO


> +                    for (bnd = 0; bnd < s->num_spx_bands; bnd++) {
> +                        int bandsize;
> +                        int spx_coord_exp, spx_coord_mant;
> +                        float nratio, sblend, nblend, spx_coord;
> +
> +                        /* calculate blending factors */
> +                        bandsize = s->spx_band_sizes[bnd];
> +                        nratio = ((float)((bin + (bandsize >> 1))) / s->spx_end_freq) - spx_blend;
> +                        nratio = av_clipf(nratio, 0.0f, 1.0f);
> +                        nblend = sqrt(3.0f * nratio); // noise is scaled by sqrt(3) to give unity variance
> +                        sblend = sqrt(1.0f - nratio);
> +                        bin += bandsize;
> +

> +                        /* decode spx coordinates */
> +                        spx_coord_exp  = get_bits(gbc, 4);
> +                        spx_coord_mant = get_bits(gbc, 2);
> +                        if (spx_coord_exp == 15)
> +                            spx_coord = spx_coord_mant * 8.0f;
> +                        else
> +                            spx_coord = (spx_coord_mant + 4) * 4.0f;
> +                        spx_coord /= 1 << (spx_coord_exp + master_spx_coord);

something based on the following would avoid the /
spx_coord *= (1<<123) >> (spx_coord_exp + master_spx_coord)

also *4 can be factored out of the if/else and into the factor above
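
One possible reading of the two suggestions above, as a sketch rather than
the final patch: fold the exponent into an integer left shift so the only
floating-point operation is a multiply by a constant. The (1<<123) above is
shorthand and would not fit in a C integer; the sketch assumes a bias of 25,
which is enough because spx_coord_exp is 4 bits (0..15) and master_spx_coord
is 0, 3, 6 or 9. Both function names are made up for illustration.

```c
#include <assert.h>

/* Form from the patch under review: float divide by a power of two. */
static float spx_coord_div(int exp, int mant, int master)
{
    float c = (exp == 15) ? mant * 8.0f : (mant + 4) * 4.0f;
    c /= 1 << (exp + master);
    return c;
}

/* Divide-free form: the common *4 and the power-of-two divisor are merged
 * into one integer shift plus one constant multiply. */
static float spx_coord_shift(int exp, int mant, int master)
{
    assert(exp >= 0 && exp <= 15 && mant >= 0 && mant <= 3);
    assert(master >= 0 && master <= 9);

    int m = (exp == 15) ? mant << 1 : mant + 4;
    m <<= 25 - exp - master;          /* shift amount is between 1 and 25 */
    return m * (1.0f / (1 << 23));    /* reciprocal is a compile-time constant */
}
```

Since every value involved is a small integer times a power of two, both
forms should give bit-identical floats over the whole input range.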


[...]
> @@ -66,6 +62,96 @@ typedef enum {
>  
>  #define EAC3_SR_CODE_REDUCED  3
>  
> +void ff_eac3_apply_spectral_extension(AC3DecodeContext *s)
> +{
> +    int bin, bnd, ch, i;

> +    uint8_t wrapflag[SPX_MAX_BANDS]={0,}, num_copy_sections, copy_sizes[SPX_MAX_BANDS];
> +    float rms_energy[SPX_MAX_BANDS];
> +
> +    /* Set copy index mapping table. Set wrap flags to apply a notch filter at
> +       wrap points later on. */
> +    wrapflag[0] = 1;

double initialization
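
Presumably this refers to wrapflag[0] being zeroed by the initializer and
then overwritten right away. As a small sketch, an initializer list can set
the first element and zero the rest in one step. SPX_MAX_BANDS is given a
placeholder value here and init_wrapflag() is a hypothetical wrapper, neither
taken from the patch.

```c
#include <stdint.h>
#include <string.h>

#define SPX_MAX_BANDS 17  /* placeholder value for this sketch only */

static void init_wrapflag(uint8_t out[SPX_MAX_BANDS])
{
    /* First element set explicitly; C zero-initializes the remaining ones. */
    uint8_t wrapflag[SPX_MAX_BANDS] = { 1 };
    memcpy(out, wrapflag, sizeof(wrapflag));
}
```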


> +    bin = s->spx_copy_start_freq;
> +    num_copy_sections = 0;
> +    for (bnd = 0; bnd < s->num_spx_bands; bnd++) {
> +        int copysize;
> +        int bandsize = s->spx_band_sizes[bnd];

> +        if ((bin + bandsize) > s->spx_start_freq) {

redundant ()

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

No great genius has ever existed without some touch of madness. -- Aristotle