[FFmpeg-devel] [PATCH] AAC decoder round 8

Mon Aug 18 14:54:06 CEST 2008

On Mon, Aug 18, 2008 at 01:47:42PM +0100, Robert Swain wrote:
> 2008/8/18 Robert Swain <robert.swain at gmail.com>:
> > 2008/8/18 Robert Swain <robert.swain at gmail.com>:
> >> 2008/8/18 Robert Swain <robert.swain at gmail.com>:
> >>> 2008/8/15 Robert Swain <robert.swain at gmail.com>:
> >>>> 2008/8/15 Michael Niedermayer <michaelni at gmx.at>:
> >>>>> On Fri, Aug 15, 2008 at 09:04:24AM +0100, Robert Swain wrote:
> >>>>>> 2008/8/15 Michael Niedermayer <michaelni at gmx.at>:
> >>>>>> > On Fri, Aug 15, 2008 at 01:32:08AM +0100, Robert Swain wrote:
> >>>>>> >> $subj
> >>>>>> >>
> >>>>>> >> There's not much left to commit now! :D
> >>>>>> >
> >>>>>> > ok
> >>>>>>
> >>>>>> All committed. Just to make it easier for me and/or you to keep track
> >>>>>> of, here's another patch attached with the remaining hunks.
> >>>>>>
> >>>>>> Regards,
> >>>>>> Rob
> >>>>>
> >>>>>> Index: libavcodec/aac.c
> >>>>>> ===================================================================
> >>>>>> --- libavcodec/aac.c  (revision 14774)
> >>>>>> +++ libavcodec/aac.c  (working copy)
> >>>>> [...]
> >>>>>> @@ -605,6 +616,44 @@
> >>>>>>  }
> >>>>>>
> >>>>>>  /**
> >>>>>> + * Decode Temporal Noise Shaping data; reference: table 4.48.
> >>>>>> + *
> >>>>>> + * @return  Returns error status. 0 - OK, !0 - error
> >>>>>> + */
> >>>>>> +static int decode_tns(AACContext * ac, TemporalNoiseShaping * tns,
> >>>>>> +        GetBitContext * gb, const IndividualChannelStream * ics) {
> >>>>>
> >>>>>> +    int w, filt, i, coef_len, coef_res = 0, coef_compress;
> >>>>>
> >>>>> useless init ?
> >>>>
> >>>> Subtle, but yes.
> >>>>
> >>>>>> +    const int is8 = ics->window_sequence[0] == EIGHT_SHORT_SEQUENCE;
> >>>>>> +    const int tns_max_order = is8 ? 7 : ac->m4ac.object_type == AOT_AAC_MAIN ? 20 : 12;
> >>>>>> +    for (w = 0; w < ics->num_windows; w++) {
> >>>>>> +        tns->n_filt[w] = get_bits(gb, 2 - is8);
> >>>>>> +
> >>>>>> +        if (tns->n_filt[w])
> >>>>>> +            coef_res = get_bits1(gb) + 3;
> >>>>>> +
> >>>>>> +        for (filt = 0; filt < tns->n_filt[w]; filt++) {
> >>>>>> +            tns->length[w][filt] = get_bits(gb, 6 - 2*is8);
> >>>>>> +
> >>>>>
> >>>>>> +            if ((tns->order[w][filt] = get_bits(gb, 5 - 2*is8)) <= tns_max_order) {
> >>>>>> +                tns->direction[w][filt] = get_bits1(gb);
> >>>>>> +                coef_compress = get_bits1(gb);
> >>>>>> +                coef_len = coef_res - coef_compress;
> >>>>>> +                tns->tmp2_map[w][filt] = tns_tmp2_map[2*coef_compress + coef_res - 3];
> >>>>>
> >>>>> the 3 can be moved to "coef_len = coef_res - coef_compress + 3"
> >>>>
> >>>> But, unless I'm missing something, that will change the behaviour of
> >>>> the code as more bits will be read in the loop you quoted just below.
> >>>>
> >>>>>> +
> >>>>>> +                for (i = 0; i < tns->order[w][filt]; i++)
> >>>>>> +                    tns->coef[w][filt][i] = get_bits(gb, coef_len);
> >>>>>
> >>>>> tns->coef is only used to index into tmp2_map thus
> >>>>> tns->coef could already contain the values from tmp2_map
> >>>>> this also would make the tmp2_map field unneeded in the struct
> >>>>
> >>>> OK, I'll look into this.
> >>>>
> >>>>>> +            } else {
> >>>>>> +                av_log(ac->avccontext, AV_LOG_ERROR, "TNS filter order %d is greater than maximum %d.",
> >>>>>> +                       tns->order[w][filt], tns_max_order);
> >>>>>> +                tns->order[w][filt] = 0;
> >>>>>> +                return -1;
> >>>>>> +            }
> >>>>>
> >>>>> if(... > tns_max_order){
> >>>>>    ...
> >>>>>    return -1
> >>>>> }
> >>>>> ...
> >>>>>
> >>>>> seems cleaner to me
> >>>>
> >>>> Done.
> >>>>
> >>>>>> +        }
> >>>>>> +    }
> >>>>>> +    return 0;
> >>>>>> +}
> >>>>>> +
> >>>>>> +/**
> >>>>>>   * Decode Mid/Side data; reference: table 4.54.
> >>>>>>   *
> >>>>>>   * @param   ms_present  Indicates mid/side stereo presence. [0] mask is all 0s;
> >>>>>
> >>>>>> @@ -1067,6 +1116,71 @@
> >>>>>>  }
> >>>>>>
> >>>>>>  /**
> >>>>>> + * Decode Temporal Noise Shaping filter coefficients and apply all-pole filters; reference: 4.6.9.3.
> >>>>>> + *
> >>>>>> + * @param   decode  1 if tool is used normally, 0 if tool is used in LTP.
> >>>>>> + * @param   coef    spectral coefficients
> >>>>>> + */
> >>>>>> +static void apply_tns(float coef[1024], TemporalNoiseShaping * tns, IndividualChannelStream * ics, int decode) {
> >>>>>> +    const int mmm = FFMIN(ics->tns_max_bands,  ics->max_sfb);
> >>>>>> +    int w, filt, m, i, ib;
> >>>>>> +    int bottom, top, order, start, end, size, inc;
> >>>>>> +    float tmp;
> >>>>>> +    float lpc[TNS_MAX_ORDER + 1], b[2 * TNS_MAX_ORDER];
> >>>>>> +
> >>>>>> +    for (w = 0; w < ics->num_windows; w++) {
> >>>>>> +        bottom = ics->num_swb;
> >>>>>> +        for (filt = 0; filt < tns->n_filt[w]; filt++) {
> >>>>>> +            top = bottom;
> >>>>>> +            bottom = FFMAX(                  0, top - tns->length[w][filt]);
> >>>>>
> >>>>>> +            order  = FFMIN(tns->order[w][filt], TNS_MAX_ORDER);
> >>>>>
> >>>>> useless?
> >>>>
> >>>> Indeed. Removed. Should the 'order' variable be removed as well or is
> >>>> accessing it faster than accessing order[][]?
> >>>>
> >>>>>> +            if (order == 0)
> >>>>>> +                continue;
> >>>>>> +
> >>>>>
> >>>>>> +            // tns_decode_coef
> >>>>>> +            lpc[0] = 1;
> >>>>>> +            for (m = 1; m <= order; m++) {
> >>>>>> +                lpc[m] = tns->tmp2_map[w][filt][tns->coef[w][filt][m - 1]];
> >>>>>> +                for (i = 1; i < m; i++)
> >>>>>> +                    b[i] = lpc[i] + lpc[m] * lpc[m-i];
> >>>>>> +                for (i = 1; i < m; i++)
> >>>>>> +                    lpc[i] = b[i];
> >>>>>> +            }
> >>>>>
> >>>>> looks a little like eval_coefs from ra144.c
> >>>>> but later is fixedpoint, so this is more a random comment than anything
> >>>>>
> >>>>>
> >>>>>> +
> >>>>>> +            start = ics->swb_offset[FFMIN(bottom, mmm)];
> >>>>>> +            end   = ics->swb_offset[FFMIN(   top, mmm)];
> >>>>>> +            if ((size = end - start) <= 0)
> >>>>>> +                continue;
> >>>>>> +            if (tns->direction[w][filt]) {
> >>>>>> +                inc = -1; start = end - 1;
> >>>>>> +            } else {
> >>>>>> +                inc = 1;
> >>>>>> +            }
> >>>>>> +            start += w * 128;
> >>>>>> +
> >>>>>> +            // ar filter
> >>>>>> +            memset(b, 0, sizeof(b));
> >>>>>> +            ib = 0;
> >>>>>
> >>>>>> +            for (m = 0; m < size; m++) {
> >>>>>> +                tmp = coef[start];
> >>>>>> +                if (decode) {
> >>>>>> +                    for (i = 0; i < order; i++)
> >>>>>> +                        tmp -= b[ib + i] * lpc[i + 1];
> >>>>>> +                } else { // encode
> >>>>>> +                    for (i = 0; i < order; i++)
> >>>>>> +                        tmp += b[i]      * lpc[i + 1];
> >>>>>> +                }
> >>>>>> +                if (--ib < 0)
> >>>>>> +                    ib = order - 1;
> >>>>>> +                b[ib] = b[ib + order] = tmp;
> >>>>>> +                coef[start] = tmp;
> >>>>>> +                start += inc;
> >>>>>> +            }
> >>>>>
> >>>>> decode is always 1
> >>>>
> >>>> This is left from LTP. I'll remove it when submitting next.
> >>>>
> >>>>> b is not truly needed, coef[] can be used i its place
> >>>>> also this is likely relevant to overal codec speed so it
> >>>>> should be written more with speed than compactness in mind
> >>>>
> >>>> Does that mean you want me to rewrite it?
> >>>
> >>> Without the encode case it would look like this, I think:
> >>>
> >>> memset(b, 0, sizeof(b));
> >>> ib = 0;
> >>> for (m = 0; m < size; m++) {
> >>>    tmp = coef[start];
> >>>    for (i = 0; i < order; i++)
> >>>        tmp -= b[i] * lpc[i + 1];
> >>>    if (--ib < 0)
> >>>        ib = order - 1;
> >>>    coef[start] = b[ib] = tmp;
> >>>    start += inc;
> >>> }
> >>>
> >>> Then I don't think tmp is needed or b as none of the values of coeff
> >>> but the one currently being filtered are changed and that one is only
> >>> subtracted from, it's not involved in any calculations, so I think the
> >>> following is correct:
> >>>
> >>> for (m = 0; m < size; m++) {
> >>>    for (i = 0; i < FFMIN(m, order); i++)
> >>>        coef[start] -= coef[start - 1 - i] * lpc[i + 1];
> >>>    start += inc;
> >>> }
> >>
> >> Make that:
> >>
> >> for (m = 0; m < size; m++) {
> >>   for (i = 1; i <= FFMIN(m, order); i++)
> >>       coef[start] -= coef[start - i] * lpc[i];
> >>   start += inc;
> >> }
> >
> > Or, if you like, the start incrementation can be merged into the outer
> > for() loop's incrementation stuff. What is that bit technically
> > called? The m++ section.
> 
> See attached. Tested with the zodiac clip which is known to use TNS
> and produces identical PSNR, so I'm assuming identical output with no
> known regressions.
[...]
> +            for (m = 0; m < size; m++, start += inc)
> +                for (i = 1; i <= FFMIN(m, order); i++)
> +                    coef[start] -= coef[start - i] * lpc[i];
>          }

are you sure this is correct for both values of inc ?
i mean one will have coef[start - i] unfiltered and one filtered as far as
i can see ...

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

When you are offended at any man's fault, turn to yourself and study your
own failings. Then you will forget your anger. -- Epictetus
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: Digital signature
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20080818/da20916a/attachment.pgp>