[FFmpeg-devel] [PATCH] WMA Voice decoder

Tue Feb 9 19:21:31 CET 2010

Ronald S. Bultje wrote:
> Hi,
> 
> some more:
> 
> On Sat, Feb 6, 2010 at 11:19 AM, Vitor Sessak <vitor1001 at gmail.com> wrote:
>> Ronald S. Bultje wrote:
>>> +static void dequant_lsp10i(GetBitContext *gb, double *lsps)
>>> +{
>>> +    static const uint16_t vec_sizes[4] = { 256, 64, 32, 32 };
>>> +    static const double mul_lsf[4] = {
>>> +        5.2187144800e-3,    1.4626986422e-3,
>>> +        9.6179549166e-4,    1.1325736225e-3
>>> +    };
>>> +    static const double base_lsf[4] = {
>>> +        M_PI * -2.15522e-1, M_PI * -6.1646e-2,
>>> +        M_PI * -3.3486e-2,  M_PI * -5.7408e-2
>>> +    };
>>>
>> I think just putting the values multiplied by pi would be more readable.
> 
> Reason I did it this way (here, at least) is that the values are
> actually exact (as far as I can see), so this is the most exact
> representation I can give of the number. Multiplied by pi, it'd always
> be rounded to something depending on how I type it.
> 
> If that's a big thing, I'll change it, I just figured I'd explain the
> reason why I did it this way.

No big opinion on my part also, it is just that the compiler will 
truncate it to double anyway.

>>> +#define NO_OFFSET -255
> [..]
>>> +    if (n < 11)
>>> +        memset(&off_table[n], -1, (11 - n) * sizeof(int));
>> -1 or NO_OFFSET?
> 
> I changed the code so in this particular function, any value < 0 is
> invalid (later, I check if (table_off[x] >= 0), so it's the same. Only
> for WMAVoiceContext->aw_first_pulse_off[0-1], NO_OFFSET is a specific
> value (since negative offsets can be OK here).

I don't really like neither (negative values == unused or NO_OFFSET == 
unused), but in this case I see no obvious way of avoiding it.

>>> +    s->aw_first_pulse_off[0] = s->aw_first_pulse_off[1] = NO_OFFSET;
>> Is this initialization really needed?
> 
> Yes, since each of the two blocks can have no values at all (if
> aw_n_pulses[0-1] == 0). That happens occasionally. I could move it to
> the end under if (aw_n_pulses[] == 0), but thought that was ugly.

Yes, then it is better as it is now.

>>> +    switch (frame_descs[bd_idx].acb_type) {
>>> +    case ACB_TYPE_NONE:
>>> +        memset(pitch, 0, sizeof(pitch[0]) *
>>> frame_descs[bd_idx].n_blocks);
>>>
>> If it is unused, is it really needed to waste time zero'ing it out?
> 
> It prevents a valgrind warning because I specify it as an (in this
> case unused) function argument to synth_block(). One way around that,
> which you use in sipro as well, is to create a WMAFrameParams struct,
> but I didn't do that yet. That could also house stuff that doesn't
> need to be cached between frames but is currently in WMAVoiceContext,
> such as silence_gain. Shall I do that instead?

No, I like the code as it is now, but one thing:

>     /*
>      * Pitch (per ACB type):
>      * - type 0: unused
>      * - type 1: provided (globally) for the whole frame. In #synth_block(),
>      *            we derive the "pitch-per-sample" for adaptive codebook
>      *            reading.
>      * - type 2: provided per block (see just before the call to
>      *            #synth_block()), so not read here.
>      */
>     switch (frame_descs[bd_idx].acb_type) {
>     case ACB_TYPE_NONE:
>         memset(pitch, 0, sizeof(pitch[0]) * frame_descs[bd_idx].n_blocks);
>         break;
>     case ACB_TYPE_ASYMMETRIC:
>         n_blocks_x2      = frame_descs[bd_idx].n_blocks << 1;
>         log_n_blocks_x2  = frame_descs[bd_idx].log_n_blocks + 1;
>         cur_pitch_val    = s->min_pitch_val + get_bits(gb, s->pitch_nbits);
>         if (s->last_acb_type == ACB_TYPE_NONE ||
>             20 * abs(cur_pitch_val - s->last_pitch_val) >
>                 (cur_pitch_val + s->last_pitch_val))
>             s->last_pitch_val = cur_pitch_val;
> 
>         /* pitch per frame/block */
>         for (n = 0; n < frame_descs[bd_idx].n_blocks; n++) {
>             int fac = n * 2 + 1;
> 
>             pitch[n] = (MUL16(fac,                 cur_pitch_val) +
>                         MUL16((n_blocks_x2 - fac), s->last_pitch_val) +
>                         frame_descs[bd_idx].n_blocks) >> log_n_blocks_x2;
>         }

Here you calculate pitch[] for ACB_TYPE_NONE and ACB_TYPE_ASYMMETRIC

>     for (n = 0; n < frame_descs[bd_idx].n_blocks; n++) {
>         int bl_pitch_sh2 = pitch[n] << 2;
> 
>         /*
>          * If pitch is given per block, parse that first. Per-block pitches
>          * are encoded as an absolute value for the first block, and then
>          * delta values for all subsequent blocks. The scale of this value
>          * is semi-logarithmic compared to normal scale, so convert also.
>          */
>         if (frame_descs[bd_idx].acb_type == ACB_TYPE_HAMMING) {
>             int block_pitch,
>                 t1 = (s->block_conv_table[1] - s->block_conv_table[0]) << 2,
>                 t2 = (s->block_conv_table[2] - s->block_conv_table[1]) << 1,
>                 t3 =  s->block_conv_table[3] - s->block_conv_table[2] + 1;
> 
>             if (n == 0) {
>                 block_pitch = get_bits(gb, s->block_pitch_nbits);
>             } else
>                 block_pitch = last_block_pitch - s->block_delta_pitch_hrange +
>                                  get_bits(gb, s->block_delta_pitch_nbits);
>             /* Convert last_ so that any next delta is within _range */
>             last_block_pitch = av_clip(block_pitch,
>                                        s->block_delta_pitch_hrange,
>                                        s->block_pitch_range -
>                                            s->block_delta_pitch_hrange);
> 
>             /* Convert semi-log-style scale back to normal scale */
>             if (block_pitch < t1) {
>                 bl_pitch_sh2 = (s->block_conv_table[0] << 2) + block_pitch;
>             } else {
>                 block_pitch -= t1;
>                 if (block_pitch < t2) {
>                     bl_pitch_sh2 =
>                         (s->block_conv_table[1] << 2) + (block_pitch << 1);
>                 } else {
>                     block_pitch -= t2;
>                     if (block_pitch < t3) {
>                         bl_pitch_sh2 =
>                             (s->block_conv_table[2] + block_pitch) << 2;
>                     } else
>                         bl_pitch_sh2 = s->block_conv_table[3] << 2;
>                 }
>             }
>             pitch[n] = bl_pitch_sh2 >> 2;
>         }

And here you calculate it for ACB_TYPE_HAMMING. It is kind of inconsistent.

-Vitor