[FFmpeg-devel] [PATCH] WMA Voice decoder
Vitor Sessak
vitor1001
Tue Feb 9 19:21:31 CET 2010
Ronald S. Bultje wrote:
> Hi,
>
> some more:
>
> On Sat, Feb 6, 2010 at 11:19 AM, Vitor Sessak <vitor1001 at gmail.com> wrote:
>> Ronald S. Bultje wrote:
>>> +static void dequant_lsp10i(GetBitContext *gb, double *lsps)
>>> +{
>>> + static const uint16_t vec_sizes[4] = { 256, 64, 32, 32 };
>>> + static const double mul_lsf[4] = {
>>> + 5.2187144800e-3, 1.4626986422e-3,
>>> + 9.6179549166e-4, 1.1325736225e-3
>>> + };
>>> + static const double base_lsf[4] = {
>>> + M_PI * -2.15522e-1, M_PI * -6.1646e-2,
>>> + M_PI * -3.3486e-2, M_PI * -5.7408e-2
>>> + };
>>>
>> I think just putting the values multiplied by pi would be more readable.
>
> Reason I did it this way (here, at least) is that the values are
> actually exact (as far as I can see), so this is the most exact
> representation I can give of the number. Multiplied by pi, it'd always
> be rounded to something depending on how I type it.
>
> If that's a big thing, I'll change it, I just figured I'd explain the
> reason why I did it this way.
No big opinion on my part also, it is just that the compiler will
truncate it to double anyway.
>>> +#define NO_OFFSET -255
> [..]
>>> + if (n < 11)
>>> + memset(&off_table[n], -1, (11 - n) * sizeof(int));
>> -1 or NO_OFFSET?
>
> I changed the code so in this particular function, any value < 0 is
> invalid (later, I check if (table_off[x] >= 0), so it's the same. Only
> for WMAVoiceContext->aw_first_pulse_off[0-1], NO_OFFSET is a specific
> value (since negative offsets can be OK here).
I don't really like neither (negative values == unused or NO_OFFSET ==
unused), but in this case I see no obvious way of avoiding it.
>>> + s->aw_first_pulse_off[0] = s->aw_first_pulse_off[1] = NO_OFFSET;
>> Is this initialization really needed?
>
> Yes, since each of the two blocks can have no values at all (if
> aw_n_pulses[0-1] == 0). That happens occasionally. I could move it to
> the end under if (aw_n_pulses[] == 0), but thought that was ugly.
Yes, then it is better as it is now.
>>> + switch (frame_descs[bd_idx].acb_type) {
>>> + case ACB_TYPE_NONE:
>>> + memset(pitch, 0, sizeof(pitch[0]) *
>>> frame_descs[bd_idx].n_blocks);
>>>
>> If it is unused, is it really needed to waste time zero'ing it out?
>
> It prevents a valgrind warning because I specify it as an (in this
> case unused) function argument to synth_block(). One way around that,
> which you use in sipro as well, is to create a WMAFrameParams struct,
> but I didn't do that yet. That could also house stuff that doesn't
> need to be cached between frames but is currently in WMAVoiceContext,
> such as silence_gain. Shall I do that instead?
No, I like the code as it is now, but one thing:
> /*
> * Pitch (per ACB type):
> * - type 0: unused
> * - type 1: provided (globally) for the whole frame. In #synth_block(),
> * we derive the "pitch-per-sample" for adaptive codebook
> * reading.
> * - type 2: provided per block (see just before the call to
> * #synth_block()), so not read here.
> */
> switch (frame_descs[bd_idx].acb_type) {
> case ACB_TYPE_NONE:
> memset(pitch, 0, sizeof(pitch[0]) * frame_descs[bd_idx].n_blocks);
> break;
> case ACB_TYPE_ASYMMETRIC:
> n_blocks_x2 = frame_descs[bd_idx].n_blocks << 1;
> log_n_blocks_x2 = frame_descs[bd_idx].log_n_blocks + 1;
> cur_pitch_val = s->min_pitch_val + get_bits(gb, s->pitch_nbits);
> if (s->last_acb_type == ACB_TYPE_NONE ||
> 20 * abs(cur_pitch_val - s->last_pitch_val) >
> (cur_pitch_val + s->last_pitch_val))
> s->last_pitch_val = cur_pitch_val;
>
> /* pitch per frame/block */
> for (n = 0; n < frame_descs[bd_idx].n_blocks; n++) {
> int fac = n * 2 + 1;
>
> pitch[n] = (MUL16(fac, cur_pitch_val) +
> MUL16((n_blocks_x2 - fac), s->last_pitch_val) +
> frame_descs[bd_idx].n_blocks) >> log_n_blocks_x2;
> }
Here you calculate pitch[] for ACB_TYPE_NONE and ACB_TYPE_ASYMMETRIC
> for (n = 0; n < frame_descs[bd_idx].n_blocks; n++) {
> int bl_pitch_sh2 = pitch[n] << 2;
>
> /*
> * If pitch is given per block, parse that first. Per-block pitches
> * are encoded as an absolute value for the first block, and then
> * delta values for all subsequent blocks. The scale of this value
> * is semi-logarithmic compared to normal scale, so convert also.
> */
> if (frame_descs[bd_idx].acb_type == ACB_TYPE_HAMMING) {
> int block_pitch,
> t1 = (s->block_conv_table[1] - s->block_conv_table[0]) << 2,
> t2 = (s->block_conv_table[2] - s->block_conv_table[1]) << 1,
> t3 = s->block_conv_table[3] - s->block_conv_table[2] + 1;
>
> if (n == 0) {
> block_pitch = get_bits(gb, s->block_pitch_nbits);
> } else
> block_pitch = last_block_pitch - s->block_delta_pitch_hrange +
> get_bits(gb, s->block_delta_pitch_nbits);
> /* Convert last_ so that any next delta is within _range */
> last_block_pitch = av_clip(block_pitch,
> s->block_delta_pitch_hrange,
> s->block_pitch_range -
> s->block_delta_pitch_hrange);
>
> /* Convert semi-log-style scale back to normal scale */
> if (block_pitch < t1) {
> bl_pitch_sh2 = (s->block_conv_table[0] << 2) + block_pitch;
> } else {
> block_pitch -= t1;
> if (block_pitch < t2) {
> bl_pitch_sh2 =
> (s->block_conv_table[1] << 2) + (block_pitch << 1);
> } else {
> block_pitch -= t2;
> if (block_pitch < t3) {
> bl_pitch_sh2 =
> (s->block_conv_table[2] + block_pitch) << 2;
> } else
> bl_pitch_sh2 = s->block_conv_table[3] << 2;
> }
> }
> pitch[n] = bl_pitch_sh2 >> 2;
> }
And here you calculate it for ACB_TYPE_HAMMING. It is kind of inconsistent.
-Vitor
More information about the ffmpeg-devel
mailing list