[FFmpeg-devel] Review request - ra288.{c,h} ra144.{c,h}

Sun Sep 14 20:17:18 CEST 2008

Michael Niedermayer wrote:
> On Sun, Sep 14, 2008 at 05:55:16PM +0200, Vitor Sessak wrote:
>> Michael Niedermayer wrote:
>>> On Sat, Sep 13, 2008 at 09:48:46PM +0200, Vitor Sessak wrote:
>>>> Michael Niedermayer wrote:
>>>>> On Sat, Sep 13, 2008 at 07:07:26PM +0200, Vitor Sessak wrote:
>>>>>> Michael Niedermayer wrote:
>>>>>>> On Fri, Sep 05, 2008 at 12:23:58AM +0200, Vitor Sessak wrote:
>>>>>>>> Vitor Sessak wrote:
>>>>> [...]
>>>>>>> [...]
>>>>>>>> static void colmult(float *tgt, const float *m1, const float *m2, int n)
>>>>>>>> {
>>>>>>>>     while (n--)
>>>>>>>>         *tgt++ = *m1++ * *m2++;
>>>>>>>> }
>>>>>>> such function is commonly called apply_window() in other codecs
>>>>>>>> static void decode(RA288Context *ractx, float gain, int cb_coef)
>>>>>>>> {
>>>>>>>>     int i, j;
>>>>>>>>     double sumsum;
>>>>>>>>     float sum, buffer[5];
>>>>>>>>     float *block = ractx->sp_block + 36; // Current block
>>>>>>>>
>>>>>>>>     memmove(ractx->sp_block, ractx->sp_block + 5, 
>>>>>>>> 36*sizeof(*ractx->sp_block));
>>>>>>>>
>>>>>>>>     for (i=0; i < 5; i++) {
>>>>>>>>         block[i] = 0.;
>>>>>>>>         for (j=0; j < 36; j++)
>>>>>>>>             block[i] -= block[i-1-j]*ractx->sp_lpc[j];
>>>>>>>>     }
>>>>>>>>
>>>>>>>>     /* block 46 of G.728 spec */
>>>>>>>>     sum = 32.;
>>>>>>>>     for (i=0; i < 10; i++)
>>>>>>>>         sum -= ractx->gain_block[9-i] * ractx->gain_lpc[i];
>>>>>>>>
>>>>>>>>     /* block 47 of G.728 spec */
>>>>>>>>     sum = av_clipf(sum, 0, 60);
>>>>>>>>
>>>>>>>>     /* block 48 of G.728 spec */
>>>>>>>>     sumsum = exp(sum * 0.1151292546497) * gain; /* pow(10.0,sum/20)*gain 
>>>>>>>> */
>>>>>>>>
>>>>>>>>     for (i=0; i < 5; i++)
>>>>>>>>         buffer[i] = codetable[cb_coef][i] * sumsum;
>>>>>>>>
>>>>>>>>     sum = scalar_product_float(buffer, buffer, 5) / 5;
>>>>>>>>
>>>>>>>>     sum = FFMAX(sum, 1);
>>>>>>>>
>>>>>>>>     /* shift and store */
>>>>>>>>     memmove(ractx->gain_block, ractx->gain_block + 1,
>>>>>>>>             9 * sizeof(*ractx->gain_block));
>>>>>>>>
>>>>>>>>     ractx->gain_block[9] = 10 * log10(sum) - 32;
>>>>>>>>
>>>>>>>>     for (i=1; i < 5; i++)
>>>>>>>>         for (j=i-1; j >= 0; j--)
>>>>>>>>             buffer[i] -= ractx->sp_lpc[i-j-1] * buffer[j];
>>>>>>>>
>>>>>>>>     /* output */
>>>>>>>>     for (i=0; i < 5; i++)
>>>>>>>>         block[i] = av_clipf(block[i] + buffer[i], -4095, 4095);
>>>>>>> can the buffer values be stored in block and sp_lpc applied over both
>>>>>>> in one pass instead of this 2 pass and add-clip thing?
>>>>>> I can't apply sp_lpc to buffer+block, so I need two buffers...
>>>>> What i was thinking about was:
>>>>>
>>>>>     /* block 46 of G.728 spec */
>>>>>     sum = 32.;
>>>>>     for (i=0; i < 10; i++)
>>>>>         sum -= gain_block[9-i] * ractx->gain_lpc[i];
>>>>>
>>>>>     /* block 47 of G.728 spec */
>>>>>     sum = av_clipf(sum, 0, 60);
>>>>>
>>>>>     /* block 48 of G.728 spec */
>>>>>     sumsum = exp(sum * 0.1151292546497) * gain; /* pow(10.0,sum/20)*gain */
>>>>>
>>>>>     for (i=0; i < 5; i++)
>>>>>         buffer[i] = codetable[cb_coef][i] * sumsum * (1./2048.);
>>>>>
>>>>>     sum = scalar_product_float(buffer, buffer, 5) / 5;
>>>>>
>>>>>     sum = FFMAX(sum, 1);
>>>>>
>>>>>     /* shift and store */
>>>>>     memmove(gain_block, gain_block + 1, 9 * sizeof(*gain_block));
>>>>>
>>>>>     gain_block[9] = 10 * log10(sum) - 32;
>>>>>
>>>>>     for (i=0; i < 5; i++) {
>>>>>         block[i] = buffer[i];
>>>> Here you are overwriting the value of block[i] (while previous code used 
>>>> this value).
>>> previous code did:
>>> for (i=0; i < 5; i++) {
>>>     block[i] = 0.;
>>>
>>> so that certainly was not useing it
>> Ok, the problem is the following. Now we have (moving down a few loops):
>>
>>      for (i=1; i < 5; i++)
>>          for (j=i-1; j >= 0; j--)
>>              buffer[i] -= ractx->sp_lpc[i-j-1] * buffer[j];
>>
>>      for (i=0; i < 5; i++) {
>>          block[i] = 0.;
>>          for (j=0; j < 36; j++)
>>              block[i] -= block[i-1-j]*ractx->sp_lpc[j];
>>      }
>>
>>      /* output */
>>      for (i=0; i < 5; i++)
>>          block[i] = av_clipf(block[i] + buffer[i], -4095, 4095);
>>
>> And I cannot change block[i] = 0. to block[i] = buffer[i] because 
>> block[0] will be read for evaluating block[1] in the second loop.
> 
> no and yes
> 
> the second loop applies a LPC filter to block assuming the future samples
> are 0
> the first loop applies the same LPC filter to buffer assuming the previous
> samples are 0
> the third loop combines them so that they match as if a LPC filter had
> been applied to their sum i think.
> 
> 
> 
>>>>>>> [...]
>>>>>>>> static int ra288_decode_frame(AVCodecContext * avctx, void *data,
>>>>>>>>                               int *data_size, const uint8_t * buf,
>>>>>>>>                               int buf_size)
>>>>>>>> {
>>>>>>>>     int16_t *out = data;
>>>>>>>>     int i, j;
>>>>>>>>     RA288Context *ractx = avctx->priv_data;
>>>>>>>>     GetBitContext gb;
>>>>>>>>
>>>>>>>>     if (buf_size < avctx->block_align) {
>>>>>>>>         av_log(avctx, AV_LOG_ERROR,
>>>>>>>>                "Error! Input buffer is too small [%d<%d]\n",
>>>>>>>>                buf_size, avctx->block_align);
>>>>>>>>         return 0;
>>>>>>>>     }
>>>>>>>>
>>>>>>>>     if (*data_size < 32*5*2)
>>>>>>>>         return -1;
>>>>>>>>
>>>>>>>>     init_get_bits(&gb, buf, avctx->block_align * 8);
>>>>>>>>
>>>>>>>>     for (i=0; i < 32; i++) {
>>>>>>>>         float gain = amptable[get_bits(&gb, 3)];
>>>>>>>>         int cb_coef = get_bits(&gb, 6 + (i&1));
>>>>>>>>
>>>>>>>>         decode(ractx, gain, cb_coef);
>>>>>>>>
>>>>>>>>         for (j=0; j < 5; j++)
>>>>>>>>             *(out++) = 8 * ractx->sp_block[36 + j];
>>>>>>> if float output works already, then this could output floats, if not then
>>>>>>> this could use lrintf()
>>>>>> I've tried the float output (with the attached patch) and it didn't work. 
>>>>> ok
>>>>>
>>>>>
>>>>>> Using lrint() changes slightly the output (PSNR about 99), is it expected?
>>>>> yes, it does round differently (=more correctly)
>>>> Too correct maybe. PSNR to binary decoder with SVN:
>>>>
>>>> stddev:    0.15 PSNR:112.70 bytes:   990720/  1013760
>>>> stddev:    0.04 PSNR:122.74 bytes:   368640/   368640
>>>> stddev:    0.07 PSNR:118.84 bytes:   460800/   458752
>>>> stddev:    0.31 PSNR:106.24 bytes:  6451200/  6451200
>>>>
>>>> Using lrint()
>>>>
>>>> stddev:    0.70 PSNR: 99.33 bytes:   990720/  1013760
>>>> stddev:    0.70 PSNR: 99.35 bytes:   368640/   368640
>>>> stddev:    0.70 PSNR: 99.35 bytes:   460800/   458752
>>>> stddev:    0.75 PSNR: 98.76 bytes:  6451200/  6451200
>>> yes, the rounding is more accurate, and differs by +-1 50% of the time from
>>> the binary decoder, sqrt(0.5) ~ 0.7
>>>
>>> If you want a proof that it is better, you should compare the original
>>> pcm that is
>>>
>>> X -> encoder -> binary decoder -> Y
>>>              -> FF decoder ->Z
>>>
>>> and look at how the X-Y and X-Z change relative to each other.
>>>
>>> Also you would see a similar PSNR change relative to the binary decoder if
>>> you would output floats.
>> I've already tried comparing PSNR to the original input when I was 
>> looking for a way to test float codecs in FATE.
>>
>> vitor at vitor$ ffmpeg -i luckynightmono2.ra -ac 1 -ar 8000 test.wav
>> vitor at vitor$ ffmpeg -i luckynight.wav -ac 1 -ar 8000 test2.wav
>> vitor at vitor$ tiny_psnr test.wav test2.wav 2 0 44
>> stddev: 5981.39 PSNR: 20.78 bytes:   990720/   967662
>> vitor at vitor$ tiny_psnr test.wav test2.wav 2 2 44
>> stddev: 5982.77 PSNR: 20.78 bytes:   990718/   967662
>> vitor at vitor$ tiny_psnr test.wav test2.wav 2 100 44
>> stddev: 6012.76 PSNR: 20.74 bytes:   990620/   967662
>>
>> And by looking at results, if I change the "skip bytes" parameter I 
>> don't get much change in PSNR. For me, this is a signal that the value I 
>> got is meaningless (since it don't change a lot if I compare it with 
>> different data). I asked about it in IRC and people told me that PSNR 
>> didn't worked very well to LPC vocoders. Sample in 
>> http://samples.mplayerhq.hu/real/AC-28_8/ .
> 
> considering that the claimed encoder input has
> 10668716 bytes of 44.1khz at stereo
> and that /2/44100*8000 is ~967684
> and the ra288 decoder output has 990764 bytes i cant help but wonder
> why, but of course this is incompareable. PSNR or otherwise

Yes, the files have different sizes. That's why I started poking with 
"skip bytes" and tried to cut the files. But I didn't succeeded in 
making they match whatever I did.

-Vitor