[FFmpeg-devel] Review request - ra288.{c,h} ra144.{c,h}

Vitor Sessak vitor1001
Wed Sep 17 21:26:40 CEST 2008


Michael Niedermayer wrote:
> On Tue, Sep 16, 2008 at 08:23:19PM +0200, Vitor Sessak wrote:
>> Michael Niedermayer wrote:
>>> On Mon, Sep 15, 2008 at 07:49:41PM +0200, Vitor Sessak wrote:
>>>> Michael Niedermayer wrote:
>>>>> On Sun, Sep 14, 2008 at 11:29:08PM +0200, Vitor Sessak wrote:
>>>>>> Michael Niedermayer wrote:
>>>>>>> On Sun, Sep 14, 2008 at 08:17:18PM +0200, Vitor Sessak wrote:
>>>>>>>> Michael Niedermayer wrote:
>>>>>>>>> On Sun, Sep 14, 2008 at 05:55:16PM +0200, Vitor Sessak wrote:
>>>>>>> [...]
>>>>>>>>>>>>>>> [...]
>>>>>>>>>>>>>>>> static int ra288_decode_frame(AVCodecContext * avctx, void *data,
>>>>>>>>>>>>>>>>                               int *data_size, const uint8_t * buf,
>>>>>>>>>>>>>>>>                               int buf_size)
>>>>>>>>>>>>>>>> {
>>>>>>>>>>>>>>>>     int16_t *out = data;
>>>>>>>>>>>>>>>>     int i, j;
>>>>>>>>>>>>>>>>     RA288Context *ractx = avctx->priv_data;
>>>>>>>>>>>>>>>>     GetBitContext gb;
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>     if (buf_size < avctx->block_align) {
>>>>>>>>>>>>>>>>         av_log(avctx, AV_LOG_ERROR,
>>>>>>>>>>>>>>>>                "Error! Input buffer is too small [%d<%d]\n",
>>>>>>>>>>>>>>>>                buf_size, avctx->block_align);
>>>>>>>>>>>>>>>>         return 0;
>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>     if (*data_size < 32*5*2)
>>>>>>>>>>>>>>>>         return -1;
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>     init_get_bits(&gb, buf, avctx->block_align * 8);
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>     for (i=0; i < 32; i++) {
>>>>>>>>>>>>>>>>         float gain = amptable[get_bits(&gb, 3)];
>>>>>>>>>>>>>>>>         int cb_coef = get_bits(&gb, 6 + (i&1));
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>         decode(ractx, gain, cb_coef);
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>         for (j=0; j < 5; j++)
>>>>>>>>>>>>>>>>             *(out++) = 8 * ractx->sp_block[36 + j];
>>>>>>>>>>>>>>> if float output works already, then this could output floats, if not then
>>>>>>>>>>>>>>> this could use lrintf()
>>>>>>>>>>>>>> I've tried the float output (with the attached patch) and it didn't work. 
>>>>>>>>>>>>> ok
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Using lrint() slightly changes the output (PSNR about 99). Is that expected?
>>>>>>>>>>>>> yes, it does round differently (=more correctly)
>>>>>>>>>>>> Too correct maybe. PSNR to binary decoder with SVN:
>>>>>>>>>>>>
>>>>>>>>>>>> stddev:    0.15 PSNR:112.70 bytes:   990720/  1013760
>>>>>>>>>>>> stddev:    0.04 PSNR:122.74 bytes:   368640/   368640
>>>>>>>>>>>> stddev:    0.07 PSNR:118.84 bytes:   460800/   458752
>>>>>>>>>>>> stddev:    0.31 PSNR:106.24 bytes:  6451200/  6451200
>>>>>>>>>>>>
>>>>>>>>>>>> Using lrint()
>>>>>>>>>>>>
>>>>>>>>>>>> stddev:    0.70 PSNR: 99.33 bytes:   990720/  1013760
>>>>>>>>>>>> stddev:    0.70 PSNR: 99.35 bytes:   368640/   368640
>>>>>>>>>>>> stddev:    0.70 PSNR: 99.35 bytes:   460800/   458752
>>>>>>>>>>>> stddev:    0.75 PSNR: 98.76 bytes:  6451200/  6451200
>>>>>>>>>>> yes, the rounding is more accurate, and differs by +-1 50% of the time from
>>>>>>>>>>> the binary decoder, sqrt(0.5) ~ 0.7
>>>>>>>>>>>
>>>>>>>>>>> If you want a proof that it is better, you should compare the original
>>>>>>>>>>> pcm that is
>>>>>>>>>>>
>>>>>>>>>>> X -> encoder -> binary decoder -> Y
>>>>>>>>>>>              -> FF decoder ->Z
>>>>>>>>>>>
>>>>>>>>>>> and look at how the X-Y and X-Z change relative to each other.
>>>>>>>>>>>
>>>>>>>>>>> Also you would see a similar PSNR change relative to the binary decoder if
>>>>>>>>>>> you would output floats.
>>>>>>>>>> I've already tried comparing PSNR to the original input when I was 
>>>>>>>>>> looking for a way to test float codecs in FATE.
>>>>>>>>>>
>>>>>>>>>> vitor@vitor$ ffmpeg -i luckynightmono2.ra -ac 1 -ar 8000 test.wav
>>>>>>>>>> vitor@vitor$ ffmpeg -i luckynight.wav -ac 1 -ar 8000 test2.wav
>>>>>>>>>> vitor@vitor$ tiny_psnr test.wav test2.wav 2 0 44
>>>>>>>>>> stddev: 5981.39 PSNR: 20.78 bytes:   990720/   967662
>>>>>>>>>> vitor@vitor$ tiny_psnr test.wav test2.wav 2 2 44
>>>>>>>>>> stddev: 5982.77 PSNR: 20.78 bytes:   990718/   967662
>>>>>>>>>> vitor@vitor$ tiny_psnr test.wav test2.wav 2 100 44
>>>>>>>>>> stddev: 6012.76 PSNR: 20.74 bytes:   990620/   967662
>>>>>>>>>>
>>>>>>>>>> Looking at the results, changing the "skip bytes" parameter barely
>>>>>>>>>> changes the PSNR. To me, this is a sign that the value I got is
>>>>>>>>>> meaningless (since it doesn't change much when I compare it with
>>>>>>>>>> different data). I asked about it on IRC and people told me that PSNR
>>>>>>>>>> doesn't work very well for LPC vocoders. Sample at
>>>>>>>>>> http://samples.mplayerhq.hu/real/AC-28_8/ .
>>>>>>>>> considering that the claimed encoder input has
>>>>>>>>> 10668716 bytes of 44.1 kHz stereo
>>>>>>>>> and that 10668716/2/44100*8000 is ~967684
>>>>>>>>> and the ra288 decoder output has 990764 bytes, I can't help but wonder
>>>>>>>>> why. But of course the files are not comparable, by PSNR or otherwise.
>>>>>>>> Yes, the files have different sizes. That's why I started poking at
>>>>>>>> "skip bytes" and tried to cut the files. But I didn't succeed in
>>>>>>>> making them match, whatever I did.
>>>>>>> how has the .ra file been generated?
>>>>>>> what happens with a 2x as long input file? does the size difference
>>>>>>> stay constant or grow?
>>>>>>>
>>>>>>> what does the binary decoder produce for it? is that also too big?
>>>>>> Original     wav:  967706 bytes
>>>>>> FFmpeg   decoder:  990764 bytes
>>>>>> Original decoder: 1013804 bytes
>>>>>>
>>>>>> Go figure...
>>>>> the decoder outputs 3 seconds more than what is in the claimed original.
>>>>> How does it sound? Is the audio stretched to the bigger length, or are
>>>>> there 3 seconds of distortion or silence somewhere?
>>>> Original     wav:  967706 bytes
>>>> FFmpeg   decoder:  990764 bytes   1 second  of silence in the end
>>>> Original decoder: 1013804 bytes   3 seconds of silence in the end
>>>>
>>>> Anyway, none of that explains the PSNR discrepancy...
>>> ok, so let's forget about the PSNR, and rather try a simpler test for
>>> the accuracy: just try to cast a float to an int and try lrintf()
>>> and print the differences, or the sum of squared differences. It should
>>> be obvious which is more accurate.
>> The problem is not just when using PSNR, but I also fail to see any 
>> similarity between the two files in a hex editor.
> 
> PEBCAK
> 
> looking with gnuplot at the files shows clearly that they are near identical
> but drift, creating one with -ar 8018 and using
> tiny_psnr luckymonora.wav luckymono2.wav 2 34 44
> shows:
> stddev: 3474.96 PSNR: 25.50 bytes:   990686/   969838
> 
> which cuts the difference to half of what you had, but that's not how
> one compares files.
> We need an 8 kHz wav and the corresponding ra288; a 44 kHz file that was
> "somehow, we don't know" resampled to 8 kHz is not usable.

Ok, 10l for me. If I resample before converting to ra, I get much better 
results:

stddev:  644.58 PSNR: 40.13 bytes:   967662/   967680

and with lrintf()

stddev:  644.70 PSNR: 40.13 bytes:   967662/   967680

which I'd say is worse, but by a negligible amount. I've also finally 
found out how to make SAMPLE_FMT_FLT work (which gives me the same 
output as lrintf()), so I changed to that.

-Vitor



