[FFmpeg-devel] [PATCH] G.729 Frame erasure support for fixed-codebook vector decoding

Vladimir Voroshilov voroshil
Sun Jun 28 06:34:51 CEST 2009


2009/6/28 Michael Niedermayer <michaelni at gmx.at>:
> On Sun, Jun 28, 2009 at 02:01:54AM +0700, Vladimir Voroshilov wrote:
>> 2009/6/28 Michael Niedermayer <michaelni at gmx.at>:
>> > On Sat, Jun 27, 2009 at 09:43:27AM +0700, Vladimir Voroshilov wrote:
>> >> 2009/6/27 Michael Niedermayer <michaelni at gmx.at>:
>> >> > On Sat, Jun 27, 2009 at 02:31:40AM +0700, Vladimir Voroshilov wrote:
>> >> >> subj
>> >> > [...]
>> >> >> @@ -224,6 +225,9 @@ static av_cold int decoder_init(AVCodecContext * avctx)
>> >> >> ? ? ?ctx->lsp[1] = ctx->lsp_buf[1];
>> >> >> ? ? ?memcpy(ctx->lsp[0], lsp_init, 10 * sizeof(int16_t));
>> >> >>
>> >> >> + ? ?/* random seed initialization */
>> >> >> + ? ?ctx->rand_value = 21845;
>> >> >
>> >> > the word pointless comes to mind, at least if this is just used for
>> >> > frame erasure handling
>> >> >
>> >> >
>> >> >> +
>> >> >> ? ? ?return 0;
>> >> >> ?}
>> >> >>
>> >> >> @@ -336,6 +340,15 @@ static int decode_frame(AVCodecContext *avctx, void *data, int *data_size,
>> >> >> ? ? ? ? ?/* Round pitch delay to nearest (used everywhere except ff_acelp_interpolate). */
>> >> >> ? ? ? ? ?pitch_delay_int ?= (pitch_delay_3x + 1) / 3;
>> >> >>
>> >> >> + ? ? ? ?if (frame_erasure) {
>> >> >> + ? ? ? ? ? ?ctx->rand_value = g729_prng(ctx->rand_value);
>> >> >> + ? ? ? ? ? ?fc_indexes ? = ctx->rand_value & ((1 << format.fc_indexes_bits) - 1);
>> >> >> +
>> >> >> + ? ? ? ? ? ?ctx->rand_value = g729_prng(ctx->rand_value);
>> >> >> + ? ? ? ? ? ?pulses_signs = ctx->rand_value & ((1 << format.fc_signs_bits) - 1);
>> >> >> + ? ? ? ?}
>> >> >
>> >> > a single call to the prng is probably enough also is the & really needed?
>> >>
>> >> Well. Then i have to ask (you and other devs): "Does FFmpeg is
>> >> interested in having G.729 decoder
>> >> binary exact with reference one" (this means, capable of passing all
>> >> ITU's tests) .
>> >
>> > we need a decoder that conforms to all mandatory parts of the spec.
>> > specs generally contain mandatory and non mandatory parts
>> > and tests that are designed to test a single specific implementation instead
>> > of testing the conformance to a spec, that is tests that say, well on
>> > x86 it returned that so it should on other platforms are not usefull to
>> > us.
>>
>> ITU's code works not only on x86.
>> I'm not sure that understand this long sentence correctly, though.
>>
>> > It would be silly to be limited by more than what one MUST follow or SHOULD
>> > follow, simply copy and pasting the ITU implementation is not my goal.
>>
>> 1. i didn't find any separation to mandatory and non mandatory parts
>> in G.729 spec.
>> In this case i prefer treating all spec as mandatory.
>
> so you plan to implement a decoder that is bitidentical to 2 decoders that
> are not bitidentical to each other?
> theres a float and a fixed point decoder ...

Under "all spec" i meant main G.729 body (without annexes).
By the way, ITU has separate test vectors for fixed-point
implementation of main g.729 and any annex.
Floating-point code is not checked by vectors for obvious reason.

>> >> Replacing pseudo-random generator with FFmpeg's internal one or
>> >> merging above two lines, definitely breaks
>> >> passing "erasure" ITU test.
>> >
>> > which AFAIK is a test for ITUs own implementation not the ITU spec, but
>> > maybe i missremember, in which case please correct me ...
>> >
>> >
>> >>
>> >> Note also, that if i break bitexactness, i'll never have enough good
>> >> test for checking introduced bugs.
>> >
>>
>> > it seems we can test other decoders too without bitexactness, i assume
>> > you have ears and can hear if things are equal or not, besides bitexactness
>> > stuff could maybe be kept under #ifdef
>>
>> Internal "make test" uses sample->encoder->decoder->sample2 process.
>> Which is not applicapable without encoder. And we can't hope that
>> "sample" will not differ "sample2" ?if decoder does not follow some
>> parts of spec.
>
> decoder development generally involves the developer working on the decoder
> to perform whichever tests are neccessary to ensure that the changes are
> correct. That may sometimes match the regression test, though as you note
> we have no encoder so no reg test. We have many other decoders that lack
> reg tests still they are being tested during development

This is the reason why i'm afraid to break bitexactness.
Currently, the only one test i have is ITU's test vectors set.
And i don't  know how to check is some rewriting cause change of
result or it  introduced a bug.
My hears are not good tester fo this. E.g. for sipr they said "all is
ok" while Vitor Sessak found several bugs
in  code by comparing similar parts of code between each other.

>> > the only question is if the g729 spec requires bitexact handling of damaged
>> > frames or not. if not theres no need for us to follow the g729 reference
>> > implementation
>>
>> As i already said, i didn't find any separation to required and
>> non-required parts.
>> And i prefer to keep bitexactness everywhere where possible and if this is not
>> significantly hurt performance and code readability.
>>
>> P.S. If i'm not wrong we are currently fighting for two
>> multiplications per 10ms and one extra
>> assignment in decoder initialization. Other occurance of frame erasure code is
>> more comples, though.
>
> ffmpegs native de and encoders are supposed to share code where possibe and
> there are things that are similar to g729. Now if bitexactness is required
> thats fine but otherwise its a constraint thats not going to help readability.
> Now considering that there is a reference float implementation that is not
> bitidentical to the fixed point it does not seem the spec requires it.
>
> To write a decoder for ffmpeg, the author needs to understand the algorithms
> used in it at least to some extend. Writing a decoder for ffmpeg is about
> implementing the stuff writen in a spec cleanly&efficiently, its not about
> writing a 1:1 translation of another implementation. Iam not particularely
> fond of the NIH syndrome, we can just use the reference in that case.

Then, i'm afraid, you should find another author for FFmpeg's g.729, since i'm
understanding (in base case) only  about 30-50% of algorythm (in meaning "why
it does this or those thing").

> And id like to clarify that "its not that bad" style comments are not an
> excuse or argument. We arent littering our code by x++; x--; because its
> not that bad at that point.

Even if after such "x++" you'll get bitexact result with reference ITU code,
while all regression test are still ok, no duplication introduced, performance
is not affected and so on?

If any of such restrictions became broken, then "x++" will be rejected, this is
obvious even for me.

> Also to say it in case i didnt yet, iam not arguing for any specific thing
> or even for ignooring bitexactness i just think the arguments you use to
> make design decissions are the wrong ones. I mean like
> "lets keep it bitexact because its easier" vs. "ive tested your suggestions
> and they sound worse"

Nope. "it easy to track introduced bugs at this stage".
Also I don't understand what should i treat as "worse".
Breaking bitexactnes is worse for me but not for you.

I'm testing each of your suggestion and trying to argue  why i don't like it,
instead of simple "it is worse" (at least i'm sure i do). I know that you
never asks to change code without a reason.

And as i already said, hearing the result is unreliable test for me.
I'm not musician with perfect hearing. I'm not able to find exact
place in code containing
a bug after spotting one extra "click" in decoded sound.

P.S. looks like discussing patch converts into long flaming :(
I'm sorry for this.

Returing to the first you comment:


>> @@ -224,6 +225,9 @@ static av_cold int decoder_init(AVCodecContext * avctx)
>>      ctx->lsp[1] = ctx->lsp_buf[1];
>>      memcpy(ctx->lsp[0], lsp_init, 10 * sizeof(int16_t));
>>
>> +    /* random seed initialization */
>> +    ctx->rand_value = 21845;

>the word pointless comes to mind, at least if this is just used for
>frame erasure handling

>> +
>>      return 0;
>>  }
>>
>> @@ -336,6 +340,15 @@ static int decode_frame(AVCodecContext *avctx, void *data, int *data_size,
>>          /* Round pitch delay to nearest (used everywhere except ff_acelp_interpolate). */
>>          pitch_delay_int  = (pitch_delay_3x + 1) / 3;
>>
>> +        if (frame_erasure) {
>> +            ctx->rand_value = g729_prng(ctx->rand_value);
>> +            fc_indexes   = ctx->rand_value & ((1 << format.fc_indexes_bits) - 1);
>> +
>> +            ctx->rand_value = g729_prng(ctx->rand_value);
>> +            pulses_signs = ctx->rand_value & ((1 << format.fc_signs_bits) - 1);
>> +        }

> a single call to the prng is probably enough also is the & really needed?

As i said before, seed initialization as long as double call to
g729_prng and "&" are defined
in main body of G.729, part 4.4.4 "Generation of the replacement
excitation". Is this
enough reason  to keep them ?


-- 
Regards,
Vladimir Voroshilov     mailto:voroshil at gmail.com
JID: voroshil at gmail.com, voroshil at jabber.ru
ICQ: 95587719



More information about the ffmpeg-devel mailing list