[FFmpeg-devel] [PATCH] g722 decoder, no licensing fame

Fri Apr 10 05:22:48 CEST 2009

On Thu, Apr 09, 2009 at 08:14:58AM -0700, Kenan Gillet wrote:
> On Thu, Apr 9, 2009 at 12:12 AM, Diego Biurrun <diego at biurrun.de> wrote:
> > On Wed, Apr 08, 2009 at 09:23:51PM -0700, Kenan Gillet wrote:
> >>
> >> here is a version 3 of the patch:
> >> - it includes the Diego's suggested changes
> >
> > It does not.
> 
> darn, i double checked the changes, and just ended up attaching the
> wrong file :(
> 
> anyway correct patch attached,
[...]
> + * @note For the 56000bps and 48000bps bitrates, the respectively 7 and 6 bits
> + *       codeword might be packed, so unpacking might be needed either
> + *       internally or as a separate parser.

do non packed cases exist anywhere in the wild?
do packed cases exist anywhere in the wild?

it seems to me that filling 30% of the bits with 0 is not particularly
likely in a real world environment
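
for reference, a minimal sketch of what "unpacking" would mean here, assuming
the codewords are stored back to back, MSB first. that bit order is an
assumption, nothing in the patch confirms it, and unpack_codewords() is a
hypothetical helper that is not part of the patch:

```c
#include <stddef.h>
#include <stdint.h>

/**
 * Expand tightly packed codewords of `bits` bits each (e.g. 6 or 7) into
 * one codeword per output byte, stored in the low bits of that byte.
 * Assumes MSB-first packing. Returns the number of codewords written.
 */
static size_t unpack_codewords(const uint8_t *in, size_t in_size,
                               uint8_t *out, size_t out_size, int bits)
{
    uint32_t acc   = 0;  /* bit accumulator, newest bits at the low end */
    int      avail = 0;  /* number of not yet consumed bits in acc */
    size_t   n     = 0;

    for (size_t i = 0; i < in_size; i++) {
        acc    = (acc << 8) | in[i];
        avail += 8;
        /* emit every complete codeword currently held in the accumulator */
        while (avail >= bits && n < out_size) {
            avail   -= bits;
            out[n++] = (acc >> avail) & ((1u << bits) - 1);
        }
    }
    return n;
}
```

for scale: storing one codeword per byte leaves 2 of 8 bits unused at 6 bits
per codeword and 1 of 8 at 7 bits per codeword.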

> + */
> +
> +#include <stdint.h>
> +#include "avcodec.h"
> +#include "mathops.h"
> +
> +typedef struct {
> +    int bits_per_sample;             ///< 6 for 48000bps, 7 for 56000bps, or 8 for 64000bps
> +
> +    int16_t prev_samples[24];        ///< memory of past 24 received (decoded) samples
> +
> +    /**
> +     * The band[0] and band[1] correspond respectively to the lower band and higher band.
> +     */
> +    struct G722Band {

> +        int16_t s_predictor;         ///< predictor output value

ok that makes sense

> +        int32_t s_zero;              ///< zero section output signal

what is that?

> +        int8_t  part_reconst_mem[2]; ///< partially reconstructed signal memory

and that?

> +        int16_t prev_qtzd_reconst;   ///< previous quantized reconstructed signal

this is not true, this uses a different dequantization table in some cases
i even wonder if this is a bug

> +        int16_t pole_mem[2];         ///< second-order pole section coefficient buffer
> +        int32_t diff_mem[6];         ///< quantizer difference signal memory
> +        int16_t zero_mem[6];         ///< sixth-order zero section coefficient buffer

> +        int16_t log_factor;          ///< delayed logarithmic quantizer factor

log what? log_e log_2 log_10 ?

> +        int16_t scale_factor;        ///< delayed quantizer scale factor
> +    } band[2];
> +} G722Context;
> +
> +
> +static const int8_t sign_lookup[2] = { -1, 1 };
> +
> +static const int16_t ilb[32] = {
> +  2048, 2093, 2139, 2186, 2233, 2282, 2332, 2383,
> +  2435, 2489, 2543, 2599, 2656, 2714, 2774, 2834,
> +  2896, 2960, 3025, 3091, 3158, 3228, 3298, 3371,
> +  3444, 3520, 3597, 3676, 3756, 3838, 3922, 4008
> +};

> +static const int16_t wh[2]   = { 798, -214 };

2 letters is better than 1 ... well, sometimes it's not much better though

> +static const int16_t qm2[4]  = { -7408, -1616,  7408,   1616 };

> +/**
> + * qm3[index] == wl[rl42[index]]
> + */
> +static const int16_t qm3[16] = {
> +   -60, 3042, 1198, 538, 334, 172,  58, -30,
> +  3042, 1198,  538, 334, 172,  58, -30, -60
> +};
> +static const int16_t qm4[16] = {
> +      0, -20456, -12896,  -8968, -6288,  -4240,  -2584,  -1200,
> +  20456,  12896,   8968,  6288,   4240,   2584,   1200,      0
> +};

what is qm?
i know these are dequantization tables, and poorly designed ones at that
(yeah, the duplicate entries),
but i can't relate "qm" to "dequantization"
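
the duplicates in qm3 follow from the relation stated in its comment: 16
indices go through rl42[] onto only 8 distinct wl[] values. a small sketch of
that, where the wl[] and rl42[] contents are an assumption (written down from
memory of the G.722 tables, they are not in the quoted patch) and the check
only confirms that they reproduce the qm3[] quoted above:

```c
#include <stdint.h>

/* Assumed values, NOT taken from the quoted patch. */
static const int16_t wl[8]    = { -60, -30, 58, 172, 334, 538, 1198, 3042 };
static const uint8_t rl42[16] = { 0, 7, 6, 5, 4, 3, 2, 1,
                                  7, 6, 5, 4, 3, 2, 1, 0 };

/* qm3[] exactly as quoted from the patch. */
static const int16_t qm3[16] = {
   -60, 3042, 1198, 538, 334, 172,  58, -30,
  3042, 1198,  538, 334, 172,  58, -30, -60
};

/* Returns 1 if qm3[i] == wl[rl42[i]] holds for all 16 indices, else 0. */
static int qm3_matches_wl_rl42(void)
{
    for (int i = 0; i < 16; i++)
        if (qm3[i] != wl[rl42[i]])
            return 0;
    return 1;
}
```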

> +
> +/**
> + * quadrature mirror filter (QMF) coefficients
> + *
> + * ITU-T G.722 Table 11
> + */
> +static const int16_t qmf_coeffs[12] = {
> +  3, -11, 12, 32, -210, 951, 3876, -805, 362, -156, 53, -11,
> +};
> +
> +

> +/**
> + * adaptive predictor
> + *
> + * @note On x86 using the MULL macro in a loop is slower than not using the macro.
> + */
> +static void do_adaptive_prediction(struct G722Band *band, const int cur_diff)

missing doxy for cur_diff

[...]
> +static int inline scale(const int log_factor, int shift)
> +{
> +    const int wd1 = ilb[(log_factor >> 6) & 31];
> +    shift -= log_factor >> 11;
> +    return (shift < 0 ? wd1 << -shift : wd1 >> shift) << 2;
> +}

i believe we already have some integer exp2(). if there is a reason why
the common code cannot be used, this one still lacks documentation and a
name that is related to what it does
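
as a sketch of what a documented version could look like: reading the code,
log_factor appears to be a base-2 logarithm with 11 fractional bits, and
ilb[i] is roughly 2048 * 2^(i/32). the name exp2_q11() and the wording of the
comment are hypothetical, the arithmetic is the same as in scale() above:

```c
#include <stdint.h>

/* ilb[] as quoted from the patch; ilb[i] is about 2048 * 2^(i/32). */
static const int16_t ilb[32] = {
  2048, 2093, 2139, 2186, 2233, 2282, 2332, 2383,
  2435, 2489, 2543, 2599, 2656, 2714, 2774, 2834,
  2896, 2960, 3025, 3091, 3158, 3228, 3298, 3371,
  3444, 3520, 3597, 3676, 3756, 3838, 3922, 4008
};

/**
 * Integer approximation of 2^(log_factor / 2048) * 2^(13 - shift).
 *
 * @param log_factor log2 of the scale factor with 11 fractional bits;
 *                   assumed to be non-negative
 * @param shift      output scaling, see the formula above
 * @return the result with its 2 lowest bits always zero
 */
static inline int exp2_q11(const int log_factor, int shift)
{
    /* top 5 fractional bits select 2^frac from the table */
    const int wd1 = ilb[(log_factor >> 6) & 31];
    /* the integer part of the logarithm becomes a plain shift */
    shift -= log_factor >> 11;
    return (shift < 0 ? wd1 << -shift : wd1 >> shift) << 2;
}
```

with these definitions exp2_q11(0, 8) gives 32 and exp2_q11(2048, 8) gives 64,
i.e. adding 2048 to log_factor doubles the result.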

[...]
> +static void apply_qmf(int16_t *prev_samples, int *xout1, int *xout2)
> +{
> +    int i;
> +
> +    *xout1 = 0;
> +    *xout2 = 0;
> +    for (i = 0;  i < 12;  i++) {
> +        MAC16(*xout2, prev_samples[2*i  ], qmf_coeffs[i   ]);
> +        MAC16(*xout1, prev_samples[2*i+1], qmf_coeffs[11-i]);
> +    }

> +    memmove(prev_samples, prev_samples + 2, 22*sizeof(prev_samples[0]));

please find a solution that does not need to move the array by 2 samples
after each 2 samples.
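
one possible way to do that, as a sketch only: keep the history in a larger
buffer, let the 24-sample window slide forward by 2 per call, and copy the 22
history samples back to the start only when the window reaches the end of the
buffer. QmfHistory, HIST_BUF, qmf_filter() and qmf_push2() are hypothetical
names, qmf_coeffs[] is copied from the patch and MAC16 is written out as a
plain multiply-add:

```c
#include <stdint.h>
#include <string.h>

#define QMF_WINDOW 24   /* samples seen by one filter call */
#define HIST_BUF   256  /* hypothetical size, anything > QMF_WINDOW works */

static const int16_t qmf_coeffs[12] = {
  3, -11, 12, 32, -210, 951, 3876, -805, 362, -156, 53, -11,
};

typedef struct {
    int16_t buf[HIST_BUF];
    int     pos;  /* index of the oldest sample of the current window */
} QmfHistory;

/* Same arithmetic as the loop in apply_qmf(), but without any memmove. */
static void qmf_filter(const int16_t *win, int *xout1, int *xout2)
{
    *xout1 = 0;
    *xout2 = 0;
    for (int i = 0; i < 12; i++) {
        *xout2 += win[2 * i    ] * qmf_coeffs[i     ];
        *xout1 += win[2 * i + 1] * qmf_coeffs[11 - i];
    }
}

/* Append 2 new samples, filter, then slide the window forward by 2. */
static void qmf_push2(QmfHistory *h, int16_t s0, int16_t s1,
                      int *xout1, int *xout2)
{
    int16_t *win = h->buf + h->pos;

    win[22] = s0;
    win[23] = s1;
    qmf_filter(win, xout1, xout2);

    h->pos += 2;
    if (h->pos + QMF_WINDOW > HIST_BUF) {
        /* window would run off the end: move the 22 history samples once */
        memmove(h->buf, h->buf + h->pos, 22 * sizeof(h->buf[0]));
        h->pos = 0;
    }
}
```

with a zero-initialized QmfHistory this should produce the same outputs as the
memmove-per-call version, while copying only once every (HIST_BUF - 22) / 2
calls. the buffer size is a memory versus copy-frequency tradeoff.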

[...]
> +static int g722_decode_frame(AVCodecContext *avctx, void *data, int *data_size, AVPacket *avpkt)
> +{
> +    G722Context *c = avctx->priv_data;
> +    int16_t *out_buf = data;
> +    const uint8_t *buf = avpkt->data;
> +    int j, out_len = 0;
> +    const int shift = 8 - c->bits_per_sample;
> +    const int16_t *quantizer_table = qms[shift];
> +
> +    for (j = 0;  j < avpkt->size; j++) {
> +        const int ilow = buf[j] & (0x3F >> shift);
> +        const int rlow = av_clip(MULL(c->band[0].scale_factor, quantizer_table[ilow], 15) +
> +                                 c->band[0].s_predictor, -16384, 16383);
> +
> +        update_low_predictor(&c->band[0], ilow >> (2 - shift));
> +
> +        if (avctx->sample_rate == 16000) {
> +            const int ihigh = (buf[j] >> (6 - shift)) &  0x03;

all that looks a little obfuscated, get_bits() seems like the proper choice
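
to illustrate the idea without depending on the FFmpeg headers, the sketch
below uses a tiny stand-in bit reader. BitReader, read_bits() and
read_codeword() are hypothetical and are not FFmpeg's actual API. it also
assumes that each codeword is stored MSB first with the 2 high-band bits
before the low-band bits, which matches the masks above for the 8-bit case
only:

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    const uint8_t *buf;
    size_t         size_bits;  /* total number of readable bits */
    size_t         pos;        /* current read position in bits */
} BitReader;

/* Read n (1..16) bits, MSB first; bits past the end are read as 0. */
static unsigned read_bits(BitReader *br, int n)
{
    unsigned v = 0;

    while (n-- > 0) {
        unsigned bit = 0;
        if (br->pos < br->size_bits)
            bit = (br->buf[br->pos >> 3] >> (7 - (br->pos & 7))) & 1;
        v = (v << 1) | bit;
        br->pos++;
    }
    return v;
}

/* Split one codeword into the high-band and low-band indices. */
static void read_codeword(BitReader *br, int bits_per_sample,
                          int *ihigh, int *ilow)
{
    *ihigh = (int)read_bits(br, 2);
    *ilow  = (int)read_bits(br, bits_per_sample - 2);
}
```

for bits_per_sample == 8 this yields ihigh == (byte >> 6) & 3 and
ilow == byte & 0x3F, and the same two calls would also cover packed 6 and 7
bit codewords if they turn out to be stored that way.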

[...]
> +static int g722_encode_frame(AVCodecContext *avctx,
> +                             uint8_t *dst, int buf_size, void *data)
> +{
> +    G722Context *c = avctx->priv_data;
> +    const int16_t *samples = data;
> +
> +    int diff, limit;
> +    int i, j;
> +
> +    for (j = 0;  j < buf_size; ) {
> +        int xlow,  rlow;
> +        if (avctx->sample_rate == 16000) {
> +            int xout1, xout2, xhigh, diff, pred, index;
> +
> +            c->prev_samples[22] = samples[j++];
> +            c->prev_samples[23] = samples[j++];
> +            apply_qmf(c->prev_samples, &xout1, &xout2);
> +            xlow  = (xout1 + xout2) >> 14;
> +            xhigh = (xout1 - xout2) >> 14;
> +
> +            diff = av_clip_int16(xhigh - c->band[1].s_predictor);
> +            pred = MULL(564, c->band[1].scale_factor, 12);
> +            index = diff >= 0 ? (diff  <  pred) + 2
> +                              :  diff >= -pred;
> +
> +            update_high_predictor(&c->band[1], MULL(c->band[1].scale_factor, qm2[index], 15), index);
> +
> +            *dst = (index << 6);
> +        } else
> +            xlow = samples[j++];
> +
> +        diff = av_clip_int16(xlow - c->band[0].s_predictor);
> +        limit = diff >= 0 ? diff : -(diff + 1);
> +        for (i = 0;  i < 29 && limit >= (q6[i]*c->band[0].scale_factor) >> 12;  i++)
> +            ;
> +        *dst |=
> +        rlow  = (diff < 0 ? (i < 2 ? 63 : 33)  :  61) - i;

i have good news and bad news
the good news is that quality might be significantly improvable by using
trellis/viterbi, the bad news is that the encoder is rejected unless
you either implement that or prove that it can't be done
(and i don't mean full viterbi, but keeping more than 1 possible
previous bitstream: not just the 1 previous and the best current
quantized value, but several best past ones X several current quantized
values ...)
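
to make the "several best past ones X several current quantized values" idea
concrete, here is a rough sketch of such a search on a toy adaptive quantizer.
none of this is G.722: the 2-bit codes, the level[] table, the step adaptation
and the names Path, advance() and encode_beam() are made up for illustration,
and a real encoder would also have to remember the code sequence of each path:

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Decoder state plus accumulated squared error of one candidate path. */
typedef struct { int pred, step; long long err; } Path;

static const int level[4] = { -3, -1, 1, 3 };  /* toy 2-bit quantizer */

/* Extend path p by one code and update its state the way a decoder would. */
static Path advance(Path p, int code, int sample)
{
    int       recon = p.pred + level[code] * p.step;
    long long d     = (long long)sample - recon;

    p.err += d * d;
    p.pred = recon;
    /* outer codes grow the step size, inner codes shrink it */
    p.step = (code == 0 || code == 3) ? p.step * 2 : p.step - p.step / 4;
    if (p.step < 1)    p.step = 1;
    if (p.step > 4096) p.step = 4096;
    return p;
}

static int cmp_err(const void *a, const void *b)
{
    long long ea = ((const Path *)a)->err, eb = ((const Path *)b)->err;
    return (ea > eb) - (ea < eb);
}

/**
 * Keep the `width` best paths after every sample (width must be >= 1).
 * Returns the squared error of the best path, or -1 if allocation failed.
 */
static long long encode_beam(const int16_t *in, int n, int width)
{
    Path *cur  = malloc(sizeof(*cur)  * (size_t)width);
    Path *next = malloc(sizeof(*next) * (size_t)width * 4);
    int ncur = 1;
    long long best;

    if (!cur || !next) { free(cur); free(next); return -1; }
    cur[0] = (Path){ 0, 16, 0 };

    for (int i = 0; i < n; i++) {
        int nnext = 0;
        /* every surviving path X every possible current code */
        for (int k = 0; k < ncur; k++)
            for (int c = 0; c < 4; c++)
                next[nnext++] = advance(cur[k], c, in[i]);
        qsort(next, (size_t)nnext, sizeof(*next), cmp_err);
        ncur = nnext < width ? nnext : width;
        memcpy(cur, next, sizeof(*cur) * (size_t)ncur);
    }
    best = cur[0].err;
    free(cur);
    free(next);
    return best;
}
```

width 1 is the usual greedy encoder, and a width of at least 4^n degenerates
into a full search. widths in between are the interesting case, where the
error often but not necessarily drops compared to greedy.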

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Concerning the gods, I have no means of knowing whether they exist or not
or of what sort they may be, because of the obscurity of the subject, and
the brevity of human life -- Protagoras