[FFmpeg-devel] [PATCH]Lagarith decoder.
Michael Niedermayer
michaelni
Sat Oct 24 02:38:49 CEST 2009
On Thu, Oct 15, 2009 at 08:42:54PM -0600, Nathan Caldwell wrote:
> On Tue, Oct 13, 2009 at 3:28 AM, Michael Niedermayer <michaelni at gmx.at> wrote:
> > On Tue, Sep 22, 2009 at 11:54:03PM -0600, Nathan Caldwell wrote:
> >> Here are the latest Lagarith patches. The only changes to the
> >> rangecoder from the previous patch are cosmetic (K&R formatting). The
> >> decoder includes Loren's floating point emulation. Thanks to everyone,
> >> and especially Loren and Reimar, for all their help.
> >
> >> + int prob[258]; /*!< Table of cumulative probability for each symbol. */
> >
> > this is filled with uint32_t giving nice negative probs
>
> Fixed.
>
> >> +/* compute the 52bit mantissa of 1/(double)denom */
> >> +static uint64_t softfloat_reciprocal(uint32_t denom)
> >
> > comment not doxygen compatible
> > and this really needs some explanation of why it is used instead of
> > 1.0/denom in the doxy
>
> Updated. It would be nice if Loren could comment on these descriptions.
>
> >> +static uint32_t lag_decode_prob(GetBitContext *gb)
> >> +{
> >> + static const uint8_t series[] = { 1, 2, 3, 5, 8, 13, 21, 34 };
> >
> > where is series[7] accessed?
>
> It isn't. The table was copied from reference. Removed.
>
> >> + if (bits <= 0)
> >> + return 0;
> >> + if (bits > 31) {
> >> + /* This is most likely an error, just use the first 32 bits */
> >> + bits = 31;
> >> + }
> >
> > are these not error conditions that warrent a error message and failure?
>
> In the case of bits==0, no. Any other case (bits < 0 or bits >31),
> yes. This should be fixed now.
>
> >> +
> >> + value = get_bits_long(gb, bits);
> >> + value |= 1 << bits;
> >> +
> >> + return value - 1;
> >> +}
> >> +
> >> +static int lag_read_prob_header(lag_rac *rac, GetBitContext *gb)
> >> +{
> >> + int i, j;
> >> + int prob = 0;
> >> + int scale_factor = 0;
> >> + unsigned cumul_prob = 0;
> >> + unsigned cumulative_target = 1;
> >> + unsigned scaled_cumul_prob = 0;
> >> +
> >> + rac->prob[0] = 0;
> >> + rac->prob[257] = UINT_MAX;
> >> + /* Read probabilities from bitstream */
> >> + for (i = 1; i < 257; i++) {
> >> + rac->prob[i] = lag_decode_prob(gb);
> >
> >> + cumul_prob += rac->prob[i];
> >
> > can overflow
>
> In this case the reference decoder would also overflow. Would you
> prefer to match the reference, or not overflow?
I dont mind either way as long as its no exploitable, cant crash and doesnt
end in an infinite loop.
>
> >> + if (!rac->prob[i]) {
> >
> >> + prob = lag_decode_prob(gb);
> >> + if (i + prob >= 258)
> >> + prob = 257 - i;
> >
> > integer overflow, segfault below
>
> Fixed.
>
> > and isnt this check supposed to be an error condition instead of silent
> > cliping?
>
> In a perfect world, yes. Again, just matching the behavior of the
> reference decoder.
>
[...]
> > also getting rid of *step and doing +=step might be faster
> > and then variables like zeros could be tried to be moved onto the
> > stack
>
> This is what I originally had, however it introduces too many
> divisions in the calculation of the zero runs.
i is used without *step only for a comparission against width, width
surely can be multiplied by step to avoid a division there.
Am i missing something?
[...]
> +// decodes a byte
> +static inline int lag_get_rac(lag_rac *l)
// is not picked up by doxygen yet i think ...
> +{
> + unsigned range_scaled, low_scaled, div;
> + int val;
> + uint8_t shift;
> +
> + lag_rac_refill(l);
> +
> + range_scaled = l->range >> l->scale;
> +
> + if (l->low >= range_scaled * l->prob[255]) {
> + val = 255;
> + l->range -= range_scaled * l->prob[255];
> + } else {
> + /* val = 0 is frequent enough to deserve a shortcut */
> + if (l->low < range_scaled * l->prob[1]) {
> + val = 0;
isnt 0 more common than 255? and should thus be checked before on a
shorter code path?
> + } else {
> + /* FIXME __builtin_clz is ~20% faster here, but not allowed in generic code. */
> + shift = 30 - av_log2(range_scaled);
table[ l->range>>24 ]
could be used and would need just 128 entries or so
> + div = ((range_scaled << shift) + (1 << 23) - 1) >> 23;
> + /* low>>24 ensures that any cases too big for exact FASTDIV are
> + * under- rather than over-estimated
> + */
> + low_scaled = FASTDIV(l->low - (l->low >> 24), div);
> + shift -= 23 + l->hash_shift;
23 + l->hash_shift can be factored out of the function as hash_shift is a
"constant"
> + shift &= 31;
> + low_scaled = (low_scaled << shift) | (low_scaled >> (32 - shift));
> + /* low_scaled is now a lower bound of low/range_scaled */
> + val = l->range_hash[(uint8_t) low_scaled];
> + while (l->low >= range_scaled * l->prob[val + 1])
> + val++;
is there some (small) bound on the number of iterations of this?
if so it could be replaced by a if() or 2 ...
[...]
> +/**
> + * Decode a frame.
> + * @param avctx codec context
> + * @param data output AVFrame
> + * @param data_size size of output data or 0 if no picture is returned
> + * @param avpkt input packet
> + * @return number of consumed bytes on success or negative if decode fails
> + */
> +static int lag_decode_frame(AVCodecContext *avctx,
> + void *data, int *data_size, AVPacket *avpkt)
> +{
> + const uint8_t *buf = avpkt->data;
> + int buf_size = avpkt->size;
> + LagarithContext *l = avctx->priv_data;
> + AVFrame *const p = &l->picture;
> + uint8_t frametype = 0;
> + uint32_t offset_gu = 0, offset_bv = 0, offset_ry = 9;
> +
> + AVFrame *picture = data;
> +
> + if (!l->avctx)
> + l->avctx = avctx;
that can be done in lag_decode_init()
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
Rewriting code that is poorly written but fully understood is good.
Rewriting code that one doesnt understand is a sign that one is less smart
then the original author, trying to rewrite it will not make it better.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: Digital signature
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20091024/5829cdb2/attachment.pgp>
More information about the ffmpeg-devel
mailing list