[FFmpeg-devel] [PATCH] VC1: merge put/add_pixels with IDCT8x8.

Sat Feb 19 19:25:00 CET 2011

On Sat, Feb 19, 2011 at 01:04:11PM -0500, Ronald S. Bultje wrote:
> Hi,
> 
> On Sat, Feb 19, 2011 at 12:45 PM, Kostya <kostya.shishkov at gmail.com> wrote:
> > On Sat, Feb 19, 2011 at 12:27:16PM -0500, Ronald S. Bultje wrote:
> >> Just quick note here:
> >>
> >> On Sat, Feb 19, 2011 at 12:20 PM, Ronald S. Bultje <rsbultje at gmail.com> wrote:
> >> > -static void vc1_put_block(VC1Context *v, DCTELEM block[6][64])
> >> [..]
> >> > - ? ?if(v->rangeredfrm) {
> >> > - ? ? ? ?int i, j, k;
> >> > - ? ? ? ?for(k = 0; k < 6; k++)
> >> > - ? ? ? ? ? ?for(j = 0; j < 8; j++)
> >> > - ? ? ? ? ? ? ? ?for(i = 0; i < 8; i++)
> >> > - ? ? ? ? ? ? ? ? ? ?block[k][i + j*8] = (block[k][i + j*8] - 64) << 1;
> >> > -
> >> > - ? ?}
> >> [..]
> >> > ?static void vc1_decode_i_blocks(VC1Context *v)
> >> [..]
> >> > + ? ?idct8x8_fn = v->vc1dsp.vc1_inv_trans_8x8_put[(!!v->rangeredfrm) | ((v->pq < 9 || !v->overlap) << 1)];
> >> [..]
> >> [..]
> >> > - ? ? ? ? ? ? ? ?v->vc1dsp.vc1_inv_trans_8x8(s->block[k]);
> >> > - ? ? ? ? ? ? ? ?if(v->pq >= 9 && v->overlap) {
> >> > - ? ? ? ? ? ? ? ? ? ?for(j = 0; j < 64; j++) s->block[k][j] += 128;
> >> > - ? ? ? ? ? ? ? ?}
> >> > + ? ? ? ? ? ? ? ?if(k > 3 && (v->s.flags & CODEC_FLAG_GRAY)) continue;
> >> > + ? ? ? ? ? ? ? ?idct8x8_fn(dst[k],
> >> > + ? ? ? ? ? ? ? ? ? ? ? ? ? k & 4 ? s->uvlinesize : s->linesize,
> >> > + ? ? ? ? ? ? ? ? ? ? ? ? ? s->block[k]);
> >> > ? ? ? ? ? ? }
> >> >
> >> > - ? ? ? ? ? ?vc1_put_block(v, s->block);
> >>
> >> I know this isn't the same thing. The old code does this:
> >>
> >> read coeffs
> >> do idct
> >> if (v->pq >= 9 && v->overlap) add 128 to each of them
> >> if (rangered) sub 64 from each of them, <<= 1
> >> put_pixels_clamped();
> >>
> >> the new code does this, essentially:
> >> read coeffs
> >> do idct
> >> if (v->pq >= 9 && v->overlap) add 128 to each of them
> >> if (rangered) { if (v->pq >= 9 && v->overlap) { sub 64 from each} <<= 1 }
> >> put_pixels_clamped();
> >
> > That's wrong:
> > clause 8.1.1.4 specified range restoration is performed like
> > Y[n] = CLIP ((Y[n] ? 128) * 2 + 128)
> > when range reduction flag is set at the sequence level
> >
> > Overlapping (and adding 128) is employed when it's signalled, i.e.
> > v->overlap && v->pq >= 9,
> > and it's totally independent from range reduction.
> 
> If you look at vc1_decode_i_blocks_adv(), the 128 is added
> unconditionally. Then rangered subtracts the same 128, does the actual
> range change (<<= 1) and then re-adds the 128.

> Now vc1_decode_i_blocks() only adds the 128 only if (v->pq >= 9 &&
> v->overlap), shouldn't that mean that the subtraction and re-addition
> in the if (rangeredfrm) should only be done if that condition was
> actually true?

That comes from clause 8.1.3.10 - "For advanced profile I frames, or if
overlap filtering is used in simple and main profile I frames, the constant
128 shall be added prior to clamping."

vc1_decode_i_blocks() is used only for simple/main/complex profiles,
vc1_decode_i_blocks_adv() is used (obviously) for advanced profile.

Range reduction is present only for main (and maybe complex) profile.

> Otherwise the whole thing makes no sense to me.

What do you expect from M$ standard? At least it's implementable.

> Ronald