[FFmpeg-devel] [ping] [PATCH] mmx implementation of vc-1 inverse transformations

Michael Niedermayer michaelni
Wed Oct 6 16:32:43 CEST 2010


On Wed, Oct 06, 2010 at 10:08:26AM -0400, Ronald S. Bultje wrote:
> Hi,
> 
> as a start: I agree with Jason, I'd prefer new code like this to be
> written in yasm, but I won't object since I didn't write it.
> 
> 2010/9/30 Yuriy Kaminskiy <yumkam at mail.ru>:
> [..]
> > Index: libavcodec/vc1dec.c
> > @@ -2086,7 +2086,7 @@ static int vc1_decode_p_block(VC1Context
> >          for(j = 0; j < 2; j++) {
> >              last = subblkpat & (1 << (1 - j));
> >              i = 0;
> > -            off = j * 32;
> > +            off = s->dsp.vc1_inv_trans_8x4_transposed ? j * 4 : j * 32;
> >              while (!last) {
> >                  vc1_decode_ac_coeff(v, &last, &skip, &value, v->codingset2);
> >                  i += skip;
> 
> I would prefer if we wouldn't add random fields in DSPContext, this
> will quickly go crazy. Better check (like h264 does) whether the
> function is the C function (which then needs to be exported in a
> header) and do this behaviour based on that.

No strong opinion here, but checking against the C function is a hack.
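
For reference, the two alternatives under discussion can be sketched roughly
as below. This is only an illustration: SketchDSP, the sketch_* functions and
the offset_by_* helpers are hypothetical names, not FFmpeg's real DSPContext
or API, and the function bodies are stubs. Only the "j * 4 : j * 32"
expression is taken from the patch. Note that the pointer variant assumes
every non-C implementation wants the transposed layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical signature for an 8x4 inverse transform. */
typedef void (*inv_trans_fn)(uint8_t *dest, int linesize, int16_t *block);

/* Hypothetical stand-in for DSPContext; not the real FFmpeg layout. */
typedef struct SketchDSP {
    inv_trans_fn inv_trans_8x4;
    int inv_trans_8x4_transposed; /* the extra field the patch adds */
} SketchDSP;

/* Stub for the reference C transform; it would have to be exported in a
 * header so that the decoder can compare against it. */
void sketch_inv_trans_8x4_c(uint8_t *dest, int linesize, int16_t *block)
{
    (void)dest; (void)linesize;
    block[0] = 0; /* placeholder body */
}

/* Stub for an optimized transform that expects transposed coefficients. */
void sketch_inv_trans_8x4_simd(uint8_t *dest, int linesize, int16_t *block)
{
    (void)dest; (void)linesize;
    block[0] = 1; /* placeholder body, deliberately different from the C stub */
}

/* Variant A (the patch): an explicit flag selects the coefficient offset. */
int offset_by_flag(const SketchDSP *dsp, int j)
{
    assert(dsp != NULL);
    return dsp->inv_trans_8x4_transposed ? j * 4 : j * 32;
}

/* Variant B (the review suggestion): infer the layout from the pointer.
 * Anything that is not the C function is assumed to be transposed. */
int offset_by_pointer(const SketchDSP *dsp, int j)
{
    assert(dsp != NULL && dsp->inv_trans_8x4 != NULL);
    return dsp->inv_trans_8x4 != sketch_inv_trans_8x4_c ? j * 4 : j * 32;
}
```

Variant B avoids the new field, but it silently breaks once an optimized
implementation that keeps the non-transposed layout is added.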


[...]
>
> > +    TRANSPOSE4(%%mm5,%%mm1,%%mm0,%%mm3,%%mm4)
> > +    STORE4(q,0x10,0x00(%0),%%mm5,%%mm3,%%mm4,%%mm0)
> > +
> > +    LOAD4(q,0x10,0x08(%0),%%mm6,%%mm5,%%mm7,%%mm1)
> > +    TRANSPOSE4(%%mm6,%%mm5,%%mm7,%%mm1,%%mm2)
> > +    STORE4(q,0x10,0x08(%0),%%mm6,%%mm1,%%mm2,%%mm7)
> 
> > +    TRANSFORM_8X4_ROW_H1
> > +    (
> > +        q,q,
> > +        0x00(%0),0x20(%0),0x08(%0),0x38(%0),
> > +        %%mm0,%%mm1,%%mm2,%%mm3,%%mm4,%%mm5,%%mm6,%%mm7
> > +    )
> 
> This is even worse, this is absolutely unreadable. :-). Same for all
> other functions in this file.
> 
> This function does the same operation 4x, can you create a loop
> without slowing it down significantly? (gcc should unroll by itself,
> but then source size is 4x smaller).

I am against using C loops in asm; it will cause endless problems.
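
To make the two source layouts concrete, here is a plain C sketch with no
inline asm in it. row_butterfly() is a hypothetical placeholder for one row
operation and does not reproduce the arithmetic of TRANSFORM_8X4_ROW_H1 or
any other macro from the patch; the sketch only contrasts writing the
operation out four times with wrapping it in a loop.

```c
#include <stdint.h>

/* Hypothetical row operation standing in for one row transform. */
static void row_butterfly(int16_t *row)
{
    int16_t a = row[0], b = row[1];
    row[0] = (int16_t)(a + b);
    row[1] = (int16_t)(a - b);
}

/* Layout 1: the operation written out four times, as the patch does. */
void transform_unrolled(int16_t block[4][2])
{
    row_butterfly(block[0]);
    row_butterfly(block[1]);
    row_butterfly(block[2]);
    row_butterfly(block[3]);
}

/* Layout 2: the loop suggested in the review; any unrolling is left to
 * the compiler. */
void transform_looped(int16_t block[4][2])
{
    for (int i = 0; i < 4; i++)
        row_butterfly(block[i]);
}
```

In plain C both layouts give the same result. Whether that carries over once
the loop body is an inline asm statement is exactly the point in dispute.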

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

The worst form of inequality is to try to make unequal things equal.
-- Aristotle