[FFmpeg-devel] [PATCH] SPARC VIS simple_idct try #2

Balatoni Denes dbalatoni
Wed Aug 22 11:45:57 CEST 2007


Hi!

On Wednesday 22 August 2007 at 03:47, Michael Niedermayer wrote:
> what happens with the regression tests if its forced to be used where the
> normal C simple idct normally is?
>
> and what does dct-test.c output for the idct?
>

Well, I don't think it would overflow, because then simple_idct would overflow 
too.

I hope I can try at least dct-test.c today.

> > +static DECLARE_ALIGNED_8(int16_t, coeffs[28]) = {
> > +    32138, 32138, 32138, 32138,
> > +    30274, 30274, 30274, 30274,
> > +    27246, 27246, 27246, 27246,
> > +    23170, 23170, 23170, 23170,
> > +    18205, 18205, 18205, 18205,
> > +    12540, 12540, 12540, 12540,
> > +     6393,  6393,  6393,  6393
> > +};
>
> const static

I will fix that.

> [...]
>
> > +#define IDCT4ROWS(in, shift, label, s1, s2, bi, ma) \
> > +    /* order input */\
> > +        "ld [" in "], %%f0           \n\t"\
> > +        "ld [" in "+4], %%f4         \n\t"\
>
> well i dont know sparc asm at all but dont you read a few things in at the
> top and then just overwrite these registers

No, fpackfix converts two 32-bit values to two 16-bit values.  This is the 
SPARC documentation I am using (but this part is actually tested):
http://www.fujitsu.com/downloads/PRMPWR/JPS1-R1.0.4-Common-pub.pdf
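
As a rough illustration of what such a pack does, here is a plain C model of one 32-bit lane. It assumes the usual VIS description of fpackfix: shift left by the GSR scale factor, keep bits 31..16, and saturate to signed 16 bits. The function name and the scale parameter are hypothetical, and the model has not been checked against hardware.

```c
#include <stdint.h>

/* Model of one 32-bit lane of a fixed-point pack (assumed semantics):
   scale, drop the low 16 bits, saturate to int16_t. */
static int16_t sketch_fpackfix_lane(int32_t value, unsigned scale)
{
    /* Left shift by the scale factor, done as a 64-bit multiply so that
       negative inputs are well defined in C. */
    int64_t scaled = (int64_t)value * ((int64_t)1 << (scale & 31));

    /* Floor division by 2^16, i.e. discard the 16 fractional bits. */
    int64_t q = scaled >= 0 ? scaled >> 16
                            : -((-scaled + 65535) >> 16);

    /* Clip to the signed 16-bit range. */
    if (q > INT16_MAX) return INT16_MAX;
    if (q < INT16_MIN) return INT16_MIN;
    return (int16_t)q;
}
```

Under this model a scale of 12 behaves like an arithmetic right shift by four bits followed by saturation, which is one way a pack instruction could stand in for a missing shift. Which scale the patch actually sets is not stated in this mail.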

> also you permute the input explicitly instead of setting
> idct_permutation_type properly

In the row iteration the input is not only permuted, but also shifted right by 
four bits. But there is no shift instruction. So if you know a significantly 
faster way to shift the input right by four bits, then do tell me.

> > +void ff_simple_idct_put_vis(uint8_t *dest, int line_size, DCTELEM *data)
> > +{
> > +    ff_simple_idct_vis(data);
> > +    ff_put_pixels_clamped_vis(data, dest, line_size);
> > +}
> > +
> > +void ff_simple_idct_add_vis(uint8_t *dest, int line_size, DCTELEM *data)
> > +{
> > +    ff_simple_idct_vis(data);
> > +    ff_add_pixels_clamped_vis(data, dest, line_size);
> > +}
>
> check that gcc inlines these 4 calls, if not do something so it does, they
> should be inlined

I will check that.
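
A note on what that check involves: without link-time optimization, GCC can only inline a callee whose definition is visible in the same translation unit, so a helper defined in another file stays a real call. The sketch below shows the general shape using GCC's always_inline attribute. All names are hypothetical stand-ins, and the transform is a placeholder, not the IDCT from the patch.

```c
#include <stdint.h>

#if defined(__GNUC__)
#define SKETCH_ALWAYS_INLINE static inline __attribute__((always_inline))
#else
#define SKETCH_ALWAYS_INLINE static inline
#endif

/* Placeholder for the transform: doubles each coefficient. */
SKETCH_ALWAYS_INLINE void sketch_transform(int16_t *block)
{
    for (int i = 0; i < 64; i++)
        block[i] = (int16_t)(block[i] * 2);
}

/* Placeholder for the clamped store: clip each value to 0..255. */
SKETCH_ALWAYS_INLINE void sketch_put_clamped(const int16_t *block,
                                             uint8_t *dest, int line_size)
{
    for (int y = 0; y < 8; y++)
        for (int x = 0; x < 8; x++) {
            int v = block[y * 8 + x];
            dest[y * line_size + x] = (uint8_t)(v < 0 ? 0 : v > 255 ? 255 : v);
        }
}

/* Public wrapper: both helpers are defined above, so they can be inlined. */
void sketch_idct_put(uint8_t *dest, int line_size, int16_t *block)
{
    sketch_transform(block);
    sketch_put_clamped(block, dest, line_size);
}
```

Whether the calls were really inlined can be confirmed by reading the assembly from gcc -S, or by building with -Winline, which warns when a function declared inline could not be inlined.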

> [...]

bye
Denes



