[FFmpeg-devel] [PATCH] SPARC VIS simple_idct
Sat Aug 25 13:52:49 CEST 2007
On Saturday 25 August 2007 at 08:02, Michel Lespinasse wrote:
> One can make an accurate enough IDCT using VIS, the 8x16 bit multiplies
> make that very hard but not quite entirely impossible.
> I finally dug out the old test code I had written in September 2003;
> at the time David Miller was interested in converting it to asm, and then
> he got busy with other things. It works on columns, then transposes
> the table and does another pass. David thought he could make it faster
> than the VIS one, but hey, talk is cheap :) The test code is in C but
> I believe the muls/mulu functions match what VIS implements. idct() is
> what you'd want to be fast, idct_vis() is the C function I used to hook
> this into my IEEE1180 test framework, it reorders input coefficients
> and preshifts them by 4 (libmpeg2 prescales IDCT during stream parsing).
> Don't know if there is any interest or how it compares with the simple-idct
> derived code - but, it is (barely) IEEE1180 compliant and does not use
> 32-bit multiplies.
I glanced over it, and speed-wise it should be roughly the same as my
simple_idct_vis (about the same number of operations).
Unfortunately, I see a problem: you are using unsigned multiplies, which
are AFAIK not available on SPARC. This also means that the code might not
actually comply with IEEE1180, because you are using the sign bit for data,
which you can't do.
I hope I didn't write something very stupid, though.
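
To make the sign-bit concern concrete, here is a small sketch of how the same 16-bit pattern gives a different product depending on whether the multiply treats it as signed or unsigned. The helpers mul_hi_signed and mul_hi_unsigned are hypothetical names for this illustration; they are not the muls/mulu functions from the test code, and they do not claim to model what the VIS instructions actually do.

```c
#include <stdint.h>

/* Hypothetical helper: high 16 bits of a signed 16x16 product.
   Assumes an arithmetic right shift for negative values, which is
   implementation-defined in C but is what mainstream compilers do. */
int16_t mul_hi_signed(int16_t a, int16_t b)
{
    return (int16_t)(((int32_t)a * (int32_t)b) >> 16);
}

/* Hypothetical helper: high 16 bits of an unsigned 16x16 product. */
uint16_t mul_hi_unsigned(uint16_t a, uint16_t b)
{
    return (uint16_t)(((uint32_t)a * (uint32_t)b) >> 16);
}
```

The bit pattern 0xC000 is 49152 when read as unsigned and -16384 when read as signed. Multiplied by 0x4000, the unsigned helper returns 12288 and the signed one returns -4096. For operands below 0x8000 the two agree, so data that spills into the top bit only works if the multiply really is unsigned.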