[FFmpeg-devel] Some IWMMXT functions for libavcodec
Michael Niedermayer
michaelni
Mon May 12 19:47:58 CEST 2008
On Mon, May 12, 2008 at 08:05:01PM +0400, Dmitry Antipov wrote:
> Hello,
>
> here is some libavcodec DSP code I've developed for XScale CPUs with Intel
> WMMX support.
>
> (At http://78.153.153.8/tmp/dspwmmx.c, there is also a small standalone
> validation & benchmark program for these functions).
Please post the benchmark results to the list as well.
[...]
> +static int vsad_intra16_iwmmxt(void *c, uint8_t *pix, uint8_t *dummy, int stride, int h)
> +{
> +    int s, i;
> +
> +    for (s = 0, i = 1; i < h; i++) {
> +        asm volatile("wldrd wr0, [%1] \n\t"
> +                     "wldrd wr1, [%2] \n\t"
> +                     "wsadbz wr1, wr0, wr1 \n\t"
> +                     "wldrd wr0, [%1, #8] \n\t"
> +                     "wldrd wr2, [%2, #8] \n\t"
> +                     "wsadbz wr2, wr0, wr2 \n\t"
> +                     "waddw wr1, wr1, wr2 \n\t"
> +                     "textrmsw r1, wr1, #0 \n\t"
> +                     "add %0, %0, r1 \n\t"
> +                     : "+r"(s)
> +                     : "r"(pix), "r"(pix + stride)
> +                     : "r1");
> +        pix += stride;
> +    }
Wrapping a C loop around an asm block like that is inefficient; the loop
itself should be written in asm.
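
For checking a rewritten version, a plain-C reference of what the quoted
loop computes may be useful. This is only a sketch and not code from the
patch: the name vsad_intra16_ref is made up here, and it assumes the asm
sums absolute byte differences between each 16-pixel row and the row below.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical plain-C reference (not from the patch): sum of absolute
 * differences between every 16-pixel row and the row below it. */
static int vsad_intra16_ref(const uint8_t *pix, int stride, int h)
{
    int s = 0;

    /* Starts at i = 1 like the quoted code: h rows give h - 1 row pairs. */
    for (int i = 1; i < h; i++) {
        for (int x = 0; x < 16; x++)
            s += abs(pix[x] - pix[x + stride]);
        pix += stride;
    }
    return s;
}
```

With the loop inside a single asm block, the row pointer, the counter and
the running sum can stay in registers for the whole function instead of
being handed in and out of an asm statement once per row.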
[...]
> +static int pix_abs8_y2_iwmmxt(void *v, uint8_t *pix1, uint8_t *pix2, int line_size, int h)
> +{
> +    int s, i;
> +
> +    for (s = 0, i = 0; i < h; i++) {
> +        asm volatile("wldrd wr0, [%2] \n\t"
> +                     "wldrd wr1, [%3] \n\t"
I don't know the WMMX instructions either, but I do know that one of
the loads is redundant.
Please keep in mind that a single suboptimal instruction means that
a patch is rejected! So please try to write optimal code even if
there is no one around who knows the instruction set.
> +                     "wavg2br wr0, wr0, wr1 \n\t"
> +                     "wldrd wr1, [%1] \n\t"
> +                     "wsadbz wr1, wr1, wr0 \n\t"
> +                     "textrmsw r1, wr1, #0 \n\t"
> +                     "add %0, %0, r1 \n\t"
Is this faster than a waddw?
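
To make the redundant load concrete: the second pix2 row loaded in one
iteration is presumably the first pix2 row of the next iteration, so only
one new pix2 row is needed per pass. A plain-C sketch of that structure
follows. The name pix_abs8_y2_ref is made up here, and the sketch assumes
that wavg2br is a rounding byte average, (a + b + 1) >> 1.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical plain-C model (not from the patch): 8-pixel-wide SAD of
 * pix1 against the rounded vertical average of two adjacent pix2 rows.
 * The local arrays stand in for registers. */
static int pix_abs8_y2_ref(const uint8_t *pix1, const uint8_t *pix2,
                           int line_size, int h)
{
    uint8_t cur[8], next[8];
    int s = 0;

    memcpy(cur, pix2, 8);          /* first pix2 row: loaded once, before the loop */
    for (int i = 0; i < h; i++) {
        pix2 += line_size;
        memcpy(next, pix2, 8);     /* the only pix2 load per iteration */
        for (int x = 0; x < 8; x++) {
            int avg = (cur[x] + next[x] + 1) >> 1;   /* assumed wavg2br behaviour */
            s += abs(pix1[x] - avg);
        }
        memcpy(cur, next, 8);      /* a register move in asm, not a memory load */
        pix1 += line_size;
    }
    return s;
}
```

In asm the final copy could probably be avoided as well, for example by
unrolling the loop twice and swapping the roles of the two registers.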
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
The worst form of inequality is to try to make unequal things equal.
-- Aristotle