[Ffmpeg-cvslog] r5975 - in trunk/libavcodec: dsputil.c dsputil.h i386/dsputil_mmx.c vorbis.c vorbis.h

Rich Felker dalias
Fri Aug 18 18:49:00 CEST 2006


On Thu, Aug 10, 2006 at 09:06:26PM +0200, lorenm wrote:
> +static void vector_fmul_3dnow(float *dst, const float *src, int len){
> +    long i;
> +    len >>= 1;
> +    for(i=0; i<len; i++) {
> +        asm volatile(
> +            "movq  %0, %%mm0 \n\t"
> +            "pfmul %1, %%mm0 \n\t"
> +            "movq  %%mm0, %0 \n\t"
> +            :"+m"(dst[i*2])
> +            :"m"(src[i*2])
> +            :"memory"
> +        );
> +    }
> +    asm volatile("femms");
> +}

Have you read the asm gcc generates for this? I would guess (though I
have not tested it) that writing the loop itself in asm would be faster
than wrapping a per-iteration asm block in one of gcc's for loops.
Writing the loop yourself also lets you unroll it slightly and
interleave paired iterations or, if nothing else, interleave the
pointer increment ops with the 3DNow! ops.
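
For illustration, the suggestion above might look roughly like the
following. This is only a sketch and has not been benchmarked; it is not
code from the patch. The names vector_fmul_sketch and
vector_fmul_c_unrolled and the HAVE_3DNOW_ASM guard are made up here,
and the code assumes len is a positive multiple of 4. The asm variant
keeps the whole loop inside one asm block, handles two 64-bit pairs per
pass, and counts a byte offset down so that the sub/jge pair doubles as
the loop test. The plain C variant shows the same loop structure for
machines without 3DNow!.

```c
#include <assert.h>

/* Portable model of the loop shape: two float pairs per pass,
 * all loads first, then the multiplies, then the stores. */
static void vector_fmul_c_unrolled(float *dst, const float *src, int len)
{
    long i;
    for (i = len - 4; i >= 0; i -= 4) {
        float a0 = dst[i],     a1 = dst[i + 1];   /* first pair  */
        float b0 = dst[i + 2], b1 = dst[i + 3];   /* second pair */
        a0 *= src[i];     a1 *= src[i + 1];
        b0 *= src[i + 2]; b1 *= src[i + 3];
        dst[i]     = a0;  dst[i + 1] = a1;
        dst[i + 2] = b0;  dst[i + 3] = b1;
    }
}

#ifdef HAVE_3DNOW_ASM   /* hypothetical guard: define only on 3DNow! CPUs */
/* Same loop written entirely in asm. %0 is a byte offset that starts
 * at the last 16-byte chunk and is decremented down to 0. */
static void vector_fmul_3dnow_loop(float *dst, const float *src, int len)
{
    long i = (len - 4) * 4L;
    __asm__ volatile(
        "1:                        \n\t"
        "movq    (%1,%0), %%mm0    \n\t"   /* load pair A          */
        "movq   8(%1,%0), %%mm1    \n\t"   /* load pair B          */
        "pfmul   (%2,%0), %%mm0    \n\t"   /* A *= src pair        */
        "pfmul  8(%2,%0), %%mm1    \n\t"   /* B *= src pair        */
        "movq   %%mm0,  (%1,%0)    \n\t"   /* store pair A         */
        "movq   %%mm1, 8(%1,%0)    \n\t"   /* store pair B         */
        "sub    $16, %0            \n\t"   /* step and set flags   */
        "jge    1b                 \n\t"   /* loop while offset>=0 */
        "femms                     \n\t"
        : "+r"(i)
        : "r"(dst), "r"(src)
        : "memory"
    );
}
#endif

/* dst[k] *= src[k] for k in [0, len); len must be a positive multiple of 4. */
static void vector_fmul_sketch(float *dst, const float *src, int len)
{
    assert(len >= 4 && len % 4 == 0);
#ifdef HAVE_3DNOW_ASM
    vector_fmul_3dnow_loop(dst, src, len);
#else
    vector_fmul_c_unrolled(dst, src, len);
#endif
}
```

Whether this actually beats the loop gcc generates would still have to
be measured on the target CPU.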

Rich




