[FFmpeg-devel] [PATCH] fix for roundup issue 2127

Michael Niedermayer michaelni
Sun Jan 2 05:33:12 CET 2011


On Sat, Jan 01, 2011 at 10:52:41PM -0500, Daniel Kang wrote:
> On Sat, Jan 1, 2011 at 10:33 PM, Michael Niedermayer <michaelni at gmx.at> wrote:
> 
> > > I added the casts because I thought it is clearer this way. Removing the
> > > casts but keeping the transpose4x4 function arguments the same does not
> > > give the errors. Should I do this instead?
> >
> > yes
> 
> 
> I have updated the patch without the casts.

>  dsputil_mmx.h |   36 ++++++++++++++++++------------------
>  1 file changed, 18 insertions(+), 18 deletions(-)
> 024259a0e21988b36813d38fe935cc685232cd73  fix.diff
> From 0478bebde326fe119961f308b49ca49b721ac794 Mon Sep 17 00:00:00 2001
> From: Daniel Kang <daniel.d.kang at gmail.com>
> Date: Wed, 29 Dec 2010 22:06:38 -0500
> Subject: [PATCH] 2127 fix
> 
> ---
>  libavcodec/x86/dsputil_mmx.h |   36 ++++++++++++++++++------------------
>  1 files changed, 18 insertions(+), 18 deletions(-)
> 
> diff --git a/libavcodec/x86/dsputil_mmx.h b/libavcodec/x86/dsputil_mmx.h
> index d9c2f44..d7a828d 100644
> --- a/libavcodec/x86/dsputil_mmx.h
> +++ b/libavcodec/x86/dsputil_mmx.h
> @@ -24,6 +24,7 @@
> 
>  #include <stdint.h>
>  #include "libavcodec/dsputil.h"
> +#include "libavutil/x86_cpu.h"
> 
>  typedef struct { uint64_t a, b; } xmm_reg;
> 
> @@ -94,32 +95,31 @@ extern const double ff_pd_2[2];
>      SBUTTERFLY(a,c,d,dq,q) /* a=aeim d=bfjn */\
>      SBUTTERFLY(t,b,c,dq,q) /* t=cgko c=dhlp */
> 
> -static inline void transpose4x4(uint8_t *dst, uint8_t *src, int dst_stride, int src_stride){
> +static inline void transpose4x4(uint8_t *dst, uint8_t *src, x86_reg dst_stride, x86_reg src_stride){
>      __asm__ volatile( //FIXME could save 1 instruction if done as 8x4 ...
> -        "movd  %4, %%mm0                \n\t"
> -        "movd  %5, %%mm1                \n\t"
> -        "movd  %6, %%mm2                \n\t"
> -        "movd  %7, %%mm3                \n\t"
> +        "movd  (%3), %%mm0              \n\t"
> +        "movd  (%3,%1), %%mm1           \n\t"
> +        "movd  (%3,%1,2), %%mm2         \n\t"
> +        "lea   (%1,%1,2), %1            \n\t"
> +        "movd  (%3,%1), %%mm3           \n\t"

Something like this:
"movd  (%3), %%mm0              \n\t"
"add   %1, %3                   \n\t"
"movd  (%3), %%mm1              \n\t"
"movd  (%3,%1), %%mm2           \n\t"
"movd  (%3,%1,2), %%mm3         \n\t"

would replace the lea with an add, which is faster on some CPUs.
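
To check that the two load sequences touch the same four rows, the address arithmetic can be modelled in plain C. This is only a sketch, not part of the patch: it assumes %3 holds src and %1 holds src_stride (the visible part of the diff suggests this but does not show the operand list), and the helper names load32, load_rows_lea, load_rows_add and sequences_agree are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: models "movd (addr), %%mmN", i.e. a 4-byte load. */
static uint32_t load32(const uint8_t *addr)
{
    uint32_t v;
    memcpy(&v, addr, sizeof(v));
    return v;
}

/* Models the sequence in the patch: src stays fixed, the stride is tripled by lea. */
static void load_rows_lea(const uint8_t *src, ptrdiff_t stride, uint32_t out[4])
{
    out[0] = load32(src);               /* movd (%3), %%mm0      */
    out[1] = load32(src + stride);      /* movd (%3,%1), %%mm1   */
    out[2] = load32(src + 2 * stride);  /* movd (%3,%1,2), %%mm2 */
    stride = stride + 2 * stride;       /* lea  (%1,%1,2), %1    */
    out[3] = load32(src + stride);      /* movd (%3,%1), %%mm3   */
}

/* Models the suggested sequence: src is advanced once by add, the stride is untouched. */
static void load_rows_add(const uint8_t *src, ptrdiff_t stride, uint32_t out[4])
{
    out[0] = load32(src);               /* movd (%3), %%mm0      */
    src += stride;                      /* add  %1, %3           */
    out[1] = load32(src);               /* movd (%3), %%mm1      */
    out[2] = load32(src + stride);      /* movd (%3,%1), %%mm2   */
    out[3] = load32(src + 2 * stride);  /* movd (%3,%1,2), %%mm3 */
}

/* Returns 1 when both sequences read the same four rows of a 4x4 block. */
static int sequences_agree(const uint8_t *src, ptrdiff_t stride)
{
    uint32_t a[4], b[4];
    assert(stride >= 4);                /* rows of a 4x4 block must not overlap */
    load_rows_lea(src, stride, a);
    load_rows_add(src, stride, b);
    return memcmp(a, b, sizeof(a)) == 0;
}
```

Both variants read src, src+stride, src+2*stride and src+3*stride. The difference is which register gets clobbered: the lea version overwrites the stride, the add version overwrites the source pointer. In GCC inline asm, whichever operand is written has to be declared as a read-write operand.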

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Incandescent light bulbs waste a lot of energy as heat so the EU forbids them.
Their replacement, compact fluorescent lamps, much more expensive, don't fit in
many old lamps, flicker, contain toxic mercury, produce a fraction of the light
that is claimed and in an unnatural spectrum rendering colors different than
in natural light. Ah and we now need to turn the heaters up more in winter to
compensate the lower wasted heat. Who wins? Not the environment, that's for sure