[Ffmpeg-devel] PATCH Blackfin optimized byte swapping mechanism

Michael Niedermayer michaelni
Mon Apr 23 15:04:57 CEST 2007


Hi

On Tue, Apr 17, 2007 at 08:49:40AM -0400, Marc Hoffman wrote:
> Michael Niedermayer writes:
>  > Hi
>  > 
>  > On Tue, Apr 17, 2007 at 07:40:47AM -0400, Marc Hoffman wrote:
>  > > 
>  > >  > Low level bswap primitive for the Blackfin Architecture.
>  > > 
>  > > sorry mangled patch wrong encoding last time.
>  > 
>  > what advantage do these functions have over the default?
>  > are they faster? if so you should provide some benchmarks
> 
> Sorry about the top post please forgive me
> 
> The current 32bit byte swap routine produces this code sequence
> 
> 
>         R1 = 255 (X);
>         R1 <<= 16;
>         R1 = R0 & R1;
>         R2 = R0 >> 24;
>         R1 >>= 8;
>         R2 = R2 | R1;
>         R1 = 65280 (Z);
>         R1 = R0 & R1;
>         R1 <<= 8;
>         R0 <<= 24;
>         R1 = R1 | R0;
>         R2 = R2 | R1;
> 
>         R0 = R2; <<--- result 
> 
> The suggested replacement is
> 
>     asm("%1 = %0 >> 8 (V);\n\t"
>         "%0 = %0 << 8 (V);\n\t"
>         "%0 = %0 | %1;\n\t"
>         "%0 = PACK(%0.L, %0.H);\n\t"
> 
> So I guess this is about 300% improvement in performance for this function.
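
For readers following the two sequences quoted above, here is a rough portable C model of both. It is a sketch only, not code from the patch. It assumes that ">> 8 (V)" and "<< 8 (V)" shift each 16-bit half of the register independently (logical shift, no bits crossing between halves) and that PACK(a, b) puts a in the upper half and b in the lower half. The function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Generic shift-and-mask swap, the shape of the 12-instruction sequence. */
static uint32_t bswap32_generic(uint32_t x)
{
    return (x >> 24) | ((x & 0x00ff0000u) >> 8) |
           ((x & 0x0000ff00u) << 8) | (x << 24);
}

/* Model of the 4-instruction sequence (assumed semantics, see above). */
static uint32_t bswap32_model(uint32_t x)
{
    uint32_t hi = x >> 16;
    uint32_t lo = x & 0xffffu;

    /* %1 = %0 >> 8 (V): shift each 16-bit half right by 8 */
    uint32_t r1 = ((hi >> 8) << 16) | (lo >> 8);
    /* %0 = %0 << 8 (V): shift each half left by 8, truncated to 16 bits */
    uint32_t r0 = (((hi << 8) & 0xffffu) << 16) | ((lo << 8) & 0xffffu);
    /* %0 = %0 | %1: each half now has its two bytes exchanged */
    r0 |= r1;
    /* %0 = PACK(%0.L, %0.H): exchange the two halves */
    return (r0 << 16) | (r0 >> 16);
}

/* Checks that the model agrees with the generic swap on a few values. */
static void bswap32_selfcheck(void)
{
    static const uint32_t samples[] = {
        0u, 1u, 0x11223344u, 0x80000001u, 0xffffffffu
    };
    for (size_t i = 0; i < sizeof(samples) / sizeof(samples[0]); i++)
        assert(bswap32_model(samples[i]) == bswap32_generic(samples[i]));
}
```

The model only shows that the two sequences compute the same value under those assumptions. It says nothing about how many cycles either one takes on real hardware.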

A guess is good, a hard benchmark is better. It's just 5 minutes of work to
write a loop of bswap calls and run "time myprog".
Also, don't forget to set the proper -mcpu / -march and -O3 with gcc.
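
A minimal sketch of the kind of loop meant here, for illustration only. The names bswap32_c and bswap_loop are hypothetical, and the plain C swap stands in for whichever implementation is being measured.

```c
#include <stdint.h>

/* Plain C byte swap; replace with the implementation under test. */
static uint32_t bswap32_c(uint32_t x)
{
    return (x >> 24) | ((x & 0x00ff0000u) >> 8) |
           ((x & 0x0000ff00u) << 8) | (x << 24);
}

/*
 * Swaps the values 0 .. n-1 and sums the results. Returning the sum gives
 * the loop an observable result, so the compiler cannot simply drop it.
 */
static uint32_t bswap_loop(uint32_t n)
{
    uint32_t acc = 0;
    for (uint32_t i = 0; i < n; i++)
        acc += bswap32_c(i);
    return acc;
}
```

To use it, call bswap_loop from main with a large count (ideally taken from the command line so it is not a compile-time constant), print the return value, and run the program under time once per implementation with the same compiler flags.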

[...]

-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

No snowflake in an avalanche ever feels responsible. -- Voltaire