[FFmpeg-devel] [PATCH] Further optimization of base64 decode using AV_WB32.
Reimar Döffinger
Reimar.Doeffinger at gmx.de
Sat Jan 21 18:13:19 CET 2012
On Sat, Jan 21, 2012 at 05:56:32PM +0100, Michael Niedermayer wrote:
> On Sat, Jan 21, 2012 at 05:52:27PM +0100, Reimar Döffinger wrote:
> > This is somewhat questionable.
> > The biggest issue is that av_bswap32 is not replaced
> > with our asm version on gcc 4.5 or newer.
> > This causes gcc to generate horrible code that is slower
> > than the unoptimized variant.
> > Old: 248852 decicycles
> > New with gcc's attempt at av_bswap32: 256576 decicycles
> > New with our bswap32: 200260 decicycles
> [...]
> > diff --git a/libavutil/x86/bswap.h b/libavutil/x86/bswap.h
> > index 52ffb4d..aa39d97 100644
> > --- a/libavutil/x86/bswap.h
> > +++ b/libavutil/x86/bswap.h
> > @@ -37,7 +37,7 @@ static av_always_inline av_const unsigned av_bswap16(unsigned x)
> > }
> > #endif /* !AV_GCC_VERSION_AT_LEAST(4,1) */
> >
> > -#if !AV_GCC_VERSION_AT_LEAST(4,5)
> > +#if 1 || !AV_GCC_VERSION_AT_LEAST(4,5)
> > #define av_bswap32 av_bswap32
> > static av_always_inline av_const uint32_t av_bswap32(uint32_t x)
> > {
>
> also make sure -cpu/arch/tune is set so gcc is allowed to use bswap
> (it's 486+, so gcc cannot use it on strict x86)
It is an x86_64 build, so I'd hope that gcc will not try to "optimize"
for 486 on that...
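
For readers without the full patch at hand, here is a rough sketch of the
general idea being measured: decode four base64 characters into 24 bits and
emit them with one big-endian 32-bit store, which on a little-endian host
comes down to a byte swap plus a plain store. This is not the patch itself.
All sketch_* names are hypothetical stand-ins (sketch_wb32 only mimics the
spirit of AV_WB32), padding ('=') is deliberately not handled, and the output
buffer is assumed to have one spare byte because each store writes four bytes.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Portable byte swap (hypothetical helper). Whether the compiler turns
 * this into a single bswap instruction depends on compiler and flags. */
static inline uint32_t sketch_bswap32(uint32_t x)
{
    return (x >> 24) | ((x >> 8) & 0x0000ff00u) |
           ((x << 8) & 0x00ff0000u) | (x << 24);
}

/* Store v big-endian at p: swap on little-endian hosts, then do one
 * 32-bit store via memcpy. A stand-in for AV_WB32, not the real macro. */
static inline void sketch_wb32(uint8_t *p, uint32_t v)
{
    const uint16_t probe = 1;
    uint8_t first;
    memcpy(&first, &probe, 1);
    if (first)                  /* little-endian host */
        v = sketch_bswap32(v);
    memcpy(p, &v, 4);
}

/* Map one base64 character to its 6-bit value, or -1 if invalid. */
static int sketch_b64_value(unsigned char c)
{
    if (c >= 'A' && c <= 'Z') return c - 'A';
    if (c >= 'a' && c <= 'z') return c - 'a' + 26;
    if (c >= '0' && c <= '9') return c - '0' + 52;
    if (c == '+') return 62;
    if (c == '/') return 63;
    return -1;
}

/* Decode unpadded base64 (length must be a multiple of 4).
 * out_size must be at least in_len / 4 * 3 + 1.
 * Returns the number of decoded bytes, or -1 on error. */
static int sketch_b64_decode(uint8_t *out, size_t out_size,
                             const char *in, size_t in_len)
{
    size_t groups = in_len / 4, i;

    if (in_len % 4 || out_size < groups * 3 + 1)
        return -1;
    for (i = 0; i < groups; i++) {
        uint32_t v = 0;
        int j;
        for (j = 0; j < 4; j++) {
            int d = sketch_b64_value((unsigned char)in[4 * i + j]);
            if (d < 0)
                return -1;
            v = (v << 6) | (uint32_t)d;
        }
        /* The 24 payload bits land in the top three bytes; the fourth
         * byte written is scratch and is overwritten by the next group. */
        sketch_wb32(out + 3 * i, v << 8);
    }
    return (int)(groups * 3);
}
```

With this layout the cost of the store path depends on how well the byte
swap is compiled, which is why the choice of av_bswap32 implementation shows
up so clearly in the numbers quoted above.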