[FFmpeg-devel] [PATCH/RFC] intreadwrite.h rewrite

Måns Rullgård mans
Mon Apr 6 12:59:25 CEST 2009


Luca Barbato <lu_zero at gentoo.org> writes:

> Måns Rullgård wrote:
>> I would like to propose a rework of intreadwrite.h.  This new version
>> supports per-arch implementations of the various macros allowing us to
>> take advantage of special instructions or other properties the
>> compiler does not know about.
>>
>> ARMv6 and later support unaligned loads and stores for single
>> word/halfword but not double/multiple.  GCC is ignorant of this and
>> will always use bytewise accesses for unaligned data.  Casting to an
>> int32_t pointer is dangerous since a load/store double or multiple
>> instruction might be used (this happens with some code in FFmpeg).
>> Implementing the AV_[RW]* macros with inline asm using only supported
>> instructions gives fast and safe unaligned accesses.  This gives an
>> overall speedup of up to 10% in some cases.
>>
>> PPC is normally big endian but has special little endian load/store
>> instructions.  Using these avoids a separate byteswap.  This makes the
>> vorbis decoder about 5% faster.  Not much else uses little-endian
>> read/write extensively.  GCC generates horrible PPC code for the
>> default AV_[RW]B64 (which uses a packed struct), so I have overridden
>> it with a plain pointer cast.
>>
>> For other architectures the definitions of these macros should remain
>> unchanged.
>>
>> I'm attaching the complete files instead of a diff since this is
>> largely a rewrite.  Sorry for attaching three files with the same
>> name; I'm sure you can work out which is which.
>
> Nice, I assume HAVE_LDBRX comes from a configure check, doesn't it?

Yes, it will be set in configure.  The stupid GNU assembler accepts the
instruction no matter what CPU is selected though, so I have to enable
it manually for those CPUs that have it.  Does any CPU other than the
Cell have this instruction?
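
For illustration, the per-arch override scheme described in the quoted
proposal can be sketched roughly as below.  This is not the code from
the attached files: the helper names rl32_ppc and rl32_generic are
hypothetical, the PPC branch is an untested assumption about how a
byte-reversed load might be wrapped, and the fallback is a plain
bytewise read rather than the packed-struct default mentioned above.

```c
#include <stdint.h>

/* Hypothetical arch override.  In a real tree this would sit in an
 * arch-specific header included before the generic fallbacks. */
#if defined(__GNUC__) && defined(__BIG_ENDIAN__) && \
    (defined(__powerpc__) || defined(__PPC__))
static inline uint32_t rl32_ppc(const void *p)
{
    uint32_t v;
    /* lwbrx loads a word byte-reversed, so no separate swap is needed.
     * Assumption: untested sketch, not the attached implementation. */
    __asm__ ("lwbrx %0, %y1" : "=r"(v) : "Z"(*(const uint32_t *)p));
    return v;
}
#define AV_RL32(p) rl32_ppc(p)
#endif

/* Generic fallback, used only when no arch header defined the macro.
 * Bytewise access is safe for any alignment and any host endianness. */
#ifndef AV_RL32
static inline uint32_t rl32_generic(const void *p)
{
    const uint8_t *b = p;
    return  (uint32_t)b[0]       | (uint32_t)b[1] <<  8 |
            (uint32_t)b[2] << 16 | (uint32_t)b[3] << 24;
}
#define AV_RL32(p) rl32_generic(p)
#endif
```

Callers only ever see AV_RL32(ptr), so an architecture can swap in a
faster implementation without touching the decoders that use it.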

-- 
Måns Rullgård
mans at mansr.com


