[FFmpeg-devel] [PATCH] generic infrastructure to support IWMMXT2

Fri May 23 15:07:16 CEST 2008

On Friday 23 May 2008, Antipov Dmitry wrote:
> 22.05.08, 23:55, "Siarhei Siamashka" <siarhei.siamashka at gmail.com>:
> > Your change actually enforces the use of iwmmxt and iwmmxt2 by default
> > for everyone unless it is explicitly disabled with "--disable-iwmmxt*".
> > It is different from the logic that has been used so far (I'm not
> > implying that the current logic is perfect).
>
> I agree that two (mutually exclusive in most of the cases) types of logic
> are applicable here - a) choose the best automatically, downgrade if user
> wants and b) follow user's options strictly, assuming the user knows what's
> going on here.
>
> I believe that the 'end-user' software should use a) and not b). For
> example, MPlayer - being natively configured on x86, it's configure will
> try to select the best from MMX/MMXEXT/3DNOW/3DNOWEXT/SSE/SSE2/etc., and
> advanced user may specify '--disable-THING'.
>
> This patch suspects that FFmpeg is 'end-user' software (:-)), so it tries
> to follow a). If the compiler/assembler understands -march=iwmmxt2, let's
> use it. If not, let's try -march=iwmmxt. Advanced user may issue
> ./configure --help and enable/disable things according to personal needs.

Well, end users don't need the instructions supported by their toolchain
(modern toolchain support all the architecture extensions by the way), but 
only the instructions supported by their devices for practical reasons.

Your patch requires me to explicitly specify --disable-iwmmxt and 
--disable-iwmmxt2 configure options when I want to compile FFmpeg for 
ARM11, otherwise I will get a non-working build which will crash on my 
device. The other way around, if a similar stuff gets added to configure 
for ARMv6 support, you will be also forced to explicitly specify 
--disable-armv6 configure option if you want to compile FFmpeg for your
PXA310 device, or you are screwed up as well.

End users here are more likely the developers who use some kind of SDK 
for crosscompiling software to some kind of embedded device. The toolchain
included in SDK may support predefined -march/-mtune options using spec file
for user convenience.  If -march option is not predefined, it is up to the
user to provide correct optimization options if he cares for performance (if
he does not, a generic armv4 build will be compiled). Enabling the use of some
instructions just because the toolchain supports them is IMHO a bad choice
and a way to get lots of troubles.

Also compiling-in support for all the instructions and adding runtime
autodetection like it is done for x86 is not practical on ARM (even if we
could have an absolutely reliable, portable and non-ugly way of getting the
list of supported extensions from userspace). Just because you may want to use
ARMv6 REV instruction (changing endianness) and some macros from mathops.h all
over the code. Another more important choice is whether you want to use
hardware floating point math or software emulation. You can't easily and
efficiently support them both in a single FFmpeg binary.

Please just make your configure patch aligned with the rest of ARM instruction
set extensions detection for now. You may provide some improvements to
all this stuff in a separate patch.

> > This part does not make much sense. The problem was that the toolchain
> > actually silently generated wrong code if the user specified
> > "-march=iwmmxt" (instead of "-march=iwmmxt2") and this instruction is
> > used somewhere in inline assembly code (a bug in HAVE_IWMMXT/HAVE_IWMMXT2
> > ifdef selection in the code or if the user also enforced --enable-iwmmxt2
> > for whatever reason and it got enabled).
>
> This is the problem, really. Actually, we need to run cross-objdump (most
> probably it's ${cross_prefix}${objdump}) here to verify that the generated
> code is correct.

We probably just need to submit a bugreport to binutils maintainers. The 
effect on FFmpeg is mostly that you don't have a convenient compiler error 
if you accidently use some WLDRD with register offset addressing under
HAVE_IWMMXT define (in the code intended for old cores).

> > If the toolchain is old enough and does not support iwmmxt2 at all,
> > it will fail on the iwmmxt2 support check anyway (the part which checks
> > for "wavg4" instruction). That is if I understand current situation
> > correctly.
>
> This is another problem. For example, I have two toolchains for now:
>  - old, with gcc 3.4.3, configured with '--with-cpu=iwmmxt
> --with-arch=iwmmxt2 --with-tune=iwmmxt2'
>  - new, with gcc 4.2.0, configured with '--with-cpu=arm10tdmi
> --with-interwork --with-arch=armv5 --with-tune=arm10tdmi'
>
> So, the first one is targeted for XScale core with WMMX2 by default - being
> run without arguments, it does '-march=iwmmxt2' automatically. But new
> toolchain needs explicit '-march=iwmmxt2' in the command line since it's
> targeted to another CPU by default.
>
> The problem is that even the old toolchain supports WMMX2 - it passes
> 'wavg4' test. But it's broken and can't compile pre-increment expressions:
>
> /tmp/ccC9bIBm.s: Assembler messages:
> /tmp/ccC9bIBm.s:23: Error: immediate expression expected -- `wldrd
> wr0,[r0,r1]!'
>
> So, this piece of configure protects the user who wants WMMX2 from using
> too old (and so broken) toolchain.

OK, so it's a bit more broken than I expected :)

It's probably better to use both WAVG4 and WLDRD instructions with register
offset addressing in IWMMXT2 detection (so that this check will only succeed
if the toolchain can compile both new instructions and a new addressing mode
for old instructions). That will save a bit of code. Something like:

enabled iwmmxt2 && check_asm iwmmxt2 \
    '"wavg4 wr0, wr1, wr2 \n wldrd wr0, [r0, r1]!"'

-- 
Best regards,
Siarhei Siamashka