> >if one removes the crippling
> >-fno-tree-vectorize
> Yes, I think a config option to turn this flag on (like the unsafe
> bitstream reader) would be good. Defaulting to off by default if it doesn't
> break anything for at least a few people (and compilers) who test it. It's
> not a big performance impact but every little bit counts nowadays.

FWIW, I recently (i.e. 2 days ago) ran some tests with auto-vectorization
on a few compilers. Fortunately, none of the compilers I tested miscompiled
anything, at least as far as FATE can tell:

Compile time:
clang3.7         4m3.034s
gcc5vectorize    5m50.637s  (1.14x gcc5)
gcc5             5m7.262s
gcc4.9vectorize  5m29.669s  (1.11x gcc4.9)
gcc4.9           4m54.602s
gcc4.8vectorize  5m18.848s  (1.08x gcc4.8)
gcc4.8           4m53.940s

FATE run time:
clang3.7         3m13.923s
gcc5vectorize    3m5.988s   (0.980x gcc5)
gcc5             3m9.618s
gcc4.9vectorize  3m12.880s  (0.981x gcc4.9)
gcc4.9           3m16.563s
gcc4.8vectorize  3m10.321s  (0.993x gcc4.8)
gcc4.8           3m11.608s

Tested with:
- Debian jessie/stable/8.2
- Dual-core Haswell i7 ultra low voltage
- clang-3.7 3.7.0-svn251177-1~exp1 (from the official clang apt repo)
- gcc-5 (Debian 5.2.1-22) 5.2.1 20151010 (Debian testing stock)
- gcc-4.9 (Debian 4.9.2-10) 4.9.2 (Debian stable stock)
- gcc-4.8 (Debian 4.8.4-1) 4.8.4 (Debian stable stock)

Note that FATE is probably the worst benchmark one can find, but it does
show something.
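
For anyone who wants to recheck the ratios in parentheses: each one is the
vectorized time divided by the plain time for the same compiler. Below is a
minimal sketch in Python 3, not necessarily how the numbers were originally
computed. The helper names (to_seconds, ratio) and the dict names are made up
for this example; the timings are copied from the two tables above.

```python
import re

# Timings copied from the two tables above as (plain, vectorized) pairs.
# FIRST_TABLE / SECOND_TABLE are illustrative names, not from any tool.
FIRST_TABLE = {
    "gcc5": ("5m7.262s", "5m50.637s"),
    "gcc4.9": ("4m54.602s", "5m29.669s"),
    "gcc4.8": ("4m53.940s", "5m18.848s"),
}
SECOND_TABLE = {
    "gcc5": ("3m9.618s", "3m5.988s"),
    "gcc4.9": ("3m16.563s", "3m12.880s"),
    "gcc4.8": ("3m11.608s", "3m10.321s"),
}


def to_seconds(stamp):
    """Convert a string such as '5m50.637s' (or '12.5s') to seconds."""
    match = re.fullmatch(r"(?:(\d+)m)?(\d+(?:\.\d+)?)s", stamp)
    if match is None:
        raise ValueError(f"unrecognized time format: {stamp!r}")
    minutes = int(match.group(1) or 0)
    return 60 * minutes + float(match.group(2))


def ratio(plain, vectorized):
    """Vectorized time relative to plain time; above 1.0 means slower."""
    return to_seconds(vectorized) / to_seconds(plain)


if __name__ == "__main__":
    for name, table in (("first", FIRST_TABLE), ("second", SECOND_TABLE)):
        for compiler, (plain, vectorized) in table.items():
            print(f"{name} table, {compiler}: {ratio(plain, vectorized):.3f}x")
```

A value above 1.0 means the vectorized build took longer than the plain one,
a value below 1.0 means it was faster.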

Some observations:

- GCC vectorization slows down compilation A LOT in all versions, and the
newer the GCC, the worse it gets.
- If you are developing, use clang, and DON'T use GCC 5 with vectorization.
- For release builds, an option to turn it on (or rather to not turn it
off) would be helpful; but if you really care about performance _that_ much
then you should probably use a different compiler instead.

FYI, as I have told Ganesh in our private exchanges, I did also test
vectorization on GCC 4.6 on an Ubuntu 12.04/Precise box, which miscompiled
the code hilariously, _and_ made the code slower, just as illustrated in
Mans's commit message.


