[FFmpeg-devel] [RFC] Loop unrolling in C code for 'vector_fmul_*' functions
Fri Jan 11 04:16:43 CET 2008
On Thu, Jan 10, 2008 at 09:23:19PM -0500, Rich Felker wrote:
> On Thu, Jan 10, 2008 at 09:39:47PM +0100, Michael Niedermayer wrote:
> > > By tweaking the C code, performance can be improved quite a lot
> > > ('vector_fmul_c_other_unrolled' vs. 'vector_fmul_c_unrolled').
> > > But unnecessarily cluttering the code like this because of inefficient
> > > compilers is not a good option. Anyway, perhaps the loops at least can
> > > be unrolled to help the compiler do its job? The compiler itself does not
> > > know that 'len is a multiple of 8', so manual loop unrolling seems reasonable.
> > Add an assert((len & 7) == 0); and the compiler can know it.
> I doubt it will use it though.
I didn't say that current gcc will use it, although I don't have any
information saying that it won't use it either.
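
For illustration, the assert() suggestion could look roughly like the sketch below. This is not the actual FFmpeg code: the name vector_fmul_sketch and the exact loop body are assumptions made for the example. As noted above, nothing guarantees that a given compiler will use the asserted fact when deciding whether to unroll.

```c
#include <assert.h>

/* Hypothetical stand-in for a vector_fmul_* style routine: dst[i] *= src[i].
 * Assumption for this sketch: callers always pass a len that is a
 * multiple of 8. */
static void vector_fmul_sketch(float *dst, const float *src, int len)
{
    int i;

    /* States the precondition for the reader and checks it in debug builds.
     * Whether the compiler exploits it for unrolling is not guaranteed,
     * and with NDEBUG defined the check disappears entirely. */
    assert((len & 7) == 0);

    for (i = 0; i < len; i++)
        dst[i] *= src[i];
}
```

The loop itself stays a plain i < len loop, so the source is not cluttered; the assert() only records what the caller promises.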
> Instead why not mask off the low bits
> or right-shift it to be a direct iteration count? Then it's obvious
Because that makes the code slower (and uglier). Also, no one said that gcc
would unroll it if any of these were done ...
> for the compiler. While I agree that the coder should not have to
> hand-schedule instructions in C code, I think it's quite reasonable
> for the coder to write code in a way that minimizes the amount of
> intelligence needed to generate good asm. Not only does this help the
> compiler; following that principle also tends to make assumptions more
> clear to humans reading the code.
Putting a & ~7 there would make the reader believe that the bits may be nonzero.
IMHO the assert() is clearer.
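
For comparison, the two alternatives discussed above (masking off the low bits, or right-shifting len into a direct iteration count) might be sketched as follows. The function names fmul_masked and fmul_blocks are hypothetical and the bodies are illustrative only, not taken from the FFmpeg sources.

```c
#include <assert.h>

/* Variant 1: mask off the low bits of len.
 * If len is not a multiple of 8, the trailing elements are silently
 * skipped, which is why the mask suggests to a reader that such
 * lengths might actually occur. */
static void fmul_masked(float *dst, const float *src, int len)
{
    int i;

    for (i = 0; i < (len & ~7); i++)
        dst[i] *= src[i];
}

/* Variant 2: right-shift len into a count of 8-element blocks and
 * unroll the body by hand. */
static void fmul_blocks(float *dst, const float *src, int len)
{
    int n;

    for (n = len >> 3; n > 0; n--) {
        dst[0] *= src[0];
        dst[1] *= src[1];
        dst[2] *= src[2];
        dst[3] *= src[3];
        dst[4] *= src[4];
        dst[5] *= src[5];
        dst[6] *= src[6];
        dst[7] *= src[7];
        dst += 8;
        src += 8;
    }
}
```

When len really is a multiple of 8, both variants compute the same result as a plain loop. When it is not, both quietly drop the tail instead of reporting the broken precondition, which is the readability concern raised against them here.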
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
Observe your enemies, for they first find out your faults. -- Antisthenes