[FFmpeg-devel] PATCH: av_strtod
Tue Jun 2 05:13:51 CEST 2009
>> Well, for that file, instead of storing pointers to labels and doing
repetitive jumps in the loop, I wrote mlpdsp_template.c, which uses the
C preprocessor to create 8*4 functions mlp_filter_channel_x86_X_Y, and I
use a table of function pointers to call a function based on X and Y.
That was just about the only way I could make that function work on for
> Have you benchmarked the code?
> It was much slower (even slower than the for() loops in C) to have 8*4
> functions because of instruction cache misses.
I tested again with the attached patch, and it's not "much" slower like I
said before, but actually ~2% slower for x86_64 and ~6% slower for
x86_32. It also adds 8k of object code.
Besides, you don't "have to" make it work on the Intel compiler. It is just
an optimization. Optimizations can easily be disabled if the compiler
doesn't like them (for example, x86 optimizations are disabled for ppc).
Everything worked fine without them there, just a little bit slower.
It didn't take that much time for me to update it.
I regularly update my svn view and check new changes/updates and rebuild
With that mlpdsp code I was surprised that it compiles with gcc at all ;) I
didn't benchmark that code, but I thought that it would be even faster
because there wouldn't be any more jumps inside the loop. Probably if the
functions are called in an order like 4_3, 3_2, 2_1 and if it looped just
2-3 times in each of the functions... Still, I don't think it should be
slower. Theoretically, with profile-guided optimization the compiler should
produce more efficient code by placing the functions in the proper order.
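
For illustration only, the function-table idea discussed above might look
roughly like the sketch below. It is not the code from mlpdsp_template.c:
the filter body is a trivial stand-in, the grid is reduced to 2*2 instead
of 8*4, and the names DEFINE_FILTER, filter_X_Y, filter_table and
run_filter are all hypothetical.

```c
#include <assert.h>

/* Hypothetical stand-in for the real filter body. The loop bounds are
 * compile-time constants, so the compiler is free to unroll each
 * instance without any jumps into the middle of a loop. */
#define DEFINE_FILTER(FIR, IIR)                                        \
    static int filter_##FIR##_##IIR(const int *fir, const int *iir)    \
    {                                                                  \
        int acc = 0;                                                   \
        for (int i = 0; i < FIR; i++)                                  \
            acc += fir[i];                                             \
        for (int i = 0; i < IIR; i++)                                  \
            acc -= iir[i];                                             \
        return acc;                                                    \
    }

/* One specialised function per (FIR order, IIR order) pair. */
DEFINE_FILTER(1, 1)
DEFINE_FILTER(1, 2)
DEFINE_FILTER(2, 1)
DEFINE_FILTER(2, 2)

typedef int (*filter_fn)(const int *fir, const int *iir);

/* Table indexed by [fir_order - 1][iir_order - 1]. */
static const filter_fn filter_table[2][2] = {
    { filter_1_1, filter_1_2 },
    { filter_2_1, filter_2_2 },
};

/* Dispatch through the table instead of jumping to a stored label. */
static int run_filter(int fir_order, int iir_order,
                      const int *fir, const int *iir)
{
    assert(fir_order >= 1 && fir_order <= 2);
    assert(iir_order >= 1 && iir_order <= 2);
    return filter_table[fir_order - 1][iir_order - 1](fir, iir);
}
```

A caller would pick the orders at run time, for example
run_filter(2, 1, fir, iir). The trade-off is the one discussed in this
thread: every (X, Y) pair gets its own copy of the code, which avoids
label pointers but increases object size and instruction cache pressure.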
I know I could just disable it, but I didn't disable anything. However,
I forgot to mention in my previous post that I had to disable all 3dnow
code, because icl refuses any 3dnow instruction :)
./configure with disable-3dnow doesn't work at all. I think that any
3dnow code has to be guarded by something like #if 3DNOW_ENABLE, and the
same goes for mmx, sse, etc. Otherwise this configure option should be
trashed, since it may make people think that the resulting binary doesn't
contain 3dnow code, which isn't correct at all.
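
As a rough sketch of the kind of guard meant here, assuming a build-time
macro that configure would set to 0 or 1: the macro name ENABLE_3DNOW and
the functions add_c, add_3dnow and select_add are hypothetical, not
FFmpeg's actual names, and the 3dnow body is only a placeholder.

```c
#include <assert.h>

/* Hypothetical build-time switch. A configure script would define it as
 * 0 or 1; it defaults to 0 here so the sketch compiles anywhere. */
#ifndef ENABLE_3DNOW
#define ENABLE_3DNOW 0
#endif

typedef void (*add_fn)(float *dst, const float *src, int n);

/* Portable C fallback, always compiled. */
static void add_c(float *dst, const float *src, int n)
{
    assert(n >= 0);
    for (int i = 0; i < n; i++)
        dst[i] += src[i];
}

#if ENABLE_3DNOW
/* Placeholder for the 3dnow version. The real inline asm would live
 * here; this sketch just forwards to the C code. Because the whole
 * function is inside the guard, nothing 3dnow-specific reaches the
 * compiler when the option is disabled. */
static void add_3dnow(float *dst, const float *src, int n)
{
    add_c(dst, src, n);
}
#endif

/* Pick the implementation according to the build-time switch. */
static add_fn select_add(void)
{
#if ENABLE_3DNOW
    return add_3dnow;
#else
    return add_c;
#endif
}
```

With the guard around both the definition and every reference, building
with the switch set to 0 leaves no 3dnow code in the binary, which is
what someone passing a disable option to configure would expect.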