[FFmpeg-devel] [PATCH] ff_scalarproduct_float_sse

Måns Rullgård mans
Wed Jan 20 22:31:14 CET 2010

Michael Niedermayer <michaelni at gmx.at> writes:

> On Wed, Jan 20, 2010 at 02:48:57PM +0000, Måns Rullgård wrote:
>> Michael Niedermayer <michaelni at gmx.at> writes:
>> > On Tue, Jan 19, 2010 at 11:42:40PM -0500, Alex Converse wrote:
>> >> This causes a >50% decrease in SBR decode time.
>> >> 
>> >> For the time being it can help in the other places where
>> >> scalarproduct_float() is used.
>> >> 
>> >> Regards,
>> >> Alex Converse
>> >
>> >>  dsputil_mmx.c    |    5 +++++
>> >>  dsputil_yasm.asm |   25 +++++++++++++++++++++++++
>> >
>> > Would you mind to avoid yasm and use gcc asm instead ?
>> >
>> > I have no problem with yasm as such but gcc asm is more portable and
>> > can be integrated with C code if we ever want that.
>> I have to disagree.  Just look at how many FATE targets broke with
>> your change to h264_loop_filter_strength_mmx2 yesterday.  Several
>> compilers are still failing to build it.
> what we had is called a syntax error, yasm won't do any better
> if you make such errors, though yasm would more consistently fail I guess

There was no syntax error.  A syntax error would have had gcc say
"syntax error", which it didn't.  In fact, it compiled just fine on
x86_64, only failing mysteriously on x86_32.  David then fixed it with
gcc, leaving only icc and suncc failing.

> what we had before was too many complex memory operands, yasm does not
> support that in the first place.

Eh what?  Yasm is an assembler.  You do your own register allocation
there.  That is why it is superior, among other reasons.

> Summary, h264_loop_filter_strength_mmx2() is poorly implemented by having
> loops in C and mixed with asm that expects the compiler to figure out how
> to address complex pointers + - several indexes. That's not how gcc asm
> should be written IMHO. I don't think I wrote the original function, I just
> fixed a bug in it related to B frames, ideally one should rewrite it with
> all the loops being integrated into asm, this likely would also make it
> faster and closer to how it would look in yasm

So you want to use gcc as an assembler with the world's ugliest

> anyway it should be fixed now

Yes, thanks.

Måns Rullgård
mans at mansr.com
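
For readers without the patch at hand: the routine this thread is named after, scalarproduct_float(), computes the dot product of two float arrays. The following is only a minimal plain-C sketch of that operation, not the code from the patch and not FFmpeg's own implementation. The function name scalarproduct_float_ref and its (v1, v2, len) signature are assumptions made for illustration.

```c
/*
 * Hypothetical reference version of a float scalar (dot) product.
 * The name and signature are assumptions for illustration only.
 */
float scalarproduct_float_ref(const float *v1, const float *v2, int len)
{
    float sum = 0.0f;

    /* Accumulate the element-wise products in order. */
    for (int i = 0; i < len; i++)
        sum += v1[i] * v2[i];

    return sum;
}
```

An SSE version of such a loop would typically process four floats per iteration and may sum them in a different order, so its result can differ from this scalar loop in the last bits.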
