[FFmpeg-devel] [flamefest-start] A little something on MMX/SSE intrinsics

Luca Barbato lu_zero
Fri Feb 29 22:16:44 CET 2008


Michael Niedermayer wrote:
>>   - benchmark the results.
> 
> Qualification should _require_ the new code to be faster than the intrinsics.
> The reason being a student failing to write asm code which beats gcc is not
> someone I want optimizing ffmpeg.
> Also anyone can within 5 minutes just run gcc -S over the code and just change
> this to asm (which would result in identical speed on the target cpu) so someone
> submitting slower code really fails IMO.

I fully agree =)

>> SoC :
>>   - split x86 from x86_64, rewrite for speed for x86_64, 
> 
> I am not in favor of brute-force splitting; first there has to be some
> plan/idea for each individual asm(), that is, "we could avoid these 3
> load/stores if we had 3 more registers". Also, for split code to be accepted
> it must be impossible to have an equally fast and reasonably clean combined
> function.

Looks nice and sane to me.

>> asm AND 
>> intrinsics, 
> 
> I am not in favor of this.
> 
> 
>> benchmark the results.
> 
> benchmark on what? one cpu? several? do we have shell accounts for idle
> machines with these cpus?

I can provide accounts on G4, Cell and G2 (Efika) machines myself, and I
could ask for shells on other subtypes I don't own.

>>   - improve the simd coverage for $arch
> 
> Too vague IMHO, we should have some list of functions to optimize at least
> as a starting point, the student could always optimize other things too ...

Sounds fine to me. If you want, I could start digging up a list.


lu

-- 

Luca Barbato
Gentoo Council Member
Gentoo/linux Gentoo/PPC
http://dev.gentoo.org/~lu_zero




