[FFmpeg-devel] [PATCHES] lpc_mmx: merge some asm blocks and add xmm registers to clobber list

Ramiro Polla ramiro.polla
Mon Nov 1 00:37:51 CET 2010


On Sun, Oct 31, 2010 at 8:55 PM, Michael Niedermayer <michaelni at gmx.at> wrote:
> On Sat, Oct 30, 2010 at 05:46:58PM -0200, Ramiro Polla wrote:
>> $subj, split in 2 patches
>
>>  lpc_mmx.c |   19 ++++++++++---------
>>  1 file changed, 10 insertions(+), 9 deletions(-)
>> 267c781586fd0713785433848de9168783185e60  0007-lpc_mmx-merge-some-asm-blocks.patch
>> From 33992c433a04fe7d9228aafcf86b9d1cd42032cf Mon Sep 17 00:00:00 2001
>> From: Ramiro Polla <ramiro.polla at gmail.com>
>> Date: Sat, 30 Oct 2010 17:34:14 -0200
>> Subject: [PATCH 7/9] lpc_mmx: merge some asm blocks
>>
>> These blocks depended on the compiler keeping xmm registers untouched between
>> them.
>> ---
>>  libavcodec/x86/lpc_mmx.c |   19 ++++++++++---------
>>  1 files changed, 10 insertions(+), 9 deletions(-)
>>
>> diff --git a/libavcodec/x86/lpc_mmx.c b/libavcodec/x86/lpc_mmx.c
>> index 2ef5fa6..d7f188e 100644
>> --- a/libavcodec/x86/lpc_mmx.c
>> +++ b/libavcodec/x86/lpc_mmx.c
>> @@ -29,16 +29,15 @@ static void apply_welch_window_sse2(const int32_t *data, int len, double *w_data
>>      x86_reg i = -n2*sizeof(int32_t);
>>      x86_reg j =  n2*sizeof(int32_t);
>>      __asm__ volatile(
>> -        "movsd   %0,     %%xmm7                \n\t"
>> +        "movsd   %4,     %%xmm7                \n\t"
>>          "movapd  "MANGLE(ff_pd_1)", %%xmm6     \n\t"
>>          "movapd  "MANGLE(ff_pd_2)", %%xmm5     \n\t"
>>          "movlhps %%xmm7, %%xmm7                \n\t"
>>          "subpd   %%xmm5, %%xmm7                \n\t"
>>          "addsd   %%xmm6, %%xmm7                \n\t"
>> -        ::"m"(c)
>> -    );
>> +        "bt      $1,     %5                    \n\t"
>
> most likely test is faster at least on some cpus
>
> anyway, I am ok with whatever is fastest

Applied with test and jz.
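
For anyone comparing the two forms: the sketch below models, in plain C, how a bt-based bit check relates to a test-based one. It is only an illustration, not the code that was committed; the helper names (carry_after_bt, zero_after_test, check_equivalence) are made up for this example. The point to watch is that the immediates mean different things: bt takes a bit index and reports the bit in the carry flag, while test takes a mask and reports through the zero flag. So "bt $n" followed by jnc matches "test $(1 << n)" followed by jz.

```c
#include <assert.h>

/* Models "bt $n, x": the selected bit is copied into the carry flag. */
static int carry_after_bt(unsigned x, unsigned n)
{
    return (int)((x >> n) & 1u);
}

/* Models "test $mask, x": the zero flag is set when (x & mask) == 0. */
static int zero_after_test(unsigned x, unsigned mask)
{
    return (x & mask) == 0;
}

/* Hypothetical self-check (n must be below the bit width of unsigned):
 * for every x in [0, limit), "bt $n" followed by jnc skips exactly
 * when "test $(1 << n)" followed by jz skips. */
static void check_equivalence(unsigned n, unsigned limit)
{
    for (unsigned x = 0; x < limit; x++)
        assert(!carry_after_bt(x, n) == zero_after_test(x, 1u << n));
}
```

Calling check_equivalence(1, 4096) walks a small range of values and asserts that both forms take the same branch for each one.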


