[FFmpeg-devel] [RFC] SSE3/4 implementation of flac_encode_residual_lpc
Jason Garrett-Glaser
darkshikari
Sat May 23 13:00:59 CEST 2009
On Fri, May 22, 2009 at 11:40 PM, Bobby Bingham <uhmmmm at gmail.com> wrote:
> On Sun, 3 May 2009 21:21:19 -0700
> Jason Garrett-Glaser <darkshikari at gmail.com> wrote:
>> > "phaddd     %%xmm1, %%xmm0          \n\t"
>> > "phaddd     %%xmm3, %%xmm2          \n\t"
>> > "phaddd     %%xmm2, %%xmm0          \n\t"   // xmm0 = [p0, p1, p2, p3]
>>
>> Did you not find a better way of doing this without PHADD, given how
>> slow it is?
>
> The best I've come up with so far is this, but I can't compare the
> speed:
>
> "movdqa     %%xmm0, %%xmm4          \n\t"
> "movdqa     %%xmm2, %%xmm5          \n\t"
> "punpckldq  %%xmm1, %%xmm0          \n\t"
> "punpckhdq  %%xmm1, %%xmm4          \n\t"
> "punpckldq  %%xmm3, %%xmm2          \n\t"
> "punpckhdq  %%xmm3, %%xmm5          \n\t"
> "paddd      %%xmm4, %%xmm0          \n\t"
> "paddd      %%xmm5, %%xmm2          \n\t"
> "movdqa     %%xmm0, %%xmm1          \n\t"
> "punpcklqdq %%xmm2, %%xmm0          \n\t"
> "punpckhqdq %%xmm2, %%xmm1          \n\t"
> "paddd      %%xmm1, %%xmm0          \n\t"
You really should not be writing assembly without a system to test it on.
Various people have shell accounts they can loan you--for example,
checkers on #x264 can give out shell accounts on Penryn-based Linux
systems.
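
Until real hardware is available, a scalar model can at least confirm that the two quoted sequences compute the same thing, although it says nothing about speed. The sketch below is an illustration only and is not part of the patch: the `xmm` struct and the helpers `hsum_phaddd` and `hsum_unpack` are hypothetical names, each helper mirrors one quoted instruction sequence in AT&T operand order (source first, destination second), and it assumes xmm0..xmm3 each hold four 32-bit partial sums on entry.

```c
#include <stdint.h>

/* Scalar model of one XMM register: four 32-bit lanes, lane 0 lowest.
 * Unsigned lanes give the same wraparound as paddd/phaddd. */
typedef struct { uint32_t d[4]; } xmm;

/* Each helper models "op %src, %dst" and updates *dst. */
static void phaddd(const xmm *src, xmm *dst) {
    xmm r = {{ dst->d[0] + dst->d[1], dst->d[2] + dst->d[3],
               src->d[0] + src->d[1], src->d[2] + src->d[3] }};
    *dst = r;
}
static void punpckldq(const xmm *src, xmm *dst) {
    xmm r = {{ dst->d[0], src->d[0], dst->d[1], src->d[1] }};
    *dst = r;
}
static void punpckhdq(const xmm *src, xmm *dst) {
    xmm r = {{ dst->d[2], src->d[2], dst->d[3], src->d[3] }};
    *dst = r;
}
static void punpcklqdq(const xmm *src, xmm *dst) {
    xmm r = {{ dst->d[0], dst->d[1], src->d[0], src->d[1] }};
    *dst = r;
}
static void punpckhqdq(const xmm *src, xmm *dst) {
    xmm r = {{ dst->d[2], dst->d[3], src->d[2], src->d[3] }};
    *dst = r;
}
static void paddd(const xmm *src, xmm *dst) {
    for (int i = 0; i < 4; i++)
        dst->d[i] += src->d[i];
}

/* The original three-phaddd sequence. */
static xmm hsum_phaddd(const xmm in[4]) {
    xmm x0 = in[0], x1 = in[1], x2 = in[2], x3 = in[3];
    phaddd(&x1, &x0);
    phaddd(&x3, &x2);
    phaddd(&x2, &x0);
    return x0;
}

/* The proposed unpack-and-add replacement, instruction for instruction. */
static xmm hsum_unpack(const xmm in[4]) {
    xmm x0 = in[0], x1 = in[1], x2 = in[2], x3 = in[3];
    xmm x4 = x0;              /* movdqa     %xmm0, %xmm4 */
    xmm x5 = x2;              /* movdqa     %xmm2, %xmm5 */
    punpckldq(&x1, &x0);
    punpckhdq(&x1, &x4);
    punpckldq(&x3, &x2);
    punpckhdq(&x3, &x5);
    paddd(&x4, &x0);          /* x0 = [a0+a2, b0+b2, a1+a3, b1+b3] */
    paddd(&x5, &x2);          /* x2 = [c0+c2, d0+d2, c1+c3, d1+d3] */
    x1 = x0;                  /* movdqa     %xmm0, %xmm1 */
    punpcklqdq(&x2, &x0);
    punpckhqdq(&x2, &x1);
    paddd(&x1, &x0);          /* x0 = [sum a, sum b, sum c, sum d] */
    return x0;
}
```

Calling both helpers on the same four input registers and comparing the results lane by lane checks the equivalence. Any timing comparison still has to come from running the real inline assembly on the target CPU.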
Dark Shikari