[FFmpeg-devel] [PATCH] SSE3/4 implementation of flac_encode_residual_lpc
Thu Jun 18 13:51:00 CEST 2009
On Sat, May 30, 2009 at 09:30:28PM +0000, Loren Merritt wrote:
> On Sat, 30 May 2009, Bobby Bingham wrote:
>> On Fri, 29 May 2009, Loren Merritt wrote:
>>> For the remainder, this logic should be doable
>>> with just 1 paddd and 1 por per vector. Merge several vectors before
>> I'm afraid I don't quite see what you mean by using 1 paddd and 1 por.
>> The attached patch does have a slight improvement in this piece of
>> code, but I doubt it's what you meant.
> The C version is:
> (unsigned)(x+0x8000) >= 0x10000
> And to merge several entries before the branch:
> (unsigned)((x+0x8000) | (x+0x8000) | ...) >= 0x10000
> Or since sse doesn't have an uint32 compare:
> (((x+0x8000) | (x+0x8000) | ...) >> 16) != 0
> This won't be much if any faster than yours when testing one vector at a
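
For reference, the range check quoted above can be sketched in scalar C. This is only an illustration of the arithmetic, not the code from the patch: the function names are hypothetical, and the bias is done in uint32_t so that the addition wraps instead of overflowing a signed int.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if x lies outside [-32768, 32767].
 * Adding the 0x8000 bias maps the valid range onto [0, 0xFFFF]. */
static int out_of_s16_range(int32_t x)
{
    return (uint32_t)x + 0x8000u >= 0x10000u;
}

/* Hypothetical helper: the merged form. OR the biased values together
 * and test the upper 16 bits once, which stands in for the single
 * paddd + por per vector followed by one branch. */
static int any_out_of_s16_range(const int32_t *v, size_t n)
{
    uint32_t acc = 0;
    for (size_t i = 0; i < n; i++)
        acc |= (uint32_t)v[i] + 0x8000u;
    return (acc >> 16) != 0;
}
```

The merge works because an in-range value has all of its upper 16 bits clear after biasing, so any set upper bit in the OR means at least one entry was out of range.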
What's the status of this patch?
Waiting for changes?
OK to commit?
Want me to review it?
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
I have never wished to cater to the crowd; for what I know they do not
approve, and what they approve I do not know. -- Epicurus