[FFmpeg-devel] [PATCH] Altivec version of h264_idct_add
Sun Jun 3 12:27:27 CEST 2007
David Conrad wrote:
> On Jun 3, 2007, at 12:38 AM, Luca Barbato wrote:
>> David Conrad wrote:
>>> On Jun 2, 2007, at 10:15 PM, Luca Barbato wrote:
>>>> Loren Merritt wrote:
>>>>> The switch could be changed to a table if it matters.
>>>> In theory vec_ste is all we need here; sadly, I cannot manage to get it
>>>> working right for the unaligned cases.
>>> I've never really looked at vec_ste before today, but it seems that
>>> vec_ste will always write the first element of the vector to the
>>> rounded-down 16-byte address, and to store to an unaligned address you
>>> have to move the data in the vector and store that element. The attached
>>> patch does this with a permute and uses it instead of the switch. It
>>> requires an additional 4 permutes and a constant vector in the aligned case,
>>> but it seems to be a bit faster overall on my G4.
>> vec_splat() should be enough (another perm spared)
> Like so?
Right =), does it help a bit?
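
For anyone following along without a PPC box, here is a rough scalar model of
why the splat is enough. It is only a sketch, not the code from the patch:
emu_vec_u32, emu_vec_ste, emu_vec_splat and emu_store_lane0 are made-up names,
and the model assumes 32-bit elements, with vec_ste rounding the address down
to 4 bytes and storing whichever lane bits 2..3 of that address select. mem[0]
stands in for a 16-byte-aligned address.

```c
#include <stdint.h>
#include <string.h>

/* Scalar stand-in for a 128-bit AltiVec register with four 32-bit lanes.
 * Hypothetical type, for illustration only. */
typedef struct { uint32_t lane[4]; } emu_vec_u32;

/* Model of vec_ste for 32-bit elements (assumed semantics):
 * the address is rounded down to a multiple of 4, and the lane that gets
 * stored is the one picked by bits 2..3 of that address, not always lane 0. */
static void emu_vec_ste(emu_vec_u32 v, uint8_t *mem, size_t addr)
{
    size_t ea   = addr & ~(size_t)3;   /* effective address, 4-byte aligned */
    size_t lane = (ea >> 2) & 3;       /* lane selected by the address */
    memcpy(mem + ea, &v.lane[lane], sizeof(uint32_t));
}

/* Model of vec_splat: copy lane idx into all four lanes. */
static emu_vec_u32 emu_vec_splat(emu_vec_u32 v, unsigned idx)
{
    emu_vec_u32 r;
    for (int i = 0; i < 4; i++)
        r.lane[i] = v.lane[idx & 3];
    return r;
}

/* Store lane 0 of v to any 4-byte slot: after the splat every lane holds
 * the wanted value, so it no longer matters which lane vec_ste selects. */
static void emu_store_lane0(emu_vec_u32 v, uint8_t *mem, size_t addr)
{
    emu_vec_ste(emu_vec_splat(v, 0), mem, addr);
}
```

With lanes {A, B, C, D}, emu_vec_ste at offset 4 writes B, while
emu_store_lane0 at offset 4 (or 20, or any other slot) writes A. The splat
replaces the address-dependent permute, which is where the saved perm comes
from.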