[FFmpeg-devel] [PATCH] avcodec/scpr: optimize shift loop.

James Almer jamrial at gmail.com
Sun Sep 10 02:02:04 EEST 2017


On 9/9/2017 7:47 PM, Michael Niedermayer wrote:
> On Sat, Sep 09, 2017 at 04:37:52PM -0500, Brian Matherly wrote:
>>
>> On 9/9/2017 1:27 PM, Michael Niedermayer wrote:
>>> +            // If the image is sufficiently aligned, compute 8 samples at once
>>> +            if (!(((uintptr_t)dst) & 7)) {
>>> +                uint64_t *dst64 = (uint64_t *)dst;
>>> +                int w = avctx->width>>1;
>>> +                for (x = 0; x < w; x++) {
>>> +                    dst64[x] = (dst64[x] << 3) & 0xFCFCFCFCFCFCFCFCULL;
>>> +                }
>>> +                x *= 8;
>>> +            } else
>>> +                x = 0;
>>> +            for (; x < avctx->width * 4; x++) {
>>>                  dst[x] = dst[x] << 3;
>>>              }
>>
>> Forgive me if I'm not understanding the code correctly, but couldn't
>> you always apply the optimization if you align the first (up to) 7
>> samples?
> 
> yes, that's possible, but it would be optimizing a case that probably
> never occurs in practice.
> 
> If people want, I can add code to handle misaligned cases?

No, frame->data[0] should always be sufficiently aligned.
I hadn't even looked at what dst pointed to, hence my original comment
about this change potentially not having an effect in all cases.

It's ok as is, no need to make it any more complex.
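
For reference only, the misaligned handling Brian describes could look
roughly like the sketch below. It is not part of the patch and not
FFmpeg code: shift_left3() is a hypothetical standalone helper working
on a plain byte buffer. It uses memcpy() for the 64-bit loads and stores
instead of the pointer cast in the patch, and the mask
0xF8F8F8F8F8F8F8F8, which matches the scalar loop for arbitrary byte
values. The patch's 0xFC... mask gives the same result only while bit 7
of the input bytes is clear, which I assume holds for this decoder's
data.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not FFmpeg API: shift every byte of buf left by 3. */
static void shift_left3(uint8_t *buf, size_t len)
{
    size_t x = 0;

    /* Head: scalar until buf + x is 8-byte aligned (at most 7 bytes). */
    while (x < len && ((uintptr_t)(buf + x) & 7)) {
        buf[x] = (uint8_t)(buf[x] << 3);
        x++;
    }

    /* Body: 8 bytes per iteration. The mask clears the low 3 bits of
     * each byte, i.e. the bits shifted in from the neighbouring byte. */
    for (; len - x >= 8; x += 8) {
        uint64_t v;
        memcpy(&v, buf + x, 8);   /* compilers lower this to one load */
        v = (v << 3) & 0xF8F8F8F8F8F8F8F8ULL;
        memcpy(buf + x, &v, 8);
    }

    /* Tail: remaining 0..7 bytes, scalar again. */
    for (; x < len; x++)
        buf[x] = (uint8_t)(buf[x] << 3);
}
```

Called as shift_left3(dst, avctx->width * 4), this would cover the same
row as the loop in the patch, at the cost of the extra head loop that
the aligned frame->data[0] case never needs.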

> 
> thx
> 
> [...]


