[FFmpeg-devel] [PATCH] avcodec/scpr: optimize shift loop.
Michael Niedermayer
michael at niedermayer.cc
Sun Sep 10 17:28:15 EEST 2017
On Sat, Sep 09, 2017 at 08:02:04PM -0300, James Almer wrote:
> On 9/9/2017 7:47 PM, Michael Niedermayer wrote:
> > On Sat, Sep 09, 2017 at 04:37:52PM -0500, Brian Matherly wrote:
> >>
> >> On 9/9/2017 1:27 PM, Michael Niedermayer wrote:
> >>> +            // If the image is sufficiently aligned, compute 8 samples at once
> >>> +            if (!(((uintptr_t)dst) & 7)) {
> >>> +                uint64_t *dst64 = (uint64_t *)dst;
> >>> +                int w = avctx->width>>1;
> >>> +                for (x = 0; x < w; x++) {
> >>> +                    dst64[x] = (dst64[x] << 3) & 0xFCFCFCFCFCFCFCFCULL;
> >>> +                }
> >>> +                x *= 8;
> >>> +            } else
> >>> +                x = 0;
> >>> +            for (; x < avctx->width * 4; x++) {
> >>>                  dst[x] = dst[x] << 3;
> >>>              }
> >>
> >> Forgive me if I'm not understanding the code correctly, but couldn't
> >> you always apply the optimization if you align the first (up to) 7
> >> samples?
> >
> > yes, that's possible, though it would be optimizing a case that
> > probably never occurs in practice.
> >
> > If people want, I can add code to handle misaligned cases?
>
> No, frame->data[0] should always be sufficiently aligned.
> I hadn't even looked at what dst pointed to, hence my original comment
> about this change potentially not having an effect in all cases.
>
> It's ok as is, no need to make it any more complex.
will apply
thanks
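
For reference, the variant Brian asked about (a scalar prologue for the
first up to 7 bytes, then the 8-bytes-at-a-time loop) could look roughly
like the sketch below. It is not part of the patch, and
shift_bytes_left3() is a hypothetical standalone helper, not an FFmpeg
function. Two deliberate differences from the patch: it goes through
memcpy() instead of casting to uint64_t *, and it masks with 0xF8 per
byte rather than 0xFC. After a 64-bit shift by 3, the low three bits of
each byte hold bits that crossed over from the neighbouring byte; 0xFC
clears only two of them, so it matches the byte-wise loop only when
bit 7 of every input byte is clear. The sketch assumes nothing about
the input range, hence 0xF8.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not an FFmpeg function): shift every byte of
 * buf left by 3, discarding the bits shifted out of each byte. */
static void shift_bytes_left3(uint8_t *buf, size_t n)
{
    size_t i = 0;

    assert(buf != NULL || n == 0);

    /* Scalar prologue: at most 7 bytes, until buf + i is 8-byte aligned. */
    while (i < n && ((uintptr_t)(buf + i) & 7)) {
        buf[i] = (uint8_t)(buf[i] << 3);
        i++;
    }

    /* Main loop: 8 bytes per iteration. memcpy() keeps the access
     * well-defined with respect to aliasing; compilers typically turn
     * it into a single load and store. */
    for (; i + 8 <= n; i += 8) {
        uint64_t v;
        memcpy(&v, buf + i, sizeof(v));
        /* 0xF8 in every byte clears the three bits that leaked in from
         * the adjacent byte during the 64-bit shift. */
        v = (v << 3) & UINT64_C(0xF8F8F8F8F8F8F8F8);
        memcpy(buf + i, &v, sizeof(v));
    }

    /* Scalar tail: whatever is left after the last full 8-byte group. */
    for (; i < n; i++)
        buf[i] = (uint8_t)(buf[i] << 3);
}
```

As James notes above, frame->data[0] should always be sufficiently
aligned, so the prologue would rarely if ever run in the decoder itself;
it only matters for a helper that may be handed arbitrary pointers.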
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
It is dangerous to be right in matters on which the established authorities
are wrong. -- Voltaire