[FFmpeg-devel] [PATCH] swscale/arm: add ff_nv{12, 21}_to_{argb, rgba, abgr, bgra}_neon

Michael Niedermayer michaelni at gmx.at
Thu Nov 19 16:50:54 CET 2015

On Thu, Nov 19, 2015 at 11:48:53AM +0100, Clément Bœsch wrote:
> From: Matthieu Bouron <matthieu.bouron at stupeflix.com>
> Signed-off-by: Matthieu Bouron <matthieu.bouron at stupeflix.com>
> Signed-off-by: Clément Bœsch <clement at stupeflix.com>
> ---
> The function takes about 29ms with a 1080p source (testsrc2) on a
> cortex-a8. Though, 16ms (more than half the time!) is spent in the vst2
> call. Any suggestions on how to speed this up?
> Also, the reference code seems to cause some kind of ringing, while our
> ASM doesn't:
>   http://b.pkh.me/nv12-rgba-ref.png
>   http://b.pkh.me/nv12-rgba-neon.png

What exactly did you test here?
There are several codepaths for RGB output; one uses LUTs, and
not all of them use full-resolution chroma.

> Last, we noticed that the y_offset is scaled to 1<<9 for some reason we
> couldn't figure out. Hopefully we're doing it correctly here.

> +.macro compute_half_line dst half_y ofmt
> +    vmovl.u8            q7, \half_y                                    @ 8px of Y
> +    vdup.16             q5, r9
> +    vsub.s16            q7, q5
> +    vmull.s16           q1, d14, d0                                    @ q1 = (srcY - y_offset) * y_coeff (left)
> +    vmull.s16           q2, d15, d0                                    @ q2 = (srcY - y_offset) * y_coeff (right)

If you do something like (srcY) * y_coeff - y_offset2 instead,
then you could keep a bit more precision in the requested brightness.
OTOH, maybe you want to be bit-exact with some existing codepath.
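
To illustrate the precision point, here is a rough C sketch of the two orderings. It is not taken from the patch or from swscale: the function names, the OFFSET_FRAC_BITS constant, and the assumption that y_offset arrives with 9 fractional bits (going by the "scaled to 1<<9" remark above) are all hypothetical. The idea is only that subtracting a truncated integer offset before the multiply drops the fractional part of the offset, while subtracting a pre-scaled offset after the multiply keeps it.

```c
#include <stdint.h>

/* Assumption for this sketch: y_offset carries 9 fractional bits. */
#define OFFSET_FRAC_BITS 9

/* Variant A: truncate the offset to an integer luma level, subtract it
 * from Y, then multiply. The fractional part of the offset is lost. */
static int32_t luma_presub(uint8_t y, int32_t y_offset_q9, int32_t y_coeff)
{
    int32_t off_int = y_offset_q9 >> OFFSET_FRAC_BITS;
    return ((int32_t)y - off_int) * y_coeff;
}

/* Variant B: multiply Y by the coefficient first, then subtract an
 * offset that was scaled by the same coefficient ahead of time
 * (y_offset2). The fractional part of the offset still contributes. */
static int32_t luma_postsub(uint8_t y, int32_t y_offset_q9, int32_t y_coeff)
{
    int32_t y_offset2 =
        (int32_t)(((int64_t)y_offset_q9 * y_coeff) >> OFFSET_FRAC_BITS);
    return (int32_t)y * y_coeff - y_offset2;
}
```

When the offset has no fractional part, both variants give the same result; they only diverge once the requested brightness moves the offset off an integer luma level. In a real implementation y_offset2 would be computed once per call, outside the pixel loop.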

Either way, your patch passes FATE with ARM qemu here, so I have
no objections if you also tested it and it works,
but maybe others have more comments about the asm ...

Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

The misfortune of the wise is better than the prosperity of the fool.
-- Epicurus