[FFmpeg-devel] [PATCH v2 6/9] swscale/arm/yuv2rgb: macro-ify

Benoit Fouet benoit.fouet at free.fr
Thu Mar 31 10:48:10 CEST 2016


Hi,

(sorry for the first mail, fuzzy fingers...)

On 28/03/2016 21:19, Matthieu Bouron wrote:
> ---
>   libswscale/arm/yuv2rgb_neon.S | 137 ++++++++++++++++++------------------------
>   1 file changed, 60 insertions(+), 77 deletions(-)
>
> diff --git a/libswscale/arm/yuv2rgb_neon.S b/libswscale/arm/yuv2rgb_neon.S
> index ef7b0a6..e1b68c1 100644
> --- a/libswscale/arm/yuv2rgb_neon.S
> +++ b/libswscale/arm/yuv2rgb_neon.S
> @@ -64,7 +64,7 @@
>       vmov.u8             \a2, #255
>   .endm
>   
> -.macro compute_16px dst y0 y1 ofmt
> +.macro compute dst y0 y1 ofmt
>       vmovl.u8            q14, \y0                                       @ 8px of y
>       vmovl.u8            q15, \y1                                       @ 8px of y
>   
> @@ -99,23 +99,23 @@
>   
>   .endm
>   
> -.macro process_1l_16px ofmt
> +.macro process_1l ofmt
>       compute_premult     d28, d29, d30, d31
>       vld1.8              {q7}, [r4]!
> -    compute_16px        r2, d14, d15, \ofmt
> +    compute             r2, d14, d15, \ofmt
>   .endm
>   
> -.macro process_2l_16px ofmt
> +.macro process_2l ofmt
>       compute_premult     d28, d29, d30, d31
>   
>       vld1.8              {q7}, [r4]!                                    @ first line of luma
> -    compute_16px        r2, d14, d15, \ofmt
> +    compute             r2, d14, d15, \ofmt
>   
>       vld1.8              {q7}, [r12]!                                   @ second line of luma
> -    compute_16px        r11, d14, d15, \ofmt
> +    compute             r11, d14, d15, \ofmt
>   .endm
>   

This renaming could be split into a separate patch.

[...]

> @@ -232,68 +204,79 @@ function ff_\ifmt\()_to_\ofmt\()_neon, export=1
>       vld1.8              d3, [r10]!                                     @ d3: chroma blue line
>       vsubl.u8            q14, d2, d10                                   @ q14 = U - 128
>       vsubl.u8            q15, d3, d10                                   @ q15 = V - 128
> +.endm
>   
> -    process_2l_16px \ofmt
> -.endif
> -
> -.ifc \ifmt,yuv422p
> +.macro load_chroma_yuv422p
>       pld [r10, #64*3]
>   
>       vld1.8              d2, [r6]!                                      @ d2: chroma red line
>       vld1.8              d3, [r10]!                                     @ d3: chroma blue line
>       vsubl.u8            q14, d2, d10                                   @ q14 = U - 128
>       vsubl.u8            q15, d3, d10                                   @ q15 = V - 128
> +.endm
>   
> -    process_1l_16px \ofmt
> -.endif
> -
> -    subs                r8, r8, #16                                    @ width -= 16
> -    bgt                 2b
> -
> -    add                 r2, r2, r3                                     @ dst   += padding
> -    add                 r4, r4, r5                                     @ srcY  += paddingY
> -
> -.ifc \ifmt,nv12
> +.macro increment_nv12

How about increment_and_test_nv12? Same for the other ones.
(I'm not happy with the name I found, but I'm trying to come up with
more explicit naming.)

-- 
Ben


