[FFmpeg-devel] [PATCH 2/2] swscale/x86/input: add AVX2 optimized uyvytoyuv422

Thu Jun 6 10:01:24 EEST 2024

On Thu, Jun 6, 2024 at 08:11, Rémi Denis-Courmont <remi at remlab.net> wrote:
> >James Almer:
> >> uyvytoyuv422_c: 23991.8
> >> uyvytoyuv422_sse2: 2817.8
> >> uyvytoyuv422_avx: 2819.3
> >
> >Why don't you nuke the avx version in a follow-up patch?
>
> Same problem with the RGBA stuff as well. Are the AVX functions expected to be faster than SSE2 on processors *without* AVX2?

A recurring problem with this type of question is that the numbers come
from a CPU that has had 10 years of architectural improvements (and
probably a doubling in throughput for any instruction set) over one
that supported at most AVX. An AVX function (whose only benefit is
3-operand instructions, so admittedly small) should ideally be
benchmarked only on that kind of CPU.
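
As a rough sketch of that criterion (not part of the patch): before
trusting AVX-vs-SSE2 numbers, one could check that the benchmark machine
supports AVX but not AVX2. The helper names below are hypothetical, and
the detection relies on the GCC/Clang x86 builtin
__builtin_cpu_supports(), so it is an assumption about the toolchain.

```c
#include <stdbool.h>

/* Hypothetical helper: true only for the class of CPU the AVX function
 * targets, i.e. one that supports AVX but not AVX2. */
static bool avx_bench_is_representative(bool has_avx, bool has_avx2)
{
    return has_avx && !has_avx2;
}

/* Hypothetical helper: query the host CPU. Uses GCC/Clang builtins on
 * x86; on any other compiler or architecture it reports no AVX at all. */
static void detect_avx(bool *has_avx, bool *has_avx2)
{
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    *has_avx  = __builtin_cpu_supports("avx")  != 0;
    *has_avx2 = __builtin_cpu_supports("avx2") != 0;
#else
    *has_avx  = false;
    *has_avx2 = false;
#endif
}
```

Calling detect_avx() and passing the two results to
avx_bench_is_representative() tells whether the host is an AVX-only
machine; on an AVX2-capable CPU it returns false, which is the case
where the sse2 and avx timings quoted above say little about the
hardware the AVX function was written for.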

Case in point: back then, even x264 introduced AVX versions, so
there were CPU generations on which it was indeed faster:
https://code.videolan.org/search?search=INIT_XMM%20avx&nav_source=navbar&project_id=536&group_id=9&search_code=true&repository_ref=master
https://code.videolan.org/videolan/x264/-/commit/abc2283e9abc6254744bf6dd148ac25433cdf80e

But I understand the point: maintaining code for a minor
improvement on a few CPUs, which are maybe 1% of the userbase, is not
appealing.