[FFmpeg-devel] [PATCH v2 2/4] swscale/x86: add sse4 {lum, chr}ConvertRange

Ramiro Polla ramiro.polla at gmail.com
Wed Jun 12 17:54:49 EEST 2024


Hi,

On Tue, Jun 11, 2024 at 8:42 PM James Almer <jamrial at gmail.com> wrote:
>
> On 6/11/2024 3:26 PM, Michael Niedermayer wrote:
> > On Tue, Jun 11, 2024 at 02:28:56PM +0200, Ramiro Polla wrote:
> >> chrRangeFromJpeg_8_c: 28.7
> >> chrRangeFromJpeg_8_sse4: 16.2
> >> chrRangeFromJpeg_24_c: 152.7
> >> chrRangeFromJpeg_24_sse4: 29.7
> >> chrRangeFromJpeg_128_c: 366.5
> >> chrRangeFromJpeg_128_sse4: 233.0
> >> chrRangeFromJpeg_144_c: 408.0
> >> chrRangeFromJpeg_144_sse4: 182.5
> >> chrRangeFromJpeg_256_c: 698.7
> >> chrRangeFromJpeg_256_sse4: 325.5
> >> chrRangeFromJpeg_512_c: 1348.7
> >> chrRangeFromJpeg_512_sse4: 660.2
> >> chrRangeToJpeg_8_c: 37.7
> >> chrRangeToJpeg_8_sse4: 16.2
> >> chrRangeToJpeg_24_c: 115.7
> >> chrRangeToJpeg_24_sse4: 36.2
> >> chrRangeToJpeg_128_c: 631.2
> >> chrRangeToJpeg_128_sse4: 163.7
> >> chrRangeToJpeg_144_c: 710.7
> >> chrRangeToJpeg_144_sse4: 183.0
> >> chrRangeToJpeg_256_c: 1253.0
> >> chrRangeToJpeg_256_sse4: 343.5
> >> chrRangeToJpeg_512_c: 2491.2
> >> chrRangeToJpeg_512_sse4: 654.2
> >> lumRangeFromJpeg_8_c: 11.7
> >> lumRangeFromJpeg_8_sse4: 10.5
> >> lumRangeFromJpeg_24_c: 38.5
> >> lumRangeFromJpeg_24_sse4: 19.0
> >> lumRangeFromJpeg_128_c: 237.5
> >> lumRangeFromJpeg_128_sse4: 79.2
> >> lumRangeFromJpeg_144_c: 255.7
> >> lumRangeFromJpeg_144_sse4: 90.5
> >> lumRangeFromJpeg_256_c: 441.5
> >> lumRangeFromJpeg_256_sse4: 161.7
> >> lumRangeFromJpeg_512_c: 879.0
> >> lumRangeFromJpeg_512_sse4: 333.2
> >> lumRangeToJpeg_8_c: 20.0
> >> lumRangeToJpeg_8_sse4: 11.7
> >> lumRangeToJpeg_24_c: 61.5
> >> lumRangeToJpeg_24_sse4: 17.7
> >> lumRangeToJpeg_128_c: 357.5
> >> lumRangeToJpeg_128_sse4: 80.0
> >> lumRangeToJpeg_144_c: 371.5
> >> lumRangeToJpeg_144_sse4: 93.2
> >> lumRangeToJpeg_256_c: 651.5
> >> lumRangeToJpeg_256_sse4: 164.5
> >> lumRangeToJpeg_512_c: 1279.0
> >> lumRangeToJpeg_512_sse4: 333.7
> >> ---
> >>   libswscale/swscale_internal.h    |   1 +
> >>   libswscale/utils.c               |   2 +
> >>   libswscale/x86/Makefile          |   1 +
> >>   libswscale/x86/range_convert.asm | 130 +++++++++++++++++++++++++++++++
> >>   libswscale/x86/swscale.c         |  36 +++++++++
> >>   5 files changed, 170 insertions(+)
> >>   create mode 100644 libswscale/x86/range_convert.asm
> >
> > breaks x86-32 build
> >
> > LD    ffmpeg_g
> > /usr/lib/gcc-cross/i686-linux-gnu/7/../../../../i686-linux-gnu/bin/ld: libswscale/libswscale.a(utils.o): in function `sws_setColorspaceDetails':
> > ffmpeg/linux32/src/libswscale/utils.c:1086: undefined reference to `ff_sws_init_range_convert_x86'
> > collect2: error: ld returned 1 exit status
> > make: *** [Makefile:139: ffmpeg_g] Error 1
> >
> > thx
>
> The functions are wrapped in ARCH_X86_64 checks for seemingly no reason,
> so they should be removed in the next iteration.

Fixed.
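
For anyone hitting the same link error, here is a minimal sketch of the
guard layout that avoids it. It assumes the caller is compiled on every x86
build while the definition was only compiled on x86-64. All names here
(DemoContext, demo_init_range_convert*) are hypothetical stand-ins and not
the actual libswscale code, and the ARCH_* macros are defined locally only to
mimic what config.h would provide.

```c
/* Hypothetical stand-ins for the build-time macros from config.h. */
#ifndef ARCH_X86
#define ARCH_X86 1
#endif
#ifndef ARCH_X86_64
#define ARCH_X86_64 0 /* pretend this is a 32-bit x86 build */
#endif

/* Hypothetical context type, only for illustration. */
typedef struct DemoContext {
    int range_convert_ready;
} DemoContext;

/*
 * x86 side. Wrapping this definition in "#if ARCH_X86_64" would leave the
 * symbol undefined on x86-32 while the caller below still references it,
 * which is the kind of mismatch that produces an "undefined reference"
 * at link time. Leaving it unguarded keeps both sides in agreement.
 */
void demo_init_range_convert_x86(DemoContext *c)
{
    c->range_convert_ready = 1;
}

/* Generic side: the call is guarded by the same condition under which
 * the definition above is built. */
void demo_init_range_convert(DemoContext *c)
{
    c->range_convert_ready = 0;
#if ARCH_X86
    demo_init_range_convert_x86(c);
#endif
}
```

The point is only that the guard around the call and the guard around the
definition have to match.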

James walked me through optimizing and improving the functions on IRC,
so that they now work with both sse2 and avx2. New patch attached.
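
To put the numbers quoted above in perspective, the following sketch
computes the C-to-sse4 ratio for the width-512 rows. The figures are copied
from the quoted v2 benchmark (assuming they are timings, so lower is better);
they do not describe the new sse2/avx2 patch. The BenchRow type and the
bench_* helpers are hypothetical names made up for this example.

```c
#include <stdio.h>

/* One benchmark row: function name, C timing, sse4 timing. */
typedef struct BenchRow {
    const char *name;
    double c;
    double sse4;
} BenchRow;

/* Width-512 rows copied from the quoted v2 benchmark output. */
const BenchRow bench_rows[] = {
    { "chrRangeFromJpeg_512", 1348.7, 660.2 },
    { "chrRangeToJpeg_512",   2491.2, 654.2 },
    { "lumRangeFromJpeg_512",  879.0, 333.2 },
    { "lumRangeToJpeg_512",   1279.0, 333.7 },
};

const int bench_row_count = sizeof(bench_rows) / sizeof(bench_rows[0]);

/* Speedup of the sse4 version over the C version (higher is better). */
double bench_speedup(const BenchRow *row)
{
    return row->c / row->sse4;
}

/* Print one line per row with the ratio rounded to two decimals. */
void bench_print(void)
{
    for (int i = 0; i < bench_row_count; i++)
        printf("%-22s %.2fx\n", bench_rows[i].name,
               bench_speedup(&bench_rows[i]));
}
```

Calling bench_print() lists the ratio per function. On these rows the
ToJpeg variants gain the most, at a bit under 4x, while chrRangeFromJpeg
gains about 2x.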
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-swscale-x86-add-sse2-and-avx2-lum-chr-ConvertRange.patch
Type: text/x-patch
Size: 11420 bytes
Desc: not available
URL: <https://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20240612/15ed2d09/attachment.bin>

