[FFmpeg-devel] [PATCH v4 4/8] swscale/range_convert: fix mpeg ranges in yuv range conversion for non-8-bit pixel formats
Michael Niedermayer
michael at niedermayer.cc
Tue Dec 3 04:35:30 EET 2024
Hi
On Sun, Dec 01, 2024 at 07:20:06PM +0100, Ramiro Polla wrote:
> There is an issue with the constants used in YUV to YUV range conversion,
> where the upper bound is not respected when converting to mpeg range.
>
> With this commit, the constants are calculated at runtime, depending on
> the bit depth. This approach also allows us to more easily understand how
> the constants are derived.
>
> For bit depths <= 14, the number of fixed point bits has been set to 14
> for all conversions, to simplify the code.
> For bit depths > 14, the number of fixed point bits has been raised and
> set to 18, to allow for the conversion to be accurate enough for the mpeg
> range to be respected.
>
> The convert functions now take the conversion constants (coeff and offset)
> as function arguments.
> For bit depths <= 14, coeff is unsigned 16-bit and offset is 32-bit.
> For bit depths > 14, coeff is unsigned 32-bit and offset is 64-bit.
>
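
For readers following along, here is a minimal sketch of how constants of this kind could be derived at runtime from the bit depth. It is not the code from the patch: the struct and function names (RangeCoeffs, luma_to_mpeg_coeffs, luma_to_mpeg, default_bits) are hypothetical, it covers only the luma full-range to mpeg-range direction, it works on plain sample values rather than swscale's intermediate format, and it uses 64-bit integers everywhere instead of the 16/32-bit and 32/64-bit types described above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the two constants; not a swscale struct. */
typedef struct RangeCoeffs {
    uint64_t coeff;   /* scale factor, fixed point */
    uint64_t offset;  /* output offset, fixed point, rounding term included */
    int      bits;    /* number of fractional (fixed point) bits */
} RangeCoeffs;

/* Fixed point precision as described in the commit message:
 * 14 bits for depths <= 14, 18 bits for deeper formats. */
static int default_bits(int depth)
{
    return depth <= 14 ? 14 : 18;
}

/* Derive full-range -> mpeg-range luma constants for a given bit depth.
 * Assumption: mpeg luma range is [16, 235] scaled by 2^(depth - 8). */
static RangeCoeffs luma_to_mpeg_coeffs(int depth, int bits)
{
    assert(depth >= 8 && depth <= 16 && bits > 0 && bits <= 32);

    const uint64_t full_max = (UINT64_C(1) << depth) - 1;
    const uint64_t mpeg_min = UINT64_C(16)  << (depth - 8);
    const uint64_t mpeg_max = UINT64_C(235) << (depth - 8);
    RangeCoeffs c;

    c.bits   = bits;
    /* round((mpeg_max - mpeg_min) * 2^bits / full_max) */
    c.coeff  = (((mpeg_max - mpeg_min) << bits) + full_max / 2) / full_max;
    /* mpeg_min in fixed point, plus 0.5 so the final shift rounds */
    c.offset = (mpeg_min << bits) + (UINT64_C(1) << (bits - 1));
    return c;
}

/* Apply the conversion to one sample. */
static uint64_t luma_to_mpeg(uint64_t v, RangeCoeffs c)
{
    return (v * c.coeff + c.offset) >> c.bits;
}
```

In this sketch, luma_to_mpeg(1023, luma_to_mpeg_coeffs(10, default_bits(10))) lands exactly on the 10-bit mpeg upper bound of 940. Forcing 14 fixed point bits at depth 16 makes the maximum input fall one step short of 60160, which is consistent with the stated need for more fixed point bits above 14-bit depth.
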
> x86_64:
> chrRangeFromJpeg8_1920_c: 2127.4 2125.0 (1.00x)
> chrRangeFromJpeg16_1920_c: 2325.2 2127.2 (1.09x)
> chrRangeToJpeg8_1920_c: 3166.9 3168.7 (1.00x)
> chrRangeToJpeg16_1920_c: 2152.4 3164.8 (0.68x)
> lumRangeFromJpeg8_1920_c: 1263.0 1302.5 (0.97x)
> lumRangeFromJpeg16_1920_c: 1080.5 1299.2 (0.83x)
> lumRangeToJpeg8_1920_c: 1886.8 2112.2 (0.89x)
> lumRangeToJpeg16_1920_c: 1077.0 1906.5 (0.56x)
>
> aarch64 A55:
> chrRangeFromJpeg8_1920_c: 28835.2 28835.6 (1.00x)
> chrRangeFromJpeg16_1920_c: 28839.8 32680.8 (0.88x)
> chrRangeToJpeg8_1920_c: 23074.7 23075.4 (1.00x)
> chrRangeToJpeg16_1920_c: 17318.9 24996.0 (0.69x)
> lumRangeFromJpeg8_1920_c: 15389.7 15384.5 (1.00x)
> lumRangeFromJpeg16_1920_c: 15388.2 17306.7 (0.89x)
> lumRangeToJpeg8_1920_c: 19227.8 19226.6 (1.00x)
> lumRangeToJpeg16_1920_c: 15387.0 21146.3 (0.73x)
>
> aarch64 A76:
> chrRangeFromJpeg8_1920_c: 6324.4 6268.1 (1.01x)
> chrRangeFromJpeg16_1920_c: 6339.9 11521.5 (0.55x)
> chrRangeToJpeg8_1920_c: 9656.0 9612.8 (1.00x)
> chrRangeToJpeg16_1920_c: 6340.4 11651.8 (0.54x)
> lumRangeFromJpeg8_1920_c: 4422.0 4420.8 (1.00x)
> lumRangeFromJpeg16_1920_c: 4420.9 5762.0 (0.77x)
> lumRangeToJpeg8_1920_c: 5949.1 5977.5 (1.00x)
> lumRangeToJpeg16_1920_c: 4446.8 5946.2 (0.75x)
>
> NOTE: all SIMD optimizations for range_convert have been disabled.
> They will be re-enabled when they are fixed for each architecture.
>
> NOTE2: the same issue still exists in rgb2yuv conversions, which is not
> addressed in this commit.
> ---
> libswscale/aarch64/swscale.c | 5 +
> libswscale/hscale.c | 6 +-
> libswscale/swscale.c | 113 +++++++++--
> libswscale/swscale_internal.h | 26 ++-
> libswscale/x86/swscale.c | 5 +
> tests/checkasm/sw_range_convert.c | 68 ++++++-
> .../fate/filter-alphaextract_alphamerge_rgb | 100 +++++-----
> tests/ref/fate/filter-pixdesc-gray10be | 2 +-
> tests/ref/fate/filter-pixdesc-gray10le | 2 +-
> tests/ref/fate/filter-pixdesc-gray12be | 2 +-
> tests/ref/fate/filter-pixdesc-gray12le | 2 +-
> tests/ref/fate/filter-pixdesc-gray14be | 2 +-
> tests/ref/fate/filter-pixdesc-gray14le | 2 +-
> tests/ref/fate/filter-pixdesc-gray16be | 2 +-
> tests/ref/fate/filter-pixdesc-gray16le | 2 +-
> tests/ref/fate/filter-pixdesc-gray9be | 2 +-
> tests/ref/fate/filter-pixdesc-gray9le | 2 +-
> tests/ref/fate/filter-pixdesc-ya16be | 2 +-
> tests/ref/fate/filter-pixdesc-ya16le | 2 +-
> tests/ref/fate/filter-pixdesc-yuvj411p | 2 +-
> tests/ref/fate/filter-pixdesc-yuvj420p | 2 +-
> tests/ref/fate/filter-pixdesc-yuvj422p | 2 +-
> tests/ref/fate/filter-pixdesc-yuvj440p | 2 +-
> tests/ref/fate/filter-pixdesc-yuvj444p | 2 +-
> tests/ref/fate/filter-pixfmts-copy | 34 ++--
> tests/ref/fate/filter-pixfmts-crop | 34 ++--
> tests/ref/fate/filter-pixfmts-field | 34 ++--
> tests/ref/fate/filter-pixfmts-fieldorder | 30 +--
> tests/ref/fate/filter-pixfmts-hflip | 34 ++--
> tests/ref/fate/filter-pixfmts-il | 34 ++--
> tests/ref/fate/filter-pixfmts-lut | 18 +-
> tests/ref/fate/filter-pixfmts-null | 34 ++--
> tests/ref/fate/filter-pixfmts-pad | 22 +--
> tests/ref/fate/filter-pixfmts-pullup | 10 +-
> tests/ref/fate/filter-pixfmts-rotate | 4 +-
> tests/ref/fate/filter-pixfmts-scale | 34 ++--
> tests/ref/fate/filter-pixfmts-swapuv | 10 +-
> .../ref/fate/filter-pixfmts-tinterlace_cvlpf | 8 +-
> .../ref/fate/filter-pixfmts-tinterlace_merge | 8 +-
> tests/ref/fate/filter-pixfmts-tinterlace_pad | 8 +-
> tests/ref/fate/filter-pixfmts-tinterlace_vlpf | 8 +-
> tests/ref/fate/filter-pixfmts-transpose | 28 +--
> tests/ref/fate/filter-pixfmts-vflip | 34 ++--
> tests/ref/fate/fitsenc-gray | 2 +-
> tests/ref/fate/fitsenc-gray16be | 10 +-
> tests/ref/fate/gifenc-gray | 186 +++++++++---------
> tests/ref/fate/idroq-video-encode | 2 +-
> tests/ref/fate/jpg-icc | 8 +-
> tests/ref/fate/sws-yuv-colorspace | 2 +-
> tests/ref/fate/sws-yuv-range | 2 +-
> tests/ref/fate/vvc-conformance-SCALING_A_1 | 128 ++++++------
> tests/ref/lavf/gray16be.fits | 4 +-
> tests/ref/lavf/gray16be.pam | 4 +-
> tests/ref/lavf/gray16be.png | 6 +-
> tests/ref/lavf/jpg | 6 +-
> tests/ref/lavf/smjpeg | 6 +-
> tests/ref/pixfmt/gbrp-gray | 2 +-
> tests/ref/pixfmt/gbrp-gray10be | 2 +-
> tests/ref/pixfmt/gbrp-gray10le | 2 +-
> tests/ref/pixfmt/gbrp-gray12be | 2 +-
> tests/ref/pixfmt/gbrp-gray12le | 2 +-
> tests/ref/pixfmt/gbrp-gray16be | 2 +-
> tests/ref/pixfmt/gbrp-gray16le | 2 +-
> tests/ref/pixfmt/gbrp-yuvj420p | 2 +-
> tests/ref/pixfmt/gbrp-yuvj422p | 2 +-
> tests/ref/pixfmt/gbrp-yuvj440p | 2 +-
> tests/ref/pixfmt/gbrp-yuvj444p | 2 +-
> tests/ref/pixfmt/gbrp10-gray | 2 +-
> tests/ref/pixfmt/gbrp10-gray10be | 2 +-
> tests/ref/pixfmt/gbrp10-gray10le | 2 +-
> tests/ref/pixfmt/gbrp10-gray12be | 2 +-
> tests/ref/pixfmt/gbrp10-gray12le | 2 +-
> tests/ref/pixfmt/gbrp10-gray16be | 2 +-
> tests/ref/pixfmt/gbrp10-gray16le | 2 +-
> tests/ref/pixfmt/gbrp10-yuvj420p | 2 +-
> tests/ref/pixfmt/gbrp10-yuvj422p | 2 +-
> tests/ref/pixfmt/gbrp10-yuvj440p | 2 +-
> tests/ref/pixfmt/gbrp10-yuvj444p | 2 +-
> tests/ref/pixfmt/gbrp12-gray | 2 +-
> tests/ref/pixfmt/gbrp12-gray10be | 2 +-
> tests/ref/pixfmt/gbrp12-gray10le | 2 +-
> tests/ref/pixfmt/gbrp12-gray12be | 2 +-
> tests/ref/pixfmt/gbrp12-gray12le | 2 +-
> tests/ref/pixfmt/gbrp12-gray16be | 2 +-
> tests/ref/pixfmt/gbrp12-gray16le | 2 +-
> tests/ref/pixfmt/gbrp12-yuvj420p | 2 +-
> tests/ref/pixfmt/gbrp12-yuvj422p | 2 +-
> tests/ref/pixfmt/gbrp12-yuvj440p | 2 +-
> tests/ref/pixfmt/gbrp12-yuvj444p | 2 +-
> tests/ref/pixfmt/gbrp16-gray16be | 2 +-
> tests/ref/pixfmt/gbrp16-gray16le | 2 +-
> tests/ref/pixfmt/rgb24-gray | 2 +-
> tests/ref/pixfmt/rgb24-gray10be | 2 +-
> tests/ref/pixfmt/rgb24-gray10le | 2 +-
> tests/ref/pixfmt/rgb24-gray12be | 2 +-
> tests/ref/pixfmt/rgb24-gray12le | 2 +-
> tests/ref/pixfmt/rgb24-gray16be | 2 +-
> tests/ref/pixfmt/rgb24-gray16le | 2 +-
> tests/ref/pixfmt/rgb24-yuvj420p | 2 +-
> tests/ref/pixfmt/rgb24-yuvj422p | 2 +-
> tests/ref/pixfmt/rgb24-yuvj440p | 2 +-
> tests/ref/pixfmt/rgb24-yuvj444p | 2 +-
> tests/ref/pixfmt/rgb48-gray | 2 +-
> tests/ref/pixfmt/rgb48-gray10be | 2 +-
> tests/ref/pixfmt/rgb48-gray10le | 2 +-
> tests/ref/pixfmt/rgb48-gray12be | 2 +-
> tests/ref/pixfmt/rgb48-gray12le | 2 +-
> tests/ref/pixfmt/rgb48-gray16be | 2 +-
> tests/ref/pixfmt/rgb48-gray16le | 2 +-
> tests/ref/pixfmt/rgb48-yuvj420p | 2 +-
> tests/ref/pixfmt/rgb48-yuvj422p | 2 +-
> tests/ref/pixfmt/rgb48-yuvj440p | 2 +-
> tests/ref/pixfmt/rgb48-yuvj444p | 2 +-
> tests/ref/pixfmt/yuv444p-gray10be | 2 +-
> tests/ref/pixfmt/yuv444p-gray10le | 2 +-
> tests/ref/pixfmt/yuv444p-gray12be | 2 +-
> tests/ref/pixfmt/yuv444p-gray12le | 2 +-
> tests/ref/pixfmt/yuv444p-gray16be | 2 +-
> tests/ref/pixfmt/yuv444p-gray16le | 2 +-
> tests/ref/pixfmt/yuv444p-yuvj420p | 2 +-
> tests/ref/pixfmt/yuv444p-yuvj422p | 2 +-
> tests/ref/pixfmt/yuv444p-yuvj440p | 2 +-
> tests/ref/pixfmt/yuv444p10-gray | 2 +-
> tests/ref/pixfmt/yuv444p10-gray10be | 2 +-
> tests/ref/pixfmt/yuv444p10-gray10le | 2 +-
> tests/ref/pixfmt/yuv444p10-gray12be | 2 +-
> tests/ref/pixfmt/yuv444p10-gray12le | 2 +-
> tests/ref/pixfmt/yuv444p10-gray16be | 2 +-
> tests/ref/pixfmt/yuv444p10-gray16le | 2 +-
> tests/ref/pixfmt/yuv444p10-yuvj420p | 2 +-
> tests/ref/pixfmt/yuv444p10-yuvj422p | 2 +-
> tests/ref/pixfmt/yuv444p10-yuvj440p | 2 +-
> tests/ref/pixfmt/yuv444p10-yuvj444p | 2 +-
> tests/ref/pixfmt/yuv444p12-gray | 2 +-
> tests/ref/pixfmt/yuv444p12-gray10be | 2 +-
> tests/ref/pixfmt/yuv444p12-gray10le | 2 +-
> tests/ref/pixfmt/yuv444p12-gray12be | 2 +-
> tests/ref/pixfmt/yuv444p12-gray12le | 2 +-
> tests/ref/pixfmt/yuv444p12-gray16be | 2 +-
> tests/ref/pixfmt/yuv444p12-gray16le | 2 +-
> tests/ref/pixfmt/yuv444p12-yuvj420p | 2 +-
> tests/ref/pixfmt/yuv444p12-yuvj422p | 2 +-
> tests/ref/pixfmt/yuv444p12-yuvj440p | 2 +-
> tests/ref/pixfmt/yuv444p12-yuvj444p | 2 +-
> tests/ref/pixfmt/yuv444p16-gray16be | 2 +-
> tests/ref/pixfmt/yuv444p16-gray16le | 2 +-
> tests/ref/pixfmt/yuvj420p | 2 +-
> tests/ref/pixfmt/yuvj422p | 2 +-
> tests/ref/pixfmt/yuvj440p | 2 +-
> tests/ref/pixfmt/yuvj444p | 2 +-
> tests/ref/seek/lavf-jpg | 8 +-
> tests/ref/seek/vsynth_lena-mjpeg | 40 ++--
> tests/ref/seek/vsynth_lena-roqvideo | 2 +-
> tests/ref/vsynth/vsynth1-amv | 8 +-
> tests/ref/vsynth/vsynth1-mjpeg | 6 +-
> tests/ref/vsynth/vsynth1-mjpeg-422 | 6 +-
> tests/ref/vsynth/vsynth1-mjpeg-444 | 6 +-
> tests/ref/vsynth/vsynth1-mjpeg-huffman | 6 +-
> tests/ref/vsynth/vsynth1-mjpeg-trell | 8 +-
> tests/ref/vsynth/vsynth1-mjpeg-trell-huffman | 8 +-
> tests/ref/vsynth/vsynth1-roqvideo | 8 +-
> tests/ref/vsynth/vsynth2-amv | 6 +-
> tests/ref/vsynth/vsynth2-mjpeg | 6 +-
> tests/ref/vsynth/vsynth2-mjpeg-422 | 6 +-
> tests/ref/vsynth/vsynth2-mjpeg-444 | 6 +-
> tests/ref/vsynth/vsynth2-mjpeg-huffman | 6 +-
> tests/ref/vsynth/vsynth2-mjpeg-trell | 8 +-
> tests/ref/vsynth/vsynth2-mjpeg-trell-huffman | 8 +-
> tests/ref/vsynth/vsynth2-roqvideo | 8 +-
> tests/ref/vsynth/vsynth3-amv | 8 +-
> tests/ref/vsynth/vsynth3-mjpeg | 8 +-
> tests/ref/vsynth/vsynth3-mjpeg-422 | 8 +-
> tests/ref/vsynth/vsynth3-mjpeg-444 | 6 +-
> tests/ref/vsynth/vsynth3-mjpeg-huffman | 8 +-
> tests/ref/vsynth/vsynth3-mjpeg-trell | 6 +-
> tests/ref/vsynth/vsynth3-mjpeg-trell-huffman | 6 +-
> tests/ref/vsynth/vsynth_lena-amv | 6 +-
> tests/ref/vsynth/vsynth_lena-mjpeg | 8 +-
> tests/ref/vsynth/vsynth_lena-mjpeg-422 | 6 +-
> tests/ref/vsynth/vsynth_lena-mjpeg-444 | 6 +-
> tests/ref/vsynth/vsynth_lena-mjpeg-huffman | 8 +-
> tests/ref/vsynth/vsynth_lena-mjpeg-trell | 8 +-
> .../vsynth/vsynth_lena-mjpeg-trell-huffman | 8 +-
> tests/ref/vsynth/vsynth_lena-roqvideo | 8 +-
> 184 files changed, 880 insertions(+), 725 deletions(-)
Should be OK, provided it has been tested and the output values are correct.

Thanks
[...]