[FFmpeg-devel] [PATCH 03/10] checkasm: Add idctdsp add/put-pixels-clamped tests
Martin Storsjö
martin at martin.st
Tue Mar 29 16:13:59 EEST 2022
On Fri, 25 Mar 2022, Ben Avison wrote:
> Disable ff_add_pixels_clamped_arm, which was found to fail the test. As this
> is normally only used for Arms prior to Armv6 (ARM11) it seems quite unlikely
> that anyone is still using this, so I haven't put in the effort to debug it.
I had a look at this function, and I see that the overflow checks are
using
tst r6, #0x100
to see whether the addition overflowed (either above or below). However,
if block[] was e.g. 0x200, it's possible to overflow without setting this
bit at all.
If it would be the case that the valid range of block[] values would be
e.g. [-255,255], then this kind of overflow checking would work though.
(As there exists assembly for armv6, then this function probably hasn't
been used much in modern times, so this doesn't say much about what values
actually are used here.)
Secondly, the clamping seems to be done with
movne r6, r5, lsr #24
However that should use asr, not lsr, I think, to get proper clamping in
both ends?
Thirdly - the added test also occasionally fails for the other existing
functions (armv6, neon) and the newly added aarch64 neon version. If you
have e.g. src[] = 32767, dst[] = 255, then the widening 8->16 addition
will overflow, as there's no operation that both widens and clamps at the
same time.
I think this is reason to limit the range of src[] at least somewhat in
the test, since I don't think the full 16 bit signed range actually is
relevant here.
// Martin
More information about the ffmpeg-devel
mailing list