[FFmpeg-devel] [PATCH 7/9] sbrdsp: unroll and use integer operations
Christophe Gisquet
christophe.gisquet at gmail.com
Fri Apr 5 12:18:25 CEST 2013
Hi,
2013/4/5 Michael Niedermayer <michaelni at gmx.at>:
> casting *float to *int32 and dereferencing is a strict aliassing
> violation,
Cue in "2 wrong do not make a right", "cargo cult" etc, but repasting
some I wrote some time ago in another project:
"""
Mainly for the record, because it does not really matter, but the
assumption of IEEE754-format floats is pretty strong, see for instance
that code in dcadec:
for (i = 0; i < sb_act; i++) {
unsigned sign = (i - 1) & 2;
uint32_t v = AV_RN32A(&samples_in[i][
subindex]) ^ sign << 30;
AV_WN32A(&s->raXin[i], v);
}
Other suspicious code (but I haven't investigated further):
aacdec.c:1275: s0.i ^= sign >> 1 << 31;
aacdec.c:1276: s1.i ^= sign << 31;
aacdec.c:1293: t.i = s.i ^ (sign & 1U<<31);
aacdec.c:1297: t.i = s.i ^ (sign & 1U<<31);
aacdec.c:1301: t.i = s.i ^ (sign & 1U<<31);
aacdec.c:1305: t.i = s.i ^ (sign & 1U<<31);
aacdec.c:1498: *icf++ = cbrt_tab[n]
| (bits & 1U<<31);
aacdec.c:1502: *icf++ = (bits & 1U<<31) | v;
dsputil.c:2434: else if((a^(1U<<31)) > maxisign) return maxi;
dsputil.c:2442: uint32_t maxisign = maxi ^ (1U<<31);
mpegaudiodec.c:851: v = AV_RN32A(src) ^ (get_bits1(&s->gb) << 31); \
vorbisenc.c:446: res |= (1U << 31);
wavpack.c:493: value.u = (sign << 31) | (exp << 23) | S;
wma.c:332: norm = (1.0 / (float)(1LL << 31)) * sqrt(3) *
s->noise_mult;
wma.c:448: iptr[offset & coef_mask] = ilvl[code] ^ sign<<31;
wmaprodec.c:852: AV_WN32A(&ci->coeffs[cur_coeff],
vals[i] ^ sign<< 31);
"""
Assuming IEEE-754 floats and aliasing violations are probably not
strictly identical, but I wanted to point that out.
> you have to use
> AV_RN32() or av_float2int() or some pointer to a union of float+int32
> or something else along thouse lines
I am not sure this is strictly what you mention, but here's something
tested some time ago:
"""
For that particular case (both for win32 and win64), it actually does
what looks perfect:
addl $0x80000000,some_offset(%eax)
And like all other functions that I changed, it actually chose to
completely unroll them.
Anyway, if what you meant is this:
int i;
for (i = 1; i < 64; i += 4)
{
x[i+0] = av_int2float( av_float2int(x[i+0]) ^ 1U<<31 );
x[i+2] = av_int2float( av_float2int(x[i+2]) ^ 1U<<31 );
}
It actually generates extra moves (and 3 times as many insns):
mov 0x14(%eax),%edx
add $0x80000000,%edx
mov %edx,0x14(%eax)
My C-fu may be lacking here, but if instead I do:
union av_intfloat32 *p = (union av_intfloat32*)x;
for (i = 1; i < 64; i += 4)
{
p[i+0].i ^= 1U<<31;
p[i+2].i ^= 1U<<31;
}
then it is ok, but I'm quite unsure I have improved anything.
"""
The details may change (if it's called av_intfloat, the name of the
union members) but you should get the picture.
Given this additional data, do you have any updated opinion?
--
Christophe
More information about the ffmpeg-devel
mailing list