[FFmpeg-devel] [PATCH 2/2] libavcodec: Implementation of AC3 fixed point decoder
christophe.gisquet at gmail.com
Fri Apr 12 19:11:48 CEST 2013
I just wanted to mention that I'm not someone of charge of anything
for ffmpeg: I just wanted to look at a FP implementation.
2013/4/12 Nedeljko Babic <Nedeljko.Babic at imgtec.com>:
>>> + ac3_downmix_c_fixed16(s->outptr, s->downmix_coeffs,
> I didn't want to clog dsp context with this function because it is used only here
> but ok, I'll add it to ac3dsp context in next patch.
I have no idea how much it takes in a runtime profile, so maybe it is
not worth having it.
>>> - memcpy(((float*)frame->data[ch]) + AC3_BLOCK_SIZE*blk, output[ch], 1024);
>>> + memcpy(((SHORTFLOAT*)frame->data[ch]) + AC3_BLOCK_SIZE*blk, output[ch], 1024);
>>I thought SHORTFLOAT was int16_t. How come it is the same copy size?
>>I'm surprised this overwrite does not crash the decoder.
> Depending on flag CONFIG_AC3_FIXED, SHORTFLOAT is either int16_t (for fixed point decoder) or float
> (for floating point decoder). Since output is also defined as SHORTFLOAT, memcpy is working
Yes but output[ch] is AC3_BLOCK_SIZE elements, ie 256. The 1024 is
fine for 256*sizeof(float), but not here: you're overreading and
overwriting the buffers. This 1024 should be changed to
And answering myself, you didn't notice it because this happens only
on error recovery, and you are probably not using corrupted streams.
>>On a side note, I guess this memcpy is long enough to have little
>>overhead compared to an inline copy loop using eg AV_COPY128.
> You are probably correct, but in original implementation (floating point decoder) here is memcpy and
> I left it here in fixed point implementation because it is easier to merge implementations for these
> two decoders (since this is template file for both decoders).
> I can change this if you wish, but the change should be made for original (floating point) decoder also.
No, don't bother, this is be a separate consideration.
>>> +static void scale_coefs (
>>> + int32_t *dst,
>>> + const int *src,
>>> + int dynrng,
>>> + int len)
>>int* pointer? I think the float context has the same issue, but
>>really, this should be int32_t too
> OK. I'll change this.
Just for the record, this is because int is 64bits on some architectures.
>>> +static int ac3_fixed_sqrt(int x)
>>This is entirely dependent on the FP format, but that could go into
>>the Fixed context header (not the struct itself) as a helper function.
> I will move it to header.
I don't know what naming convention we could have, but maybe naming it
fp12_fixed_sqrt(x) would be suited to match the format rather than
where it is used?
Someone else should comment on this FP DSP design.
>>Also, quite a lot less important, but the input is probably 32bits, so
>>there may be wasteful sign extension occurring here.
> I am not sure on which sign extension you are referring
Sorry, I was thinking of another architecture, that will not use a FP
>>> + dst[i] = av_clip_int16_c(((((int64_t)s0*wj - (int64_t)s1*wi + 0x40000000) >> 31) + 0x80) >> 8);
>>> + dst[j] = av_clip_int16_c(((((int64_t)s0*wi + (int64_t)s1*wj + 0x40000000) >> 31) + 0x80) >> 8);
>>I'm stuck on this, not really seeing how useful my remark can be, but...
>>Sometimes, you have shuffle insns taking chuncks of 8, 16 or 32bits of
>>a SIMD register and reordering them. Had the shift been 32, this would
>>have been possible.
>>Would it be possible to have 1 more bit of precision on the window
>>coeffs (and still be usable and fit on 32bits)?
> It is not possible because value would be out of range. This is classical fixed point operation.
OK, that was just a thought. Some particular input could cause overflow indeed.
More information about the ffmpeg-devel