[FFmpeg-devel] [PATCH] 10-bit DNxHD support v4 review request

Thu Apr 7 00:37:04 CEST 2011

Hi Joseph,

On 04/06/2011 03:08 AM, Joseph Artsimovich wrote:
> Hi,
> 
> Please review the latest version of my patch set.  The renaming patch needs to be applied first.
> Regarding still unresolved issues raised by Baptiste Coudurier, I reproduce below my earlier email where I explain why things are done the way they are and request further input, which I haven't received so far.  Some of the issues discussed below no longer apply to the latest version of the patch.
> 
> 
> -----Original Message-----
> From: ffmpeg-devel-bounces at ffmpeg.org [mailto:ffmpeg-devel-bounces at ffmpeg.org] On Behalf Of Joseph Artsimovich
> Sent: Thursday, March 24, 2011 9:51 AM
> To: FFmpeg development discussions and patches
> Subject: Re: [FFmpeg-devel] [PATCH] 10-bit DNxHD decoding and encoding
> 
>>> -static const uint16_t dnxhd_1238_run_codes[62] = {
>>> +static const uint16_t dnxhd_1235_1238_1241_run_codes[62] = {
>>>      0, 4, 10, 11, 24, 25, 26, 27, 56, 57, 58, 59, 120, 242, 486,
>>> 487, 488, 489, 980, 981, 982, 983, 984, 985, 986, 987, 988, 989,
>>> 990, 991, 992, 993, 994, 995, 996, 997, 998, 999, 1000, 1001, 1002,
>>> 1003, 1004, 1005, 1006, 1007, 1008, 1009, 1010, 1011, 1012, 1013,
>>> 1014, 1015, 1016, 1017, 1018, 1019, 1020, 1021, 1022, 1023,
>>
>> Please rename in a separate patch.
> Done.  The patch is attached.
> 
>>> @@ -40,6 +43,7 @@ typedef struct {
>>>      VLC ac_vlc, dc_vlc, run_vlc;
>>>      int last_dc[3];
>>>      DSPContext dsp;
>>> +    int dsp_bits; // 8, 10 or 0 if not initialized.
>>
>> bit_depth is a better name IMHO.
> Well, this field was meant to indicate how the DSPContext was initialized (if at all).
> For other purposes we have cid_table->bit_depth

I think you can check the current cid bit depth if set already.

>>> +static av_always_inline void dnxhd_10bit_get_pixels_8x8(DCTELEM
>>> +*restrict block, const uint16_t *pixels, int line_size) {
>>> +    for (int i = 0; i < 8; ++i) {
>>> +        for (int j = 0; j < 8; ++j) {
>>> +            block[j] = (DCTELEM)floor((float)pixels[j] * (1023.0f /
>>> +65535.0f) + 0.5f);
>>
>> Btw, why are you using floats here ?
> Hoping a compiler might vectorize this code.

We prefer using ints here, to avoid rounding difference on different archs.

>>> -        dnxhd_get_blocks(ctx, mb_x, mb_y);
>>> +        if (ctx->cid_table->bit_depth == 10) {
>>> +            dnxhd_10bit_get_blocks(ctx, mb_x, mb_y);
>>> +        } else {
>>> +            dnxhd_get_blocks(ctx, mb_x, mb_y);
>>> +        }
>>
>> I think a function pointer in context would be better.
> OK.
> 
> 
>>>          for (i = 0; i < 8; i++) {
>>>              DCTELEM *src_block = ctx->blocks[i]; @@ -439,6 +549,8
>>> @@ static int dnxhd_calc_bits_thread(AVCodecContext *avctx, void
>>> *arg, int
>> jobnr, i
>>>              diff = block[0] - ctx->m.last_dc[n];
>>>              if (diff < 0) nbits = av_log2_16bit(-2*diff);
>>>              else          nbits = av_log2_16bit( 2*diff);
>>> +
>>> +            assert(nbits < ctx->cid_table->bit_depth + 4);
>>
>> assert would kill the program.
> That would indicate a serious bug.  The assert merely checks we are not reading past the end of a table.
> BTW, I've come by the following comment in mpegvideo_enc.c:
> //non c quantize code returns incorrect block_last_index FIXME If still true, that could trigger this assert.
> 

It's a problem, but can this actually happen ?

>>> +// To get DCT coefficients multiplied by 4.
>>
>> Shouldn't we avoid returning scaled output, could be usefull for 11,12
>> bits, etc..
> Well, I guess it's done to get more precise quantization.  I would be fine with having an unscaled version though.  After all, the range of DCT coefficients is already 3 bits larger than the range of input samples.
> 
> 
>>> [...]
>>>
>>> diff --git a/libavcodec/mpegvideo.h b/libavcodec/mpegvideo.h index
>>> 3e447ab..eb2352f 100644
>>> --- a/libavcodec/mpegvideo.h
>>> +++ b/libavcodec/mpegvideo.h
>>> @@ -49,7 +49,19 @@ enum OutputFormat {  #define MPEG_BUF_SIZE (16
>> *
>>> 1024)
>>>
>>>  #define QMAT_SHIFT_MMX 16
>>> -#define QMAT_SHIFT 22
>>> +
>>> +/*
>>> + * QMAT_SHIFT should be as high as possible, but not so high to
>>> +cause
>>> + * overflow in the following expression:
>>> + * scaled_dct[i] * ((p / s) * ((1 << QMAT_SHIFT) / (qscale *
>>> +weight_table[i]))))
>>> + * where s is a scale factor applied to DCT coefficients while
>>> + * p and weight_table are codec-specific.
>>> + * For example, the worst case for 10-bit DNxHD would be:
>>> + * p == 32, s == 4, qscale == 1, weight_table[i] == 32,
>>> +scaled_dct[i] == (1 << 13) * 4
>>> + * which gives (1 << 15) * (((1 << QMAT_SHIFT) >> 5) << 3)
>>> + * which gives 18 as the largest safe QMAT_SHIFT.
>>> + */
>>> +#define QMAT_SHIFT 18
>>>
>>>  #define MAX_FCODE 7
>>>  #define MAX_MV 2048
>>> diff --git a/libavcodec/mpegvideo_enc.c b/libavcodec/mpegvideo_enc.c
>>> index 9255fa8..3f46033 100644
>>> --- a/libavcodec/mpegvideo_enc.c
>>> +++ b/libavcodec/mpegvideo_enc.c
>>> @@ -3725,9 +3725,14 @@ int dct_quantize_c(MpegEncContext *s,
>>>              else
>>>                  q = s->c_dc_scale;
>>>              q = q << 3;
>>> -        } else
>>> +        } else {
>>>              /* For AIC we skip quant/dequant of INTRADC */
>>> -            q = 1 << 3;
>>> +            if (s->dsp.fdct == ff_faandct_10bit_safe) {
>>> +                q = 1 << 2;
>>> +            } else {
>>> +                q = 1 << 3; // The post-scale applied to a DCT output.
>>> +            }
>>> +        }
>>>
>>>          /* note: block[0] is assumed to be positive */
>>>          block[0] = (block[0] + (q >> 1)) / q;
>>
>> dct_quantize_mmx/sse2/etc .. is used if present, see QMAX_SHIFT_MMX.
> I specifically request FF_DCT_10BIT, so I won't get any of those.

Ultimately you can only replace the dct in dct_quantize.
Besides, this modification makes a change that is specific to dnxhd in
generic mpeg code, which I'm not a big fan of.

>> I think it's time to separate the dnxhd quantization fonction from the
>> mpeg one and change QMAT_SHIFT only there.
>> Remove MpegEncContext from DNxHDEncContext, add dnxhd_template_mmx.c
>> in
>> x86 and clean it up (intra only, out format, aic only etc..)
> I believe we only use dct_quantize() from MpegEncContext, and there exists quite a few assembly versions of that.  I don't think it's a good idea to duplicate that code.

The code has overflow problems already due to the shift.

[...]

-- 
Baptiste COUDURIER
Key fingerprint                 8D77134D20CC9220201FC5DB0AC9325C5C1ABAAA
FFmpeg maintainer                                  http://www.ffmpeg.org