[FFmpeg-devel] [PATCH 04/23] vp3: use hpeldsp instead of dsputil for half-pel functions.

Wed Mar 13 00:45:11 CET 2013

Hi,

On Tue, Mar 12, 2013 at 3:35 PM, Michael Niedermayer <michaelni at gmx.at> wrote:
> On Tue, Mar 12, 2013 at 07:28:14AM -0700, Ronald S. Bultje wrote:
>> From: "Ronald S. Bultje" <rsbultje at gmail.com>
>>
>> This makes vp3 independent of dsputil.
> [...]
>
>> @@ -1694,7 +1695,7 @@ static av_cold int vp3_decode_init(AVCodecContext *avctx)
>>      if (avctx->codec_id != AV_CODEC_ID_THEORA)
>>          avctx->pix_fmt = AV_PIX_FMT_YUV420P;
>>      avctx->chroma_sample_location = AVCHROMA_LOC_CENTER;
>> -    ff_dsputil_init(&s->dsp, avctx);
>> +    ff_hpeldsp_init(&s->hdsp, avctx->flags | CODEC_FLAG_BITEXACT);
>
> why is CODEC_FLAG_BITEXACT added here ?

We seemed to do bitexact tricks in simd init functions specifically
for vp3/theora, so doing this allowed me to move those tricks under a
normal CODEC_FLAG_BITEXACT check, making the init code slightly more
sane. We can remove this if you think it's weird. Maybe it deserves a
comment.

>>      ff_videodsp_init(&s->vdsp, 8);
>>      ff_vp3dsp_init(&s->vp3dsp, avctx->flags);
>>
>> diff --git a/libavcodec/x86/dsputil_mmx.c b/libavcodec/x86/dsputil_mmx.c
>> index 46c283a..0ee721e 100644
>> --- a/libavcodec/x86/dsputil_mmx.c
>> +++ b/libavcodec/x86/dsputil_mmx.c
>> @@ -52,10 +52,7 @@ DECLARE_ALIGNED(8,  const uint64_t, ff_pw_255)  =   0x00ff00ff00ff00ffULL;
>>  DECLARE_ALIGNED(16, const xmm_reg,  ff_pw_512)  = { 0x0200020002000200ULL, 0x0200020002000200ULL };
>>  DECLARE_ALIGNED(16, const xmm_reg,  ff_pw_1019) = { 0x03FB03FB03FB03FBULL, 0x03FB03FB03FB03FBULL };
>>
>> -DECLARE_ALIGNED(8,  const uint64_t, ff_pb_7)    =   0x0707070707070707ULL;
>> -DECLARE_ALIGNED(8,  const uint64_t, ff_pb_1F)   =   0x1F1F1F1F1F1F1F1FULL;
>>  DECLARE_ALIGNED(8,  const uint64_t, ff_pb_3F)   =   0x3F3F3F3F3F3F3F3FULL;
>> -DECLARE_ALIGNED(8,  const uint64_t, ff_pb_81)   =   0x8181818181818181ULL;
>>  DECLARE_ALIGNED(8,  const uint64_t, ff_pb_FC)   =   0xFCFCFCFCFCFCFCFCULL;
>>
>>  DECLARE_ALIGNED(16, const double, ff_pd_1)[2] = { 1.0, 1.0 };
>> diff --git a/libavcodec/x86/dsputil_mmx.h b/libavcodec/x86/dsputil_mmx.h
>> index 4b7a1fd..1e62c53 100644
>> --- a/libavcodec/x86/dsputil_mmx.h
>> +++ b/libavcodec/x86/dsputil_mmx.h
>> @@ -49,10 +49,7 @@ extern const uint64_t ff_pw_255;
>>
>>  extern const xmm_reg  ff_pb_1;
>>  extern const xmm_reg  ff_pb_3;
>> -extern const uint64_t ff_pb_7;
>> -extern const uint64_t ff_pb_1F;
>>  extern const uint64_t ff_pb_3F;
>> -extern const uint64_t ff_pb_81;
>>  extern const xmm_reg  ff_pb_F8;
>>  extern const uint64_t ff_pb_FC;
>>
>> diff --git a/libavcodec/x86/vp3dsp.asm b/libavcodec/x86/vp3dsp.asm
>> index 423866c..a47b8f2 100644
>> --- a/libavcodec/x86/vp3dsp.asm
>> +++ b/libavcodec/x86/vp3dsp.asm
>> @@ -33,12 +33,13 @@ vp3_idct_data: times 8 dw 64277
>>                 times 8 dw 25080
>>                 times 8 dw 12785
>>
>> +pb_7: times 8 db 7
>
> this is also in
> libavcodec/x86/dsputil.asm:pb_7: times 8 db 7
>
> not sure its worth sharing though

So, in dsputil.asm, you'll see:
pb_zzzzzzzz77777777: times 8 db -1
pb_77777777: times 8 db 7

I.e. the pb_7 is actually _part_ of the pb_zzz. since that's
relatively obscure and specific, I would think it makes sense to keep
that local to dsputil.asm. Then, if that's the case, we would probably
want to either export pb_7 _from_ dsputil.asm, or make a copy in
vp3dsp.asm. I don't know yet how to make an exported data symbol in
yasm so I didn't try that.

Simply copying the pb_7 to dsputil_mmx.c causes it to be missed from
the pb_zzzzzzzz77777777 symbol, causing unrelated tests to fail.

Ronald