[FFmpeg-devel] [PATCH 2/3] Indeo 5 decoder: common DSP functions

Kostya kostya.shishkov
Tue Jan 19 09:10:14 CET 2010


On Sun, Jan 17, 2010 at 08:32:05PM +0100, Reimar D?ffinger wrote:
> Do these functions have any alignment requirements?
> Or can you promise a certain alignment on the inputs?

They are pure C, so none. But decoder still uses 4/8/16-byte alignment
for corresponding input.

> For anything that's speed-critical such documentation might
> speed up future optimization attempts.
> 
> On Fri, Jan 15, 2010 at 12:05:47PM +0200, Kostya wrote:
> > /**
> >  *  two-dimensional inverse slant 8x8 transform
> >  *
> >  *  @param  in      [in]  pointer to the vector of transform coefficients
> >  *  @param  out     [out] pointer to the output buffer (frame)
> >  *  @param  pitch   [in]  pitch to move to the next y line
> >  *  @param  flags   [in]  pointer to the array of column flags:
> >  *                        != 0 - non_empty column, 0 - empty one
> >  *                        (this array must be filled by caller)
> >  */
> > void ff_ivi_inverse_slant_8x8(int32_t *in, int16_t *out, uint32_t pitch, const uint8_t *flags);
> > 
> > /**
> >  *  two-dimensional inverse slant 4x4 transform
> >  *
> >  *  @param  in      [in]  pointer to the vector of transform coefficients
> >  *  @param  out     [out] pointer to the output buffer (frame)
> >  *  @param  pitch   [in]  pitch to move to the next y line
> >  *  @param  flags   [in]  pointer to the array of column flags:
> >  *                        != 0 - non_empty column, 0 - empty one
> >  *                        (this array must be filled by caller)
> >  */
> > void ff_ivi_inverse_slant_4x4(int32_t *in, int16_t *out, uint32_t pitch, const uint8_t *flags);
> 
> Why are those "in" not const if they are really only input as the documentation says?
> The same for many more functions here.

Non-DC slant transform uses input also for temporary values since we
don't need it after this transform.
 
> > /**
> >  *  This is a speed-up version of the inverse 2D slant transforms
> >  *  for the case if there is a non-zero DC coeff and all AC coeffs are zero.
> 
> Hm? I think "sped-up" might be grammatically correct, but how about
> "A inverse 2D slant transform optimized for the all-zero AC coefficients case"
> (is the non-zero DC relevant here?)?

I'd rather throw those words away, any guy knowing SIMD can only smirk.

> > /**
> >  *  Copies the pixels into the frame buffer.
> >  */
> > void ff_ivi_put_pixels_8x8(int32_t *in, int16_t *out, uint32_t pitch, const uint8_t *flags);
> > 
> > /**
> >  *  Copies the DC coefficient into the first pixel of the block and
> >  *  zeroes all others.
> >  */
> > void ff_ivi_put_dc_pixel_8x8(const int32_t *in, int16_t *out, uint32_t pitch, int blk_size);
> > 
> > /**
> >  *  8x8 block motion compensation with adding delta
> 
> If I remember right, there are dsputils functions that do something similar.
> Describing how these differ might be helpful.

should be obvious from argument types

> > /** butterfly operation for the inverse slant transform */
> > #define IVI_SLANT_BFLY(x, y) t1 = x-y; x += y; y = t1;
> > 
> > /** This is a reflection a,b = 1/2, 5/4 for the inverse slant transform */
> > #define IVI_IREFLECT(s1, s2) {\
> >     t1 = s1 + ((s1 + s2*2 + 2) >> 2);\
> >     s2 = ((s1*2 - s2 + 2) >> 2) - s2;\
> >     s1 = t1;}
> > 
> > /** This is a reflection a,b = 1/2, 7/8 for the inverse slant transform */
> > #define IVI_SLANT_PART4(s1, s2) {\
> >     t1 = s2 + ((s1*4 - s2 + 4)  >> 3);\
> >     s2 = ((-s1 - s2*4 + 4) >> 3) + s1;\
> >     s1 = t1;}
> 
> You're supposed to put the arguments in () for macros to avoid
> really nasty to debug issues.
> Also, why is t1 nor declared inside the macro (of course better with
> a name that is less likely to cause name clashes).
> Well, in the IVI_INV_SLANT8 macros probably would still be ok, too,
> but having a large function with a variable that is only modified
> hidden in some macros is not so great.

added it explicitly to macros
 
> >             memset(out, 0, 8*sizeof(int16_t));
> 
> sizeof(*out) maybe?
> There's the same code another time later

three times, changed

> > void ff_ivi_dc_slant_2d(const int32_t *in, int16_t *out, uint32_t pitch, int blk_size)
> > {
> >     int     x, y;
> >     int16_t dc_coeff;
> > 
> >     dc_coeff = (*in + 1) >> 1;
> > 
> >     for (y = 0; y < blk_size; out += pitch, y++) {
> >         for (x = 0; x < blk_size; x++)
> >             out[x] = dc_coeff;
> >     }
> > }
> 
> Seems like we could really use some memset function that sets not just 1-byte values...
> Assuming we have some appropriate constraints on out alignmen and blk_size this sure
> could be optimized anyway.

yes, there are many places in FFmpeg where such memset would be useful
 
> > void ff_ivi_dc_row_slant(const int32_t *in, int16_t *out, uint32_t pitch, int blk_size)
> 
> Same applies here.
> 
> >     int     i, t1, row2, row4, row8;
> > 
> >             out[0] = out[pitch] = out[row2] = out[row2 + pitch] = out[row4] =
> >                     out[row4 + pitch] =  out[row4 + row2] = out[row8 - pitch] = 0;
> 
> Some really messed up alignment?

re-aligned

> > void ff_ivi_put_pixels_8x8(int32_t *in, int16_t *out, uint32_t pitch,
> >                            const uint8_t *flags)
> > {
> >     int     x, y;
> > 
> >     for (y = 0; y < 8; out += pitch, in += 8, y++)
> >         for (x = 0; x < 8; x++)
> >             out[x] = in[x];
> 
> Why not use memcpy or at least use uint32_t * casts? Or is the alignment not
> sufficient?

int32 -> int16
 
> > void ff_ivi_put_dc_pixel_8x8(const int32_t *in, int16_t *out, uint32_t pitch,
> >                              int blk_size)
> > {
> >     int     y;
> 
> Why those strange spaces?

vertical alignment for variable names, look in the source above

> >     out[0] = in[0];
> >     memset(&out[1], 0, 7*sizeof(int16_t));
> 
> out + 1 is the more common way to write it in FFmpeg I think.

changed along with sizeof

> > #define op_put(a, b)  (a) = (b)
> > #define op_add(a, b)  (a) += (b)
> 
> I think macros should always be uppercase.

here you are
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ivi_dsp.h
Type: text/x-chdr
Size: 6699 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20100119/6514b92f/attachment.h>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ivi_dsp.c
Type: text/x-csrc
Size: 15183 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20100119/6514b92f/attachment.c>



More information about the ffmpeg-devel mailing list