[FFmpeg-devel] MMX accelerated DSP functions for VC1/WMV3 decoders

Michael Niedermayer michaelni
Sat Jun 30 21:07:06 CEST 2007


Hi

On Sat, Jun 30, 2007 at 08:35:17PM +0200, Christophe GISQUET wrote:
> Hi,
> 
> Michael Niedermayer wrote:
> >> +#if defined(CONFIG_VC1_DECODER) || defined(CONFIG_WMV3_DECODER)
> >> +extern void ff_vc1dsp_init_mmx(DSPContext* dsp, AVCodecContext *avctx);
> >> +#endif
> >> +
> > 
> > the #if is unneeded
> 
> Indeed, even if defined, the symbol won't be used when those conditions
> are not met.
> 
> > [...]
> >> +     "psllw     $1, %%mm1               \n\t"                   \
> >> +     "psllw     $1, %%mm2               \n\t"                   \
> > 
> > paddw
> 
> Is that always faster?

no, you can design a cpu where it's not
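
For reference, the two forms compute the same result: shifting each 16-bit
lane left by one is the same as adding the lane to itself, with identical
wraparound. Only the cost on a given CPU may differ. A minimal C sketch of
that equivalence, modelling one MMX register as four 16-bit words (the helper
names are hypothetical and are not part of the patch):

```c
#include <stdint.h>

#define LANES 4 /* one 64-bit MMX register holds four 16-bit words */

/* Models "psllw $1, %%mmN": shift every 16-bit lane left by one bit. */
static void lanes_shift_left_1(uint16_t v[LANES])
{
    for (int i = 0; i < LANES; i++)
        v[i] = (uint16_t)(v[i] << 1);
}

/* Models "paddw %%mmN, %%mmN": add every 16-bit lane to itself,
 * wrapping modulo 2^16 as paddw does. */
static void lanes_add_to_self(uint16_t v[LANES])
{
    for (int i = 0; i < LANES; i++)
        v[i] = (uint16_t)(v[i] + v[i]);
}
```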


> 
> > duplicating each filter 4 times with macros is unacceptable
> > the overhead for 2 calls is not that big
> 
> OK. If I understand the plan correctly, you want 4 functions to be
> created, one per shift position. If we say that {1,2,3}/4 shift code
> sizes are N (and neglect the no-shift code size), we currently have a
> total code size of:
> 3*3*(N+N) + 2*3*(N+epsilon) + epsilon? ~ 24N
> Your plan is to get it to 3*N + epsilon ~ 3N
> 
> However, with this, the same function is used for vertical and
> horizontal filtering. The tap offsets are no longer known at compile time,
> hence we have a more complex addressing pattern (of the [eax+ecx+N]
> kind) and one register fewer. And I would probably have to rewrite part of
> the macros.
> 
> What would you say about having 1 vertical function and 1 horizontal for
> {1,2,3}/4 shift positions? This should double the code size compared to
> your plan, but looks much simpler to me.
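
The trade-off discussed above can be sketched in plain C: a single filter
routine that takes the distance between taps as a runtime argument serves
both directions, at the price of an extra variable in every address
computation. This is only an illustration under stated assumptions: the
function name, its signature and the 4-tap kernel (-1, 9, 9, -1)/16 are
chosen for the example and are not taken from the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical generic 4-tap filter. tap_step is the distance in bytes
 * between neighbouring taps: 1 for horizontal filtering, src_stride for
 * vertical filtering. The caller must provide one sample of margin
 * before and two after each filtered position along the tap direction. */
static void filter4_generic(uint8_t *dst, ptrdiff_t dst_stride,
                            const uint8_t *src, ptrdiff_t src_stride,
                            ptrdiff_t tap_step, int w, int h)
{
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++) {
            const uint8_t *p = src + y * src_stride + x;
            /* Illustrative kernel (-1, 9, 9, -1) with rounding. */
            int v = -p[-tap_step] + 9 * p[0] + 9 * p[tap_step]
                    - p[2 * tap_step] + 8;
            v = v < 0 ? 0 : v >> 4;      /* clamp below, then scale */
            dst[y * dst_stride + x] = (uint8_t)(v > 255 ? 255 : v);
        }
    }
}
```

Calling it with tap_step = 1 filters along rows, and with
tap_step = src_stride along columns. With separate vertical and horizontal
functions the tap offsets become compile-time constants again.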

currently the code runs a dummy do-nothing filter in 6 out of 15 cases.
this is not good. if a general variable tap offset were supported,
then i think it would be easier to skip this dummy filter_0 copy step
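
One way to arrive at the "6 out of 15" figure, assuming the 15 cases are the
16 quarter-pel (horizontal, vertical) shift combinations minus the
no-shift one: a pass is a dummy whenever exactly one of the two shifts is
zero. A small C sketch that counts them (the function name is hypothetical):

```c
/* Counts the sub-pel positions in which one of the two filter passes
 * has a zero shift and therefore does no real filtering.
 * Assumption: shifts are 0..3 quarter pixels in each direction and the
 * (0, 0) position is handled separately as a plain copy. */
static int count_dummy_pass_cases(int *total_cases)
{
    int dummy = 0;

    *total_cases = 0;
    for (int hshift = 0; hshift < 4; hshift++) {
        for (int vshift = 0; vshift < 4; vshift++) {
            if (hshift == 0 && vshift == 0)
                continue;               /* not a sub-pel case */
            (*total_cases)++;
            if (hshift == 0 || vshift == 0)
                dummy++;                /* one pass is a no-op */
        }
    }
    return dummy;
}
```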

[...]

-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

I do not agree with what you have to say, but I'll defend to the death your
right to say it. -- Voltaire