[Ffmpeg-devel] Using Intel's fDCT

Sat Nov 19 21:21:06 CET 2005

I've been trying to use Intel's fDCT from their IPP libs to see if it is 
faster than the SSE2 one in ffmpeg. I tried simply replacing this line in 
mpegvideo_mmx_template.c

    RENAMEl(ff_fdct) (block); //cant be anything else ...

with Intel's function

    ippiDCT8x8Fwd_16s_C1I( block );

Everything runs okay (and noticeably faster), but the resulting MPEG-2 video 
is a mess.

The Intel routine simply does an fDCT on an 8x8 block and writes the results 
in the same place as the original data. There is no initialisation required.

What is going on in ff_fdct_sse2() other than a pure fDCT transform, and do 
you have any tips on how I could integrate Intel's routine?

Regards

Graham