[FFmpeg-devel] [PATCH] MMX implementation of VC-1 inverse transforms
Mon Jan 14 21:33:59 CET 2008
On Mon, 14 Jan 2008, Ivan Kalvachev wrote:
> - Why did you choose to transpose at all? Just to save time and effort?
> It is usual to have separate versions of the SIMD code depending on
> whether they work on rows or columns. The row and column stages are
> different, and you pass the differences as parameters.
Who says it's usual? A transposed scantable with a column/transpose/column
transform is faster than a row/column transform for the iDCT and iHCT, and I
have no reason to doubt that this applies to VC-1's transform as well.
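
To make the equivalence concrete, here is a minimal scalar sketch in C. It is not the patch's MMX code. The function names, the 4x4 block size, and the unrounded 4-point kernel with coefficients 17, 22 and 10 are assumptions chosen for illustration, and the kernel is not claimed to be bit-exact with the standard. It only shows that a column pass, a transpose and a second column pass on coefficients stored transposed give the same output as a row pass followed by a column pass on the untransposed coefficients.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Illustrative 4-point integer butterfly. The coefficients and the missing
 * rounding/shift are assumptions of this sketch, not a bit-exact kernel. */
static void kernel4(const int32_t in[4], int32_t out[4])
{
    int32_t e0 = 17 * (in[0] + in[2]);
    int32_t e1 = 17 * (in[0] - in[2]);
    int32_t o0 = 22 * in[1] + 10 * in[3];
    int32_t o1 = 10 * in[1] - 22 * in[3];

    out[0] = e0 + o0;
    out[1] = e1 + o1;
    out[2] = e1 - o1;
    out[3] = e0 - o0;
}

/* Apply the kernel to every row of the block. */
static void pass_rows(int32_t b[4][4])
{
    for (int i = 0; i < 4; i++) {
        int32_t tmp[4];
        kernel4(b[i], tmp);
        memcpy(b[i], tmp, sizeof(tmp));
    }
}

/* Apply the kernel to every column. With SIMD, each register holds one row,
 * so a single pass of element-wise operations covers all four columns. */
static void pass_cols(int32_t b[4][4])
{
    for (int j = 0; j < 4; j++) {
        int32_t col[4], tmp[4];
        for (int i = 0; i < 4; i++)
            col[i] = b[i][j];
        kernel4(col, tmp);
        for (int i = 0; i < 4; i++)
            b[i][j] = tmp[i];
    }
}

/* In-place 4x4 transpose. */
static void transpose4(int32_t b[4][4])
{
    for (int i = 0; i < 4; i++) {
        for (int j = i + 1; j < 4; j++) {
            int32_t t = b[i][j];
            b[i][j] = b[j][i];
            b[j][i] = t;
        }
    }
}

/* Returns 1 when both schemes produce the same output block. */
int schemes_agree(int32_t coeffs[4][4])
{
    int32_t a[4][4], t[4][4];

    for (int i = 0; i < 4; i++) {
        for (int j = 0; j < 4; j++) {
            a[i][j] = coeffs[i][j];   /* normal storage order */
            t[j][i] = coeffs[i][j];   /* models a transposed scantable */
        }
    }

    pass_rows(a);                     /* row/column */
    pass_cols(a);

    pass_cols(t);                     /* column/transpose/column */
    transpose4(t);
    pass_cols(t);

    return memcmp(a, t, sizeof(a)) == 0;
}

/* Small usage example on an arbitrary block. */
void schemes_self_check(void)
{
    int32_t c[4][4] = {
        { 12, -3,  5,  0 },
        {  7,  1, -2,  4 },
        { -6,  9,  0,  3 },
        {  2, -8, 11, -1 },
    };
    assert(schemes_agree(c));
}
```

The transposed storage costs nothing at decode time if the scantable already writes coefficients in that order, which is why only column-style passes (plus one transpose) are needed.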
The only benefit of row/column is that pmaddwd adds a little precision
compared to a pure 16-bit column transform. But that applies only to an
integer approximation of a real DCT, not when the standard has already
fixed the 16-bit approximation.
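
One way to read the precision remark, sketched in scalar C under stated assumptions: the helpers below are hypothetical scalar models of a single pmaddwd lane (two signed 16x16 products summed at 32-bit width) and a single pmulhw lane (high 16 bits of a signed 16x16 product). The fixed-point constant 16384 (0.25 scaled by 2^16) is an arbitrary example, not a coefficient from any codec. Inputs are kept non-negative so the right shifts are well defined.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of one pmaddwd lane: both products and their sum are 32-bit. */
static int32_t model_pmaddwd(int16_t a0, int16_t a1, int16_t c0, int16_t c1)
{
    return (int32_t)a0 * c0 + (int32_t)a1 * c1;
}

/* Scalar model of one pmulhw lane: only the high 16 bits of the product
 * survive, so each term is truncated before anything is added. */
static int16_t model_pmulhw(int16_t a, int16_t c)
{
    return (int16_t)(((int32_t)a * c) >> 16);
}

/* Accumulate at 32-bit width, then scale down once. */
int32_t dot_wide(int16_t a0, int16_t a1, int16_t c0, int16_t c1)
{
    return model_pmaddwd(a0, a1, c0, c1) >> 16;
}

/* Scale every product down to 16 bits first, then add. */
int32_t dot_narrow(int16_t a0, int16_t a1, int16_t c0, int16_t c1)
{
    return model_pmulhw(a0, c0) + model_pmulhw(a1, c1);
}

/* 0.25 * 3 + 0.25 * 5 is exactly 2; the narrow path loses the fractions. */
void precision_self_check(void)
{
    assert(dot_wide(3, 5, 16384, 16384) == 2);
    assert(dot_narrow(3, 5, 16384, 16384) == 1);
}
```

The difference only arises because the constants approximate real-valued factors. When a standard specifies the integer arithmetic exactly, a conforming 16-bit implementation has no such rounding slack to recover.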