[FFmpeg-devel] [rfc] qualification task: SSE2 IDCT

Michael Niedermayer michaelni
Wed Apr 2 12:49:38 CEST 2008


On Wed, Apr 02, 2008 at 11:01:18AM +0200, Pascal Massimino wrote:
> On Sun, Mar 30, 2008 at 4:53 PM, Michael Niedermayer <michaelni at gmx.at>
> wrote:
> 
> > On Sun, Mar 30, 2008 at 02:25:15PM +0100, Balatoni Denes wrote:
> > [...]
> > > Also if Alexander gets skal to donate his code under LGPL, will it
> > > satisfy the qualification task requirement, as skal's idct iirc is in
> > > fact very similar to app note 922/945 ? :)
> >
> > Is skal's idct faster/slower/same speed/unknown relative to ap* ?
> >
> > My original idea for this qualification was to have an AP922/945 SSE2 idct
> > which combines all optimizations from all existing such IDCTs. So the
> > question about any single one being ok is not answerable. The code has to
> > be compared to see if there are any further improvements possible.
> > Also the output must be binary identical to an existing IDCT
> > (to minimize the issues with idct drift between the ever growing number
> > of idcts)
> > It's a bit of a mystery to me why alex apparently thought this was an easy
> > task.
> >
> > Now I am perfectly fine with a simple SSE2 idct. This would at least
> > skip the binary identical output problem and the work for me comparing
> > it against other IDCTs. OTOH it's harder as there is no existing SSE2 code
> > to base one's work on ...
> >
> > There's also AMD, who have promised to implement 2 things for us;
> > they have been working (for months) on an SSE float aan dct. I think they
> > might be happy if the second task would be an AP945/922 SSE2 IDCT as they
> > already have some code for that.
> 
> 
>  I think it's important to not introduce a "new" idct with a different
>  error-landscape than the ones already around (even if IEEE-1180
>  compliant). We already have the famous Walken-idct and the
>  simple-idct. A new one would cause another round of idct-mismatch

Yes, I fully agree.
Btw, the walken idct used in xvid and the walken idct by walken have
different error-landscapes.



>  problem (that's why there's only my fdct in xvid, for instance, and not
>  the idct).  This being said, as far as I recall, you can turn
>  skl_dct_sse.asm into a Walken-exact (bitwise) idct by using the following
>  rounding constants as replacement:

Thanks ...

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Breaking DRM is a little like attempting to break through a door even
though the window is wide open and the only thing in the house is a bunch
of things you don't want and which you would get tomorrow for free anyway