[FFmpeg-devel] [PATCH] avfilter/dctdnoiz: rewrite [f/i]dct

Mon Aug 4 18:09:12 CEST 2014

Hi,

On Mon, Aug 4, 2014 at 11:59 AM, Clément Bœsch <u at pkh.me> wrote:

> On Mon, Aug 04, 2014 at 11:44:43AM -0400, Ronald S. Bultje wrote:
> > Hi,
> >
> > On Mon, Aug 4, 2014 at 11:42 AM, Clément Bœsch <u at pkh.me> wrote:
> >
> > > On Mon, Aug 04, 2014 at 11:17:29AM -0400, Ronald S. Bultje wrote:
> > > > Hi,
> > > >
> > > > On Sun, Aug 3, 2014 at 4:27 PM, Clément Bœsch <u at pkh.me> wrote:
> > > >
> > > > > This removes the avcodec dependency and makes the code almost
> > > > > twice as fast. More to come.
> > > > >
> > > > > The DCT factorization is based on "Fast and numerically stable
> > > > > algorithms for discrete cosine transforms" by Gerlind Plonka &
> > > > > Manfred Tasche (DOI: 10.1016/j.laa.2004.07.015).
> > > >
> > > >
> > > > I have no comments on the patch itself, but can you explain why we're
> > > > re-implementing a custom f/idct rather than using the one provided in
> > > > lavcodec? It seems to me that going from fixedpoint/simd'ed to float/c
> > > > would be slower, not faster, so there must be more to this patch than
> > > > what I'm getting from it...
> > > >
> > >
> > > OK so as said in private, I didn't find an accurate 16x16 DCT in
> > > libavcodec ("accurate", not "JPEG" as I originally and wrongly said).
> > >
> > > You suggested to use the HEVC or VP9 DCT. That's indeed one solution,
> > > but we currently have only IDCT for those (AFAIK), and I needed a float
> > > implementation.
> >
> >
> > You mean forward. idct is inverse, fdct is forward, such that
> > idct(fdct(data[][])) =~ data[][].
>
> Yeah sure I meant we have only the IDCT and I also needed the FDCT. The
> "float implementation" was another point.
>
> > You can use the forward transforms
> > provided in libvpx (for vp9) or x265 (hevc), they're quite precise, and
> > already optimized.
>
> Yeah so basically I would have to maintain that port instead of my
> implementation, which doesn't look ideal either (the point of using
> existing code in FFmpeg is that its maintenance would have been shared).

Right, but it means one half (idct) is fully shared, with optimizations
etc. - and only one half is unshared-but-forked-with-optimizations.

Whereas right now, it's fully unshared and unoptimized...

I agree the 3x3 makes it a little more tricky, so do whatever you feel is
right; we don't want to have to convert datatypes 4x just so we can fit an
integer fdct/idct pair between an otherwise full float chain. If the
overall ultimate goal is for everything to be int, it makes sense, but I
don't know what the plan is.

Ronald