[Ffmpeg-devel] upsampling of subsampled video data

Michael Niedermayer michaelni
Tue Sep 12 23:10:17 CEST 2006


On Tue, Sep 12, 2006 at 09:24:36PM +0200, Attila Kinali wrote:
> On Tue, 12 Sep 2006 21:04:03 +0200
> Michael Niedermayer <michaelni at gmx.at> wrote:
> > > Most hardware does bicubic interpolation. Lanczos/sinc would be more
> > > theoretically correct but gauss is the most artifact-immune and also
> > > IMO the most theoretically correct since it's the only filter that's
> > > radially decreasing in both spatial and frequency domains. But
> > > probably anything more than bicubic is too expensive to do in
> > > hardware.
> > 
> > i think a theoretically optimal scaler would first classify each output
> > sample depending on its surrounding input somehow in one of several 
> > categories (based on edge direction and or variance for example) and then
> > apply depending on that a (possibly precomputed) optimal linear filter
> Could you elaborate on this a little bit? Especially on how you would
> do classification and why you would apply different filters
> (and if you feel like it, which filter would you apply in
> which case?).

this was intended more as a theoretical idea than as something for an
actual hw implementation, due to its complexity, but the idea is that you would
use different scalers in different parts of the image. one simple possibility
would be to interpolate along edges, so that diagonal edges don't get
interpolated by vertical + horizontal interpolation, which turns them
into blurred staircases, but are instead interpolated along the edge. for
example (from libmpcodecs/vf_yadif.c):

int spatial_pred= (c+e)>>1;
int spatial_score= ABS(cur[-refs-1] - cur[+refs-1]) + ABS(c-e)
                 + ABS(cur[-refs+1] - cur[+refs+1]) - 1;
#define CHECK(j)\
    {   int score= ABS(cur[-refs-1+j] - cur[+refs-1-j])\
                 + ABS(cur[-refs  +j] - cur[+refs  -j])\
                 + ABS(cur[-refs+1+j] - cur[+refs+1-j]);\
        if(score < spatial_score){\
            spatial_score= score;\
            spatial_pred= (cur[-refs  +j] + cur[+refs  -j])>>1;\

                        CHECK(-1) CHECK(-2) }} }}
                        CHECK( 1) CHECK( 2) }} }}

hmm, my code looks terribly obfuscated, i hope it makes sense ...
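
For readers who find the macro hard to follow: each CHECK leaves two braces open, so the "}} }}" after every pair closes them, and the farther diagonal (j = +-2) is only tried when the nearer one (j = +-1) has already improved the score. Below is a minimal standalone sketch of the same idea, not the yadif code itself. The function name edge_directed_interp and the above/below row pointers (standing in for cur[-refs] and cur[+refs]) are assumptions made for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Interpolate the missing sample at column x, lying between the row `above`
 * and the row `below` (both `width` samples long), by picking the best of
 * five directions. Hypothetical helper, loosely following the excerpt. */
int edge_directed_interp(const unsigned char *above,
                         const unsigned char *below,
                         int width, int x)
{
    /* |j| <= 2 plus the 3-sample comparison window reaches x-3 .. x+3. */
    assert(x >= 3 && x < width - 3);

    /* Vertical candidate; the -1 makes ties favour plain vertical averaging. */
    int best_pred  = (above[x] + below[x]) >> 1;
    int best_score = abs(above[x - 1] - below[x - 1])
                   + abs(above[x]     - below[x])
                   + abs(above[x + 1] - below[x + 1]) - 1;

    /* Left-leaning diagonals first, then right-leaning ones. On each side the
     * farther diagonal is only tried if the nearer one lowered the score. */
    for (int sign = -1; sign <= 1; sign += 2) {
        for (int step = 1; step <= 2; step++) {
            int j = sign * step;
            int score = abs(above[x - 1 + j] - below[x - 1 - j])
                      + abs(above[x     + j] - below[x     - j])
                      + abs(above[x + 1 + j] - below[x + 1 - j]);
            if (score >= best_score)
                break;
            best_score = score;
            best_pred  = (above[x + j] + below[x - j]) >> 1;
        }
    }
    return best_pred;
}
```

With a diagonal step edge, say above = {0,0,0,0,0,0,255,...} and below = {0,0,0,0,255,...}, this returns 0 at x = 4 and 255 at x = 5, so the edge stays sharp, whereas plain vertical averaging would give 127 at both positions.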

now the above is just a simplified special case. ideally one should classify
pixels into more than just 5 types depending on some edge detector, maybe with an
iterative algorithm like:
1. downscale the given image with a filter similar to what we expect to
   be used for our input
2. classify each pixel (initially simple edge direction + variance, or the
   dominant dct component, or ... could be used)
3. find the linear least squares filter for each class (we have the original
   image, as we downscaled it in 1.)
4. find some optimal classification function for each class (maybe using
   fisher discriminants or such)
5. goto 2

this could of course be done "offline" with some training images, or with each
image before scaling it, or with the whole video before scaling the whole video
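
Steps 1 to 3 of the list above can be sketched on a 1-D signal as follows. This is only a rough illustration under several assumptions not stated in the mail: the downscaler is plain point sampling by 2, there are just two classes chosen by a fixed contrast threshold, each filter has 4 taps, and the least squares fit is done by solving the normal equations. Steps 4 and 5 (re-learning the classifier and iterating) are left out, and all names (classify, train_filters, predict, TAPS, CLASSES) are hypothetical.

```c
#include <assert.h>
#include <string.h>

#define TAPS    4   /* low-res neighbours per prediction (assumption) */
#define CLASSES 2   /* 0 = smooth, 1 = edge (assumption) */

static double dabs(double v) { return v < 0 ? -v : v; }

/* Step 2: classify by the contrast between the two nearest low-res samples. */
int classify(const double *low, int i, double threshold)
{
    return dabs(low[i + 1] - low[i]) > threshold;
}

/* Solve a*x = b by Gaussian elimination with partial pivoting.
 * Returns 0 on success, -1 if the system is (near) singular. */
int solve(double a[TAPS][TAPS], double b[TAPS], double x[TAPS])
{
    for (int c = 0; c < TAPS; c++) {
        int p = c;
        for (int r = c + 1; r < TAPS; r++)
            if (dabs(a[r][c]) > dabs(a[p][c])) p = r;
        if (dabs(a[p][c]) < 1e-12) return -1;
        for (int k = 0; k < TAPS; k++) {
            double t = a[c][k]; a[c][k] = a[p][k]; a[p][k] = t;
        }
        double t = b[c]; b[c] = b[p]; b[p] = t;
        for (int r = c + 1; r < TAPS; r++) {
            double f = a[r][c] / a[c][c];
            for (int k = c; k < TAPS; k++) a[r][k] -= f * a[c][k];
            b[r] -= f * b[c];
        }
    }
    for (int r = TAPS - 1; r >= 0; r--) {
        double s = b[r];
        for (int k = r + 1; k < TAPS; k++) s -= a[r][k] * x[k];
        x[r] = s / a[r][r];
    }
    return 0;
}

/* Steps 1-3: decimate `orig` (n samples) into `low` (n/2 samples), classify,
 * and fit one filter per class that predicts the dropped sample orig[2i+1]
 * from low[i-1..i+2]. A class whose system is singular falls back to the
 * plain average of the two nearest samples. */
void train_filters(const double *orig, int n, double *low, double threshold,
                   double filt[CLASSES][TAPS])
{
    double ata[CLASSES][TAPS][TAPS], atb[CLASSES][TAPS];
    int m = n / 2;
    assert(n >= 8);
    memset(ata, 0, sizeof(ata));
    memset(atb, 0, sizeof(atb));
    for (int i = 0; i < m; i++)            /* step 1: crude downscale */
        low[i] = orig[2 * i];
    for (int i = 1; i + 2 < m; i++) {      /* accumulate normal equations */
        int c = classify(low, i, threshold);
        const double *v = &low[i - 1];
        for (int r = 0; r < TAPS; r++) {
            for (int k = 0; k < TAPS; k++) ata[c][r][k] += v[r] * v[k];
            atb[c][r] += v[r] * orig[2 * i + 1];
        }
    }
    for (int c = 0; c < CLASSES; c++)      /* step 3: solve per class */
        if (solve(ata[c], atb[c], filt[c]) != 0) {
            filt[c][0] = filt[c][3] = 0.0;
            filt[c][1] = filt[c][2] = 0.5;
        }
}

/* Upscaling: predict the sample halfway between low[i] and low[i+1]. */
double predict(const double *low, int i, double threshold,
               double filt[CLASSES][TAPS])
{
    const double *f = filt[classify(low, i, threshold)];
    return f[0] * low[i - 1] + f[1] * low[i]
         + f[2] * low[i + 1] + f[3] * low[i + 2];
}
```

On the training data itself, the fitted filters can do no worse (in squared error) than the plain two-sample average, since that average is one of the candidate 4-tap filters. How well they generalize to other images depends on the training material and the classifier, which is where steps 4 and 5 would come in.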


Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

In the past you could go to a library and read, borrow or copy any book
Today you'd get arrested for mere telling someone where the library is
