[FFmpeg-devel] [RFC] Affine video transformation filter

Stefano Sabatini stefasab at gmail.com
Thu Jul 5 13:23:56 CEST 2012

On date Wednesday 2012-07-04 13:08:56 -0600, Michael Bradshaw encoded:
> I've been thinking a bit about adding an affine video transformation
> filter, but I could use some feedback on this.

> Originally, I just wanted to add a rotation video filter. There are
> rotting patches on the mailing list for this, but I thought I might
> start from scratch. While I'm really curious to see what kind of
> results Dartmouth's 1996 paper "High Quality Alias Free Image
> Rotation" [1] provides, I decided a simple rotation matrix (with
> nearest, bilinear, etc. sampling/interpolation methods) would be a
> good place to start and could be used as a baseline comparison to
> other rotation algorithms. And if I'm using a rotation matrix, I might
> as well just do a proper affine transformation matrix and provide
> additional operations.

> However, I'm not sure how to best handle the overlapping functionality
> of affine transformations and existing filters. We already have
> transpose, scale, v/hflip (and maybe others?) video filters, which
> makes reflection and scaling with an affine transform redundant.
> Additionally, I'm thinking of how potential future filters (maybe an
> individual rotation filter that uses a particular algorithm) could
> overlap with this. It's this redundancy that I'm trying to decide how
> to handle, and where I really need the most input on.

While an affine transform would cover many existing, more specialized
filters, it would generally be less efficient *and* offer far fewer
options (consider, for example, the many options and formats supported
by libswscale).

So it is fine to have more specialized filters, provided that they're
simpler, more efficient, and easier to use than a more generalized
filter.

> Individual filters (like the scale and transpose filters) have the
> advantage that they can be optimized for speed and quality. The
> general purpose affine transformation filter would have the advantage
> (aside from providing new functionality like rotation and shearing)
> that it could combine several operations into one filter and perform
> all of them in a single pass (so a
> vflip->rotate->translate->scale->crop filter could just become
> affine->crop). I don't know how realistic it is that someone would
> apply all those affine transformations to one video, however.

> What are your thoughts on the overlapping functionality? Should some
> note along the lines of "if you are applying only a single
> affine/linear transformation with an existing corresponding filter,
> consider using that one filter, as it is likely optimized in its speed
> and quality for that transformation" be put in the docs or something?
> Should overlapping functionality be removed (i.e. disable scaling)
> from the affine transformation filter? I'm leaning towards the former
> rather than the latter.


General observations may be moved to the libavfilter.texi file (which
I plan to merge into filters.texi).

> While I've got your attention, what seems like a sane way to pass
> arguments to an affine transformation filter? I'm thinking something
> along the lines of
> "affine=scale=1.1\,1.3:rotate=15:translate=4\,10:shear=1.2\,2:reflect=-1\,0:sampling=bilinear"
> for example where order matters and options can be repeated (and maybe
> allowing "affine=matrix=x11:x12:x13:x21:x22:x23:x31:x32:x33"). I

If you want my opinion, we should support matrix parsing (MATLAB
notation may do); then you can wrap more specialized filters
(e.g. rotate=theta) by composing the matrix. Matrix parsing may be
useful in ffmpeg.c and for other filters (e.g. a generic convolution
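
As a rough idea of what such parsing could look like, here is a
minimal sketch for a fixed 3x3 matrix written as "[a b c; d e f; g h i]".
parse_mat3 is a hypothetical name, nothing like it exists in the tree
as far as this sketch is concerned, and a real version would have to
deal with option escaping and arbitrary sizes.

```c
#include <ctype.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse "[a b c; d e f; g h i]" into out[9],
 * row-major. Columns are separated by spaces and/or commas, rows by ';'.
 * The separator handling is deliberately lenient.
 * Returns 0 on success, -1 on malformed input.
 */
static int parse_mat3(const char *s, double out[9])
{
    while (isspace((unsigned char)*s))
        s++;
    if (*s++ != '[')
        return -1;

    for (int row = 0; row < 3; row++) {
        for (int col = 0; col < 3; col++) {
            char *end;

            /* skip column separators */
            while (isspace((unsigned char)*s) || *s == ',')
                s++;
            out[3 * row + col] = strtod(s, &end);
            if (end == s)
                return -1;      /* no number at this position */
            s = end;
        }
        while (isspace((unsigned char)*s))
            s++;
        if (row < 2 && *s++ != ';')
            return -1;          /* rows must be separated by ';' */
    }

    if (*s++ != ']')
        return -1;
    while (isspace((unsigned char)*s))
        s++;
    return *s ? -1 : 0;         /* reject trailing garbage */
}
```

A rotate=theta wrapper would then not need the parser at all: it would
just fill the same nine values from cos/sin of theta.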

> haven't thought of a way to specify the background fill color that I
> like, though.

Do you mean you don't know which *syntax* to use, or how to implement
it? For the syntax I suppose we should go with named options; for the
implementation we have the internal drawutils, which should help.

Also check for ideas:

- my rotate filter (supports fill color and dynamic rotation
  controlled by an expression, integer-only processing -> fast, but
  it had a fairly serious problem, so it was never committed)
- the mp=perspective filter
- libavfilter/transform.h which already implements an API for affine
  transforms (used in deshake)

> Your thoughts are appreciated.

Note also that I'm personally fine with a more specialized filter
(e.g. a rotate filter) if that's your objective, since I know that
designing a generic filter which is both fast and flexible is much
FFmpeg = Faithful & Fanciful Mere Practical Ecumenical Genius
