[Ffmpeg-devel] Embedded preview vhook/voutfb.so /dev/fb

Rich Felker dalias
Wed Mar 28 22:06:47 CEST 2007


On Wed, Mar 28, 2007 at 09:44:25PM +0200, Michael Niedermayer wrote:
> > one thing i didn't mention is that picture filters could indicate that
> > they operate only on certain planes, or that the same operation needs
> > to be performed on each plane. so for planar-to-planar scaling,
> > swscale could in principle even be factored into separate scalers for
> > each plane. not sure if this is desirable or not. i'm more trying to
> > indicate the expressive power of the design rather than suggest a
> > preferred implementation for swscale.
> 
> i think the most important part (for soc, and actually not just that)
> is to keep the task small. rewriting swscale, which is full of asm, is
> not an option; rewriting decoders for a pure pull slice architecture is
> likely not an option either, though optional support for slice pull
> might be ...

i never proposed anything pull-based. my filter api design is
independent of whether the caller wants to use a push model or a pull
model.
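a sketch of what i mean, with hypothetical names (this is not the
actual ffmpeg api, just an illustration): the filter exposes a single
slice entry point and keeps no opinion about who calls it. a push-style
driver invokes it as decoded slices arrive; a pull-style driver invokes
the very same function from inside its request loop.

```c
#include <string.h>

/* hypothetical slice-filter interface: the entry point is model-agnostic */
typedef struct SliceFilter {
    /* process `h` lines starting at line `y`; in/out are arrays of
     * per-line pointers, `width` is the line width in bytes */
    void (*filter_slice)(struct SliceFilter *f,
                         unsigned char **out, unsigned char **in,
                         int y, int h, int width);
    void *priv;                 /* per-filter private state */
} SliceFilter;

/* example filter: invert each byte; it neither knows nor cares whether
 * the caller is pushing slices down or pulling them up */
static void invert_slice(struct SliceFilter *f,
                         unsigned char **out, unsigned char **in,
                         int y, int h, int width)
{
    (void)f;
    for (int i = 0; i < h; i++)
        for (int x = 0; x < width; x++)
            out[y + i][x] = 255 - in[y + i][x];
}
```

either driving model just ends up as a different sequence of
filter_slice() calls, possibly with different slice heights and orders.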

> > as far as context, i don't think it hurts to lie that you need more
> > context than you actually do. it will just defer a line or two from
> > being processed during one slice until the next slice. it won't cause
> > any duplicate processing so the only performance penalties that could
> > exist would be cache-related. hopefully any such penalties would be
> > extremely small.
> 
> for cubic interpolation you need 4 lines; now, for 4 chroma lines in yv12
> you need 8 luma lines if you don't handle luma and chroma context separately.
> that's twice as much as what is really needed ...
> 
> and slices can be small, huffyuv and similar codecs can provide 1 line
> slices

calling lots of code once per line would probably make performance
worse due to instruction cache and branch prediction issues. the idea
is that the caller (filter manager) could defer filtering until a
sufficient amount of input is available, where "sufficient" could be
tuned per-arch or locally. :)
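to make the deferral idea concrete, here is a sketch (hypothetical
names, not real ffmpeg code): the manager accepts 1-line slices as a
huffyuv-style decoder would deliver them, but only invokes the actual
filter once a tunable number of lines has accumulated, amortizing the
per-call overhead.

```c
#define DEFER 16   /* deferral threshold; could be tuned per-arch */

typedef struct {
    int pending;    /* lines received but not yet filtered */
    int filtered;   /* total lines already handed to the filter */
    int calls;      /* number of times the filter was invoked */
} DeferState;

/* stand-in for the real filter invocation on `nlines` buffered lines */
static void run_filter(DeferState *s, int nlines)
{
    s->filtered += nlines;
    s->calls++;
}

/* called once per incoming 1-line slice */
static void push_line(DeferState *s)
{
    if (++s->pending >= DEFER) {
        run_filter(s, s->pending);
        s->pending = 0;
    }
}

/* called at end of frame to drain any remainder */
static void flush(DeferState *s)
{
    if (s->pending) {
        run_filter(s, s->pending);
        s->pending = 0;
    }
}
```

with DEFER = 16, a 100-line frame delivered as 1-line slices costs 7
filter invocations instead of 100.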

> > > box blur and similar filters are trivial to implement with
> > > libmpcodec slices, but they need slices to be in a sensible order
> > 
> > they should work with any order as long as you have the sufficient
> > context, no? i don't see why they depend on order at all..
> 
> 1D case of boxblur:
> tmp += in[i] - in[i-100];
> out[i]= tmp/100;
> 
> with slices in the correct order this is fairly easy; with the wrong order
> your tmp variable will have to be rebuilt, which needs time and a lot ...

With a radius of 100, yes... Typical radius would be something like
3-5. I get your point, but I'm not sure it has practical impact.
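for reference, the running-sum trick from your snippet, with the radius
generalized (and the warm-up edge handled by averaging over the partial
window). the accumulator is exactly the order-dependent state in
question: out-of-order slices would force rebuilding `sum` at every
discontinuity, which is cheap for a radius of 3-5 and expensive for 100.

```c
/* 1D box blur via an incrementally maintained window sum:
 * O(1) work per sample, but inherently sequential in i */
static void boxblur_1d(const int *in, int *out, int n, int r)
{
    int sum = 0;
    for (int i = 0; i < n; i++) {
        sum += in[i];                 /* sample entering the window */
        if (i >= r)
            sum -= in[i - r];         /* sample leaving the window */
        out[i] = sum / (i < r ? i + 1 : r);  /* mean of current window */
    }
}
```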

BTW, a caller would probably do well to avoid submitting lots of small
slices when context is large.

Rich
