[FFmpeg-devel] [GSoC] Qualification Task

Stefano Sabatini stefano.sabatini-lala at poste.it
Tue Mar 22 16:35:39 CET 2011


On date Tuesday 2011-03-22 13:32:42 +0200, Mina Nagy Zaki encoded:
> On Tuesday 22 March 2011 12:28:10 Peter Ross wrote:
> > On Tue, Mar 22, 2011 at 09:54:02AM +0200, Mina Nagy Zaki wrote:
> > > Hi,
> > > I've ported a small filter from sox, the 'earwax' filter, essentially a
> > > stereo widening effect. It's a relatively simple filter, I was actually
> > > looking to write a general sox wrapper but turns out there would be a
> > > few things that would make this slightly difficult.
> > 
> > Can you elaborate on this, just curious.
> 
> There are two options: 
> 1. Allow for specification of multiple effects and create a sox effect chain. I 
> will not be able to use the public api as sox goes through an init stage then 
> sox_flow_effects() passes all the samples through from input to output. I would 
> have to feed samples into individual effects, in essence rewriting (or adapting 
> :) the sox_flow_effects() and helper functions it uses. Otherwise, a blocking 
> thread and a custom input effect could be used.
> 2. Allow only one sox effect per libavfilter filter.
>  
> In both cases I can't seem to figure out how I can find the end of stream, which 
> is important for some effects as they need to 'drain' (like the echo
> effect). 

I suppose you read AVERROR_EOF from avfilter_request_frame() when you
reach the end of a stream. Note that I'm a bit rusty with libavfilter
audio, as it has been some months since I last touched that code and my
memory is not great ;-), but I remember there was a problem with some
filters which needed that (in particular a video concatenation filter).

> This actually leads to a larger discussion of the libavfilter audio API in 
> general. We have the filter_samples() callback specifically for audio effects, 
> while video uses the request_frame() callback. Audio filters get samples 
> 'pushed' to them, while video filters AIUI pull frames from previous filters 
> (note: I haven't examined video filtering closely). Moreover, request_frame() 
> is actually used by the audio source filter, which then initiates the chain 
> of filter_samples() calls.
>
> IMHO a unified API that pulls samples rather than pushes them would be 
> better, so that each effect can decide when to drain (for instance, when it 
> requests samples and doesn't get any). Sox uses something similar to the 
> current FFmpeg setup, pushing samples through one callback and then calling a 
> different callback for draining.

Check cmdutils.c:get_filtered_audio_buffer(); it implements a pull
model, correct me if I'm wrong.
 
> Another major problem is what you asked at the very end: more than two 
> channels in the widen effect. I was actually planning to send a separate email 
> about this: Channel Negotiation. Not all effects can handle any number of 
> channels, and there's no way to do anything about it (except fail loudly) in 
> the current API.

> Again looking at sox, it does this by having effects declare 
> whether or not they can handle multiple channels. If they can, it simply 
> sends in the entire stream. If they can't, it creates an effect instance 
> for each channel, splits the stream, and 'flows' samples of one channel into 
> each of the effect instances. IMHO we _could_ improve on this.

I suppose the lavfi way would be to let the various filters declare
the channel layouts they support during the init stage, and
auto-insert an aconvert filter when needed.  For example you could
have:

aconvert=s16:mono, widen
-> an aconvert is automatically inserted before widen, giving:
aconvert=s16:mono, aconvert=s16:stereo, widen

...

BTW, thinking about it: since this is a port, I think it would be better
to use the same name as the original effect, so it's easier for people
to spot the correspondence between the two.
-- 
FFmpeg = Funny Fabulous Mean Picky Encoding/decoding Guru


