[FFmpeg-devel] [RFC] libavfilter audio API and related issues

Mon May 24 17:53:21 CEST 2010

On Sat, May 22, 2010 at 10:37:18PM -0700, S.N. Hemanth Meenakshisundaram wrote:
> On 05/02/2010 12:08 PM, Stefano Sabatini wrote:
>> On date Wednesday 2010-04-28 07:07:54 +0000, S.N. Hemanth 
>> Meenakshisundaram encoded:
>>    
>>> Stefano Sabatini<stefano.sabatini-lala<at>  poste.it>  writes:
>>>      
>>>> Here are some notes about a possible design for the audio support in
>>>> libavfilter.
>>>>
>>>> AVFilterSamples struct
>>>> ======================
>>>>
>>>> (Already defined in afilters, but renamed AVFilterBuffer at some
>>>> point.)
>>>>
>>>> A possible definition follows (with some differences with respect to
>>>> that currently implemented in afilters):
>>>>
>>>>        
>>> [...]
>>>      
>>>> Audio/video synchronization
>>>> ===========================
>>>>
>>>> Some design work has to be done to understand how request_samples()
>>>> and request_frame() can work together.
>>>>
>>>> I'm only considering ffplay for now, as it looks simpler than ffmpeg.
>>>>
>>>> Currently audio and video follow two separate paths: audio is
>>>> processed by the SDL thread through the sdl_audio_callback function,
>>>> while the video thread reads from the video queue whenever there are
>>>> video packets available and processes them.
>>>>
>>>>        
>>> Currently, the sdl audio callback gets a decoded audio buffer via the
>>> audio_decode_frame call and then seems to be doing AV sync via the
>>> synchronize_audio call.
>>>
>>> I was thinking about replacing this with the audio_request_samples
>>> function suggested above. This is similar to what happens in video. The
>>> request_samples would then propagate backwards through the audio filter
>>> chain until the input audio filter (src filter) calls audio_decode_frame
>>> to get decoded audio samples and then passes them up the filter chain
>>> for processing.
>>>      
>> Yes, something like get_filtered_audio_samples().
>>
>> Note also that the filterchain initialization is currently done in
>> video_thread(); that should be moved somewhere else.
>>
>>    
>>> Does this sound ok? Since the sdl_audio_callback will be making a
>>> synchronize_audio call only after this, any additional delay introduced
>>> by filtering would also get adjusted for.
>>>      
>> Looks (sounds?) fine, a more general solution may require a
>> synchronization filter, but I suspect this would require a significant
>> revamp of the API.
>>
>> Regards.
>>    
>
> Hi,
>
> I started off trying to make the ffplay changes required for audio 
> filtering just to get an idea of what all will be required of an audio 
> filter API. Attached is a rudimentary draft of the changes. It is merely to 
> better understand the required design and based on this I have the 
> following questions and observations about the design:
>
> 1. ffplay currently gets only a single sample back for every 
> audio_decode_frame call (even if encoded packet decodes to multiple 
> samples). Should we be putting each sample individually through the filter 
> chain or would it be better to collect a number of samples and then filter 
> them together?

You aren't talking about a sample in the sense of sizeof(int16_t)*channels,
are you?
audio_decode_frame() surely returns a frame, which is very likely more than
1 sample.
Merging several frames to reduce overhead does make sense, but where possible
this should be done at the decoder. More importantly, it must be optional, as
larger frames also mean more delay, and this may be unwanted.
For decoders/parsers for which such merging is impossible it could of course
be done in ffplay.
Also, when merging, the L1 cache size should be kept in mind and the result
should be benchmarked.
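
To make the trade-off concrete, here is a minimal sketch of how a caller might
bound the number of merged frames by both a delay budget and a buffer-size
budget. frames_to_merge() and all of its parameters are hypothetical names
invented for this illustration; nothing like it exists in ffplay or
libavfilter. Passing max_delay_ms = 0 turns merging off, which keeps it
optional.

```c
#include <assert.h>

/* Hypothetical helper (not an FFmpeg API): number of decoded frames that can
 * be merged into one buffer without exceeding either
 *   - max_delay_ms: the extra latency we are willing to accept, or
 *   - max_bytes:    a size budget, e.g. some fraction of the L1 data cache.
 * frame_bytes_per_sample is the size of one sample across all channels.
 * Always returns at least 1, i.e. "no merging". */
static int frames_to_merge(int frame_samples, int sample_rate,
                           int frame_bytes_per_sample,
                           int max_delay_ms, int max_bytes)
{
    assert(frame_samples > 0 && sample_rate > 0 && frame_bytes_per_sample > 0);
    assert(max_delay_ms >= 0 && max_bytes >= 0);

    /* Frames that fit into the delay budget. */
    long long by_delay = (long long)max_delay_ms * sample_rate /
                         (1000LL * frame_samples);
    /* Frames that fit into the size budget. */
    long long by_size  = max_bytes /
                         ((long long)frame_samples * frame_bytes_per_sample);

    long long n = by_delay < by_size ? by_delay : by_size;
    return n < 1 ? 1 : (int)n;
}
```

The right value for max_bytes is exactly the part that has to be benchmarked;
the sketch only shows where such a limit would enter the calculation.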

>
> 2. Can sample rate, audio format etc change between samples? If not, can we 
> move those parameters to the AVFilterLink structure as Bobby suggested 
> earlier? The AVFilterLink structure also needs to be generalized.

There is nothing that prevents one from creating a file where such things
change (and we try to support all files).
Thus I think it is important to support this. How seamlessly this
is done is up for discussion of course, but simply crashing, exiting or
otherwise misbehaving is not ok. Rebuilding the whole filter graph could be
considered if it's simpler.
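
As a rough sketch of the "rebuild on change" option: compare the parameters
of each incoming frame with those the graph was configured for, and
reconfigure only when they differ. AudioParams, audio_params_changed() and
update_link() are hypothetical names made up for this example, and the
actual graph teardown/rebuild is left as a comment.

```c
#include <assert.h>

/* Hypothetical parameter set; sample_fmt stands in for a sample format enum. */
typedef struct {
    int sample_rate;
    int nb_channels;
    int sample_fmt;
} AudioParams;

/* Returns 1 if the frame's parameters differ from those the link was
 * configured with, 0 otherwise. */
static int audio_params_changed(const AudioParams *link,
                                const AudioParams *frame)
{
    assert(link && frame);
    return link->sample_rate != frame->sample_rate ||
           link->nb_channels != frame->nb_channels ||
           link->sample_fmt  != frame->sample_fmt;
}

/* Adopts the new parameters when they changed. Returns 1 if a
 * reconfiguration was needed, 0 if the frame can be filtered as-is. */
static int update_link(AudioParams *link, const AudioParams *frame)
{
    if (!audio_params_changed(link, frame))
        return 0;
    /* A real player would tear down and rebuild the filter graph here
     * instead of crashing or exiting. */
    *link = *frame;
    return 1;
}
```

The check is cheap enough to run per frame, so parameters that live on the
link can still follow a stream that switches rate, format or channel count.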

>
> 3. The number of channels can also be stored in the filter link right? That 
> way, we will know how many of the data[8] pointers are valid.

I suspect 1ch, 2ch and more channels can be intermixed
in TV captures (for example when a commercial starts or stops in the
middle of a movie).

>
> 4. Do we require linesize[8] for audio? I guess linesize here would 
> represent the length of data in each channel. Isn't this already captured 
> by sample format? Can different channels ever have different datasizes for 
> a sample?

We must support planar and interleaved/packed audio formats.
linesize would simplify generic access to these samples if it were defined
as the distance, counted in samples, between sample i and sample i+1 of the
same channel (that is, == 1 for planar and == channels for packed).
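
A minimal sketch of what such generic access could look like for 16-bit
samples, assuming the stride is counted in samples as described above.
SampleView and get_sample() are hypothetical names for illustration only,
not the proposed libavfilter structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of a 16-bit audio buffer (not a libavfilter struct). */
typedef struct {
    const int16_t *data[8]; /* planar: one pointer per channel;
                               packed: only data[0] is used */
    int stride;             /* distance in samples between sample i and i+1
                               of one channel: 1 planar, nb_channels packed */
    int nb_channels;
    int planar;             /* non-zero for planar layout */
} SampleView;

/* Fetch sample i of channel ch; the loop body of a filter can use the same
 * indexing for both layouts. */
static int16_t get_sample(const SampleView *v, int ch, int i)
{
    assert(v && ch >= 0 && ch < v->nb_channels && i >= 0);
    /* Planar: start at the channel's own plane.
     * Packed: start at the channel's offset within the first sample. */
    const int16_t *base = v->planar ? v->data[ch] : v->data[0] + ch;
    return base[(size_t)i * v->stride];
}
```

With this, only the choice of base pointer depends on the layout; stepping
from one sample to the next is the same multiplication by the stride.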

Also, please avoid memcpy(), or explain why each one is unavoidable in your patch.

[...]

-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Incandescent light bulbs waste a lot of energy as heat so the EU forbids them.
Their replacement, compact fluorescent lamps, much more expensive, dont fit in
many old lamps, flicker, contain toxic mercury, produce a fraction of the light
that is claimed and in a unnatural spectrum rendering colors different than
in natural light. Ah and we now need to turn the heaters up more in winter to
compensate the lower wasted heat. Who wins? Not the environment, thats for sure