[FFmpeg-devel] soundtouch filter?

Pavel Koshevoy pkoshevoy at gmail.com
Fri Jun 1 06:44:55 CEST 2012

On 05/31/2012 08:37 AM, Nicolas George wrote:
> On Primidi 11 Prairial, year CCXX, Pavel Koshevoy wrote:
>> I have some concerns regarding timestamps.  In my opinion this
>> filter should not change the timestamps.
> Timestamps should be, as far as possible, consistent with playback duration.
> If the filter is set up to change the pitch and not the speed, the
> timestamps should be unchanged. If the filter is set up to change the speed,
> the timestamps should be scaled accordingly, IMHO.

This filter will consume N samples and output N/tempo samples, without affecting 
pitch.  So, at 0.5 tempo twice as many samples are output.  I am not planning on 
implementing pitch shifting in this filter.
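
To make the sample arithmetic concrete, here is a minimal sketch.  The function
name is hypothetical and this is not the filter's code; a real time-stretcher
buffers internally, so the count only holds on average over a long stream.

```python
def output_sample_count(n_input: int, tempo: float) -> int:
    """Approximate number of samples produced for n_input consumed samples.

    Assumption: tempo > 0, where 1.0 means unchanged speed.
    """
    if tempo <= 0:
        raise ValueError("tempo must be positive")
    # N samples in, N / tempo samples out; pitch is untouched.
    return round(n_input / tempo)


# Half tempo doubles the sample count, double tempo halves it.
print(output_sample_count(1024, 0.5))  # 2048
print(output_sample_count(1024, 2.0))  # 512
```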

If only the audio timestamps are adjusted, audio will no longer be in sync with
video and subtitles.  This may be intentional (perhaps the user is trying to
correct a lip-sync error caused by a telecine conversion).

However, if the user's intention is to slow down or speed up everything (not
just the audio) and keep audio/video sync, then the timestamps of all streams
need to be transformed according to the current tempo.  Frankly, I am not yet
well enough versed in the avfilter APIs to know how to do that -- I am still
trying to figure out what an AVFilterPad is.

Regarding my preference to not change the timestamps -- I am simply concerned
about the effect that this would have on the video player time-line if the user
varies the tempo continuously during playback.

As far as I can see there are only a couple of options available for updating 
the timestamps:

1. Timestamps can be calculated relative to the previous frame.  This ensures
that the transformed timestamps remain monotonically increasing.  However, it
complicates the calculation of time-line duration and seeking, because the
timestamp mapping function is no longer linear.

2. Timestamps can be calculated from the beginning of the stream.  This means
the entire time-line is linearly transformed according to the current tempo.
This also means that timestamps will not be monotonically increasing if the
user jumps from 0.5 tempo to 2.0 tempo, because the same source position then
maps to an earlier output time.  This is a problem for audio/video renderers
that expect monotonically increasing timestamps.  If a renderer sees a
timestamp from the past (as far as the renderer is concerned) it may simply
drop the "stale" frame.

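The difference between the two options can be sketched as follows.  This is
only an illustration under my own assumptions (times in seconds, one tempo
value per frame, hypothetical function names); it is not avfilter code.

```python
def incremental_timestamps(durations, tempos):
    """Option 1: each output timestamp is the previous output timestamp
    plus the previous frame's duration scaled by that frame's tempo."""
    out = []
    t = 0.0
    for duration, tempo in zip(durations, tempos):
        out.append(t)
        t += duration / tempo
    return out


def origin_timestamps(durations, tempos):
    """Option 2: each output timestamp is the source timestamp, measured
    from the start of the stream, divided by the current tempo."""
    out = []
    source_t = 0.0
    for duration, tempo in zip(durations, tempos):
        out.append(source_t / tempo)
        source_t += duration
    return out


# Four 1-second frames; the user jumps from 0.5 tempo to 2.0 tempo.
durations = [1.0, 1.0, 1.0, 1.0]
tempos = [0.5, 0.5, 2.0, 2.0]

print(incremental_timestamps(durations, tempos))  # [0.0, 2.0, 4.0, 4.5]
print(origin_timestamps(durations, tempos))       # [0.0, 2.0, 1.0, 1.5]
```

With option 2 the third frame lands at 1.0 after a frame at 2.0, which is the
backward jump a renderer would treat as stale.
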
Given these options, in my player, instead of changing the timestamps, I added
a tempo field to the decoded frame.  This way the time-line duration and
seeking are not affected at all, and the video/audio renderers can trivially
adjust the frame duration and maintain synchronization.
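
A rough sketch of that idea, with hypothetical names (DecodedFrame and
presentation_duration are made up for illustration and are not taken from my
player or from FFmpeg):

```python
from dataclasses import dataclass


@dataclass
class DecodedFrame:
    pts: float          # source time-line timestamp in seconds, never rewritten
    duration: float     # duration on the source time-line, in seconds
    tempo: float = 1.0  # tempo in effect when the frame was decoded


def presentation_duration(frame: DecodedFrame) -> float:
    """How long a renderer should show or play the frame in wall-clock time."""
    return frame.duration / frame.tempo


# A 40 ms video frame at half tempo stays on screen for 80 ms, while its
# pts still refers to the untouched source time-line used for seeking.
frame = DecodedFrame(pts=10.0, duration=0.04, tempo=0.5)
print(presentation_duration(frame))
```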

Thank you,
