[Libav-user] Some encoding A/V synchronization questions

Ulf Magnusson ulfalizer at gmail.com
Sat Nov 16 11:02:06 CET 2013

On Thu, Nov 14, 2013 at 5:54 AM, Andy Shaules <bowljoman at gmail.com> wrote:
> On 11/13/2013 7:40 PM, Ulf Magnusson wrote:
>> Hi,
>> I'm adding a movie recording feature to an emulator using libav. I've
>> got video (x264), audio (Android VisualON AAC), and muxing (mp4)
>> working, but since I make no special efforts to keep audio and video
>> synchronized, they desynchronize after a few minutes. My questions are
>> as follows:
>> 1. I believe I will need to derive PTS values for the audio and video
>> streams, diff them, and compensate by e.g. changing the audio
>> resampling rate when they drift apart. For audio, the PTS values from
>> the AVStream seem reliable, but not for video (they seem to lag
>> behind, perhaps due to buffering (?)). Is it safe to simply use the
>> PTS values I write into the AVFrame for video frames? Is there some
>> other field I could use?
>> 2. Why does ffmpeg.c assign DTS values to PTS values in some locations,
>> e.g.
>> ist->next_pts = ist->pts = av_rescale_q(pkt->dts, ist->st->time_base,
>> 3. Is there some simpler solution specific to x264/AAC/mp4?
>> Thanks,
>> Ulf
>> _______________________________________________
>> Libav-user mailing list
>> Libav-user at ffmpeg.org
>> http://ffmpeg.org/mailman/listinfo/libav-user
> Hello,
> Mostly there is no guarantee as to how hardware will provide samples up
> front. The variables are:
> A: One will arrive first with samples available.
> B: One stream will have timestamps earlier than the other, and it may be
> either one of (A).
> I buffer both channels until I have overlap.
> The general rule is 'write audio first'.
> The routine I run is this:
> After compressing the raw samples, I have two FIFO queues, one for aud and
> one for vid. This is what I typically run for static video frame rates.
> ..
> 1) Is there an audio packet? Yes, goto 2.
> 2) Is there video? Yes, goto 3.
> 3) Is this the first time you have both channels? Yes, goto 4; no, goto 5.
> 4) Consider dropping samples. Do you have lots of audio without video or
> lots of video before audio?
> 5) Write all audio packets up to the earliest video frame time, and then
> write that one video frame to the container.
> 6) Repeat. Goto 1.
> Short answer...
> In general, you need to buffer and mux those timestamps in ascending order
> from either media type, writing them to your container as soon as they line
> up within the range of your camera interval, and dumping early media until
> they do line up. Don't mess with the timestamps.
> have fun!
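
For anyone reading this in the archive, here is a minimal sketch of the
write loop from step 5 of the routine quoted above (audio first, then one
video frame). It is not Andy's actual code. The Packet struct, the
microsecond time base, and the name drain_interleaved are all hypothetical,
and step 4 (dropping early media) is left out.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <vector>

// Hypothetical stand-in for an encoded packet; a real muxer would carry an
// AVPacket here, with timestamps already rescaled to one common time base.
struct Packet {
    char kind;       // 'A' for audio, 'V' for video
    int64_t pts_us;  // timestamp in microseconds (assumed common time base)
};

// Moves packets from the two FIFOs into `out` in ascending timestamp order.
// On a tie the audio packet goes first ("write audio first"). The loop stops
// as soon as either queue is empty, so a packet is only written once we know
// nothing earlier from the other stream is still queued behind it.
void drain_interleaved(std::deque<Packet>& audio, std::deque<Packet>& video,
                       std::vector<Packet>& out) {
    while (!audio.empty() && !video.empty()) {
        std::deque<Packet>& next =
            (audio.front().pts_us <= video.front().pts_us) ? audio : video;
        out.push_back(next.front());
        next.pop_front();
    }
}
```

Calling drain_interleaved() each time a new packet is queued keeps the
output ordered without touching any timestamps. As far as I know,
av_interleaved_write_frame() in libavformat does comparable buffering
internally, so this matters mostly when writing with av_write_frame().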

After some more digging through ffmpeg.c I ended up modifying the
resampling rate to keep audio and video in sync (similar to what the
-async option does). The code is at
https://github.com/ulfalizer/nesalizer/blob/master/movie.cpp in case
anyone is interested. It's written to work out of the box on Ubuntu
13.10, and so uses a few deprecated APIs.
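
The general idea of nudging the resampling rate can be sketched as below.
This is an illustration under my own assumptions, not a copy of what
movie.cpp or ffmpeg.c does: the function names, the microsecond PTS inputs,
the correction window, and the clamp are all hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// How many samples the audio stream is ahead of (+) or behind (-) the video
// stream, with both stream positions given in microseconds.
int64_t drift_in_samples(int64_t audio_pts_us, int64_t video_pts_us,
                         int sample_rate) {
    assert(sample_rate > 0);
    return (audio_pts_us - video_pts_us) * sample_rate / 1000000;
}

// Output rate to request from the resampler so that the drift is removed
// over roughly `window_seconds`. The relative change is clamped to
// +/- max_change so the pitch shift stays small. If audio is ahead, the
// rate goes down (fewer samples produced); if behind, it goes up.
double compensated_rate(int sample_rate, int64_t drift_samples,
                        double window_seconds, double max_change) {
    assert(sample_rate > 0 && window_seconds > 0.0 && max_change >= 0.0);
    const double drift_seconds =
        static_cast<double>(drift_samples) / sample_rate;
    const double change =
        std::max(-max_change,
                 std::min(max_change, -drift_seconds / window_seconds));
    return sample_rate * (1.0 + change);
}
```

With 44100 Hz audio that is 20 ms ahead of video, a 10 second window, and a
0.5% clamp, this asks for a rate slightly below 44100 Hz. With
libswresample, the sample delta could presumably be passed to
swr_set_compensation() instead of recreating the resampler with a new rate.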

