[Ffmpeg-devel] Audio/Video Syncing...

Ryan Martell rdm4
Sat Oct 14 04:15:20 CEST 2006

On Oct 13, 2006, at 7:06 PM, Michael Niedermayer wrote:

> Hi
> On Fri, Oct 13, 2006 at 05:58:57PM -0500, Ryan Martell wrote:
>> Hi--
>> I have implemented the rtsp streaming over h264.  Everything works
>> against Darwin Streaming Server (packetization mode 1 & 2).  The only
>> problem is that I can't figure out what I'm doing wrong on getting
>> the audio/video to sync.
>> I have called this on the stream:
>>     av_set_pts_info(stream, 33, 1, 90000)
>> And 90kHz is the standard for h264 streaming.
>> When I let ffmpeg figure out the frame rate, sometimes it shows up as
>> 10 fps(r), and other times it shows up at 29.97 fps(r).
>> If I hardcode the fps (num=30000, and den= 1001), it will still
>> sometimes playback at 10.
> well the question here is what is the real framerate of the stream
> or in other words how many frames do you pass to libavformat per
> second? and what is the difference between the pts values of the
> frames you pass to lavf

Well, I'm passing them as fast as I get them, which I suspect is  
close to 30fps.  The source movie is 30fps, and occasionally  
(probably due to ffplay's a/v syncing stuff) it plays faster than 30  

I'm wondering if the frame rate randomness is some sort of artifact  
of when the rtcp packet shows up?  Again, the stream info (with the  
fps) doesn't show up before some delay (which I haven't fully  
investigated yet).  Maybe my frame rate is affected by the arrival of
the rtcp packet?  (I'm going to try to discard all packets that  
arrive before the rtcp packet shows up, and see what happens).
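
For reference, with the 1/90000 time base set by av_set_pts_info, the expected pts step per frame follows directly from the frame rate. A minimal sketch (ticks_per_frame is a made-up helper name for illustration, not an ffmpeg function):

```c
#include <stdint.h>

/* Hypothetical helper: number of time-base ticks between two frames,
 * for a frame rate of fps_num/fps_den in a time base of 1/tb_den. */
static int64_t ticks_per_frame(int64_t tb_den, int64_t fps_num, int64_t fps_den)
{
    /* One frame lasts fps_den/fps_num seconds, i.e. tb_den*fps_den/fps_num ticks. */
    return tb_den * fps_den / fps_num;
}
```

At 90 kHz this gives 3003 ticks for 30000/1001 fps, 3000 for 30 fps, and 9000 for 10 fps. Printing the difference between consecutive video pts values and comparing it against these numbers should show which rate the timestamps actually imply.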

Here's an output with DEBUG_SYNC defined, at the point that it  
received an RTCP packet...

frame_type=P clock=5240.489 pts=5240.489
audio: delay=0.023 clock=13.026 pts=13.003
audio: delay=0.023 clock=13.050 pts=13.026
audio: delay=0.023 clock=13.073 pts=13.050
audio: delay=0.023 clock=13.096 pts=13.073
video: delay=0.100 actual_delay=0.104 pts=5240.522 A-V=-5227.426223
frame_type=B clock=0.031 pts=0.031
audio: delay=0.023 clock=13.119 pts=13.096
audio: delay=0.023 clock=13.142 pts=13.119
audio: delay=0.023 clock=13.166 pts=13.142
audio: delay=0.023 clock=13.189 pts=13.166
video: delay=0.100 actual_delay=0.098 pts=5240.489 A-V=-5227.299966
frame_type=P clock=-0.002 pts=-0.002
audio: delay=0.023 clock=13.212 pts=13.189
audio: delay=0.023 clock=13.235 pts=13.212
audio: delay=0.023 clock=13.259 pts=13.235
audio: delay=0.023 clock=13.282 pts=13.259
video: delay=0.100 actual_delay=0.102 pts=0.031 A-V=13.250436
audio: delay=0.023 clock=13.305 pts=13.282
frame_type=B clock=0.098 pts=0.098  222KB sq=    0B
audio: delay=0.023 clock=13.328 pts=13.305
audio: delay=0.023 clock=13.351 pts=13.328
audio: delay=0.023 clock=13.375 pts=13.351
audio: delay=0.023 clock=13.398 pts=13.375
video: delay=0.100 actual_delay=0.096 pts=-0.002 A-V=13.399914
frame_type=P clock=0.065 pts=0.065
audio: delay=0.023 clock=13.421 pts=13.398
audio: delay=0.023 clock=13.444 pts=13.421
audio: delay=0.023 clock=13.468 pts=13.444
audio: delay=0.023 clock=13.491 pts=13.468
video: delay=0.100 actual_delay=0.098 pts=0.098 A-V=13.392660
frame_type=B clock=0.165 pts=0.165
audio: delay=0.023 clock=13.514 pts=13.491
audio: delay=0.023 clock=13.537 pts=13.514
audio: delay=0.023 clock=13.560 pts=13.537
audio: delay=0.023 clock=13.584 pts=13.560
video: delay=0.100 actual_delay=0.103 pts=0.065 A-V=13.518918

Note that the pts is still not strictly increasing....
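
One way to quantify this is to count how often pts steps backwards in the order the packets are handed over. A small sketch (count_backward_steps is a hypothetical helper, and the unit of the timestamps is up to the caller). Some backward steps are expected if the stream has B-frames, since packets arrive in decode order; whether that accounts for everything seen here is not established.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: count positions where a timestamp is not strictly
 * greater than the one before it. */
static size_t count_backward_steps(const int64_t *pts, size_t n)
{
    size_t count = 0;
    for (size_t i = 1; i < n; i++) {
        if (pts[i] <= pts[i - 1])
            count++;
    }
    return count;
}
```

Fed the six frame_type pts values from the log above (5240.489, 0.031, -0.002, 0.098, 0.065, 0.165, converted to milliseconds), this counts 3 backward steps.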

>> When I enable DEBUG_SYNC in ffplay, the pts value I'm giving it
>> starts off around -9000 (this is because there is no rtcp packet yet
>> to sync the timebase to, I think), then jumps to 0, and proceeds to
>> go up from there.  The A-V delay will end up around 11 seconds at
>> times, other times it will be closer to zero.
> does the delay stay constant or does it increase?
> and can you elaborate on the rtcp packet sync thing ...

The delay tends to decrease, albeit slowly.

The rtcp packet sync thing is this code:

     if (s->last_rtcp_ntp_time != AV_NOPTS_VALUE) {
         int64_t addend;
         int32_t delta_timestamp;
         /* ntp format represents seconds and fraction as a 64-bit unsigned
          * fixed-point integer with decimal point to the left of bit 32
          * numbered from the left. The 32-bit seconds field spans about
          * 136 years, while the 32-bit fraction field precision is about
          * 232 picoseconds. */
         //      fprintf(stderr, " (has timestamp) ");
         /* XXX: is it really necessary to unify the timestamp base ? */
         /* compute pts from timestamp with received ntp_time */
         delta_timestamp = timestamp - s->last_rtcp_timestamp;
         //      if(delta_timestamp<0) fprintf(stderr, "Backwards in time: %d\n", delta_timestamp);
         /* convert to 90 kHz without overflow */
         addend = (s->last_rtcp_ntp_time - s->first_rtcp_ntp_time) >> 14;
         addend = (addend * 5625) >> 14;
         adjusted_timestamp = addend + delta_timestamp;

This code comes from rtp.c.  The aac stuff doesn't adjust pts  
anywhere in the rtp code (that I can see), so I'm not sure how it's  
generating its pts.
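
On the "convert to 90 kHz" step: the NTP difference is a 32.32 fixed-point number of seconds, so the conversion is a multiplication by 90000 / 2^32, which reduces to 5625 / 2^28. That is consistent with the 28-bit shift being split into two 14-bit shifts around the multiplication, so the intermediate product never exceeds 64 bits. A standalone sketch of that arithmetic (the function name is hypothetical, and this mirrors the excerpt rather than being the rtp.c code itself):

```c
#include <stdint.h>

/* Hypothetical helper: convert an elapsed NTP time (64-bit, 32.32 fixed
 * point, in seconds) to 90 kHz ticks, i.e. elapsed * 5625 / 2^28. */
static int64_t ntp_elapsed_to_90khz(uint64_t ntp_elapsed)
{
    /* First half of the 28-bit shift: at most 50 significant bits remain. */
    uint64_t addend = ntp_elapsed >> 14;
    /* 5625 < 2^13, so the product stays below 2^63; then shift the rest. */
    return (int64_t)((addend * 5625) >> 14);
}
```

One second of NTP time (1 << 32) comes out as exactly 90000 ticks.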

Essentially, because of the above, before the first rtcp packet, I have
pts values that aren't necessarily scaled correctly.  The Stream Info
won't show up (ffplay -stats) until the first rtcp packet shows up,  
so there is no audio or video playing, but the packets are being  
cached.  I guess I could pitch them until we get the first packet,  
and see what that does....
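
A rough sketch of what "pitching" the early packets could look like, assuming a small per-stream gate that only opens once an RTCP packet with an NTP time has been seen. All names here are hypothetical except last_rtcp_ntp_time, which follows the excerpt above; NO_TIMESTAMP is a stand-in for AV_NOPTS_VALUE so the sketch compiles without the ffmpeg headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for AV_NOPTS_VALUE (assumption: same sentinel role). */
#define NO_TIMESTAMP INT64_MIN

/* Hypothetical per-stream state. */
struct rtcp_gate {
    int64_t  last_rtcp_ntp_time;  /* NO_TIMESTAMP until the first RTCP packet */
    unsigned dropped;             /* data packets discarded while waiting */
};

static void rtcp_gate_init(struct rtcp_gate *g)
{
    g->last_rtcp_ntp_time = NO_TIMESTAMP;
    g->dropped = 0;
}

/* Call when an RTCP sender report with an NTP time arrives. */
static void rtcp_gate_on_sender_report(struct rtcp_gate *g, int64_t ntp_time)
{
    g->last_rtcp_ntp_time = ntp_time;
}

/* Returns true if an RTP data packet should be passed on, false if it
 * should be discarded because no RTCP reference exists yet. */
static bool rtcp_gate_accept(struct rtcp_gate *g)
{
    if (g->last_rtcp_ntp_time == NO_TIMESTAMP) {
        g->dropped++;
        return false;
    }
    return true;
}
```

The dropped counter is only there to see how much data is lost while waiting; dropping video this way may also discard the first keyframe, so the decoder could show nothing until the next one arrives.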

>> I suspect there is something obvious that I'm not doing, but I don't
>> know where to look.
> well, are the pts values on the audio and video AVPackets you output
> correct? (a simple printf which prints them should answer that, a
> 10 second difference should be pretty easy to spot ...)

After the rtcp packet, I think they are, or they are very close  
(which could be an artifact of adjusting the timestamps- it's  
possible that I don't need to do that, or maybe I just need to
subtract the last_rtcp_timestamp first).
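
For what that subtraction does with 32-bit RTP timestamps, here is a small sketch (rtp_timestamp_delta is a hypothetical name; it follows the delta_timestamp line in the excerpt). Doing the subtraction in unsigned 32-bit arithmetic and then reading the result as signed keeps the difference correct across a wraparound, and yields a negative value for packets stamped before the RTCP reference.

```c
#include <stdint.h>

/* Hypothetical helper: signed distance, in RTP clock ticks, from the
 * timestamp in the last RTCP packet to the timestamp of a data packet. */
static int32_t rtp_timestamp_delta(uint32_t timestamp, uint32_t last_rtcp_timestamp)
{
    /* Unsigned subtraction wraps modulo 2^32; the conversion to int32_t
     * assumes two's complement, which holds on common platforms. */
    return (int32_t)(timestamp - last_rtcp_timestamp);
}
```

A packet stamped 9000 ticks before the reference gives -9000, which is 0.1 s early at 90 kHz.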

Thanks again, Michael.


PS: Did you get a chance to look over my code on the other thread to  
see if it was getting close to acceptable?
