[Ffmpeg-devel] RTP patches & RFC

Ryan Martell rdm4
Thu Oct 12 05:41:36 CEST 2006

On Oct 11, 2006, at 5:27 PM, Michael Niedermayer wrote:

> Hi
> On Tue, Oct 10, 2006 at 08:08:04PM -0500, Ryan Martell wrote:
>>  TODO:
>>  1) Fix video/audio syncing! (frame rate wanders; sometimes it's 10,
>> sometimes 29.97.  I suspect I'm not doing something right.

I still haven't sorted this one out yet; my frame rate seems random  
(I'm obviously not setting something up right).

> [...]
>>  4) I am using my own packet queues.  This means that a packet comes
>> in (from network), gets allocated and copied to a packet, and once a
>> full frame is received, is conglomerated and
>>     copied into an AVPacket.  I got rid of the extra allocations and
>> reallocations and copies with the fragmentation packets (see the
>> add_h264_partial_data)
> either
> A. you support out of order packets (like with a n packet buffer which
>    reorders packets) and then after reordering you output them
> B. you output packets in their order and fragmentation
> either way you should set AVStream->need_parsing=1 and leave the  
> merging
> of packets to the AVParser

Okay, sorry to be dense about this, but I'm still confused.  The  
packets that come over rtp have their size indicated in various ways,  
depending on the packet.  That information is NOT in the NAL, and the  
NAL is NOT preceeded by the start sequence.  So what I was doing was  
converting them to AvC style (preceded by length) packets.  This is  
step one, which I don't think the need_parsing can handle for me  
(right?). (Or does the AVParser know enough about h264 nals to  
determine where they start and end?).  There are no sequence packets  
in the stream (it relies on the rtp timestamps).  Furthermore, if  
they are fragmented, they aren't just a fragmented bitstream, they  
have header data that must be stripped first, and they have to be  
accumulated, and only if certain parameters are met.

Step two, is accumulating all of the packets with the same timestamp  
into a frame.  I can see how the AVParser could do this for me, based  
on the pts of the packet.

If it can handle step one, I'm not seeing it.

If it can handle step two, by setting need_parsing=1 and pushing all  
my individual nal packets to it (with timestamps set), then I can  
implement that quickly enough.  (and reduce code size as well)

> [...]
>> Any thoughts on any of the above, plus any and all criticism/
>> commentary is appreciated!
> some comments below (this is not a real review, just what i stumbled
> across while lookin around a little)

I appreciate it.  At first i was a bit defensive about all the stuff,  
below, but after looking over the things, they are all either things:
1) I should have done better (the entire prepend stuff was initially  
started while i was beating on various options to get it to work),
2) Coding styles (ie: I'm top down, so my external functions are at  
the top, and that requires prototypes, because all of my helper  
functions are at the bottom).

> enum {
>      raw_packet = 0,
>      prepend_length_1_byte,
>      prepend_length_2_bytes,
>      prepend_start_sequence,
>      prepend_length_4_bytes
> };

excellent suggestion.

>> struct h264_packet {
>>     uint8_t *data;
>>     uint32_t length;
>>     uint32_t timestamp;
>>     struct h264_packet *next;
>> };
> timestamps probably need to be 64bit

they're only 32 bit in the rtp packet.

> and maybe you could use AVPacketList

Will check this out; some of it was trying to work within existing  
rtp framework (limit my destruction to the smallest area for a first  

> also use typedef struct to simplify the source code

>> /* ---------------- code */
>> void *new_h264_extradata()
> non static functions need some av_ or ff_ prefix to avoid namespace  
> clashes

>> void free_h264_extra_data(void *d)
> unused function

in function table not in this code. ;-)

>> //    av_set_pts_info(s->st, 33, 1, 90000);
> you must set the timebase correctly (the default of 90khz is very  
> likly not
>  correct)

h264/rtp is automatically 90khz. It's also specified on the sdp line,  
but it will always be 90kHz.  Is this the correct way to set it?

>>     //    fprintf(stderr, "Buf: %p Length: %d\n", buf, len);
> this should be av_log or removed (and that of course applies to all  
> these
> // *printf lines

all fprintfs will be removed (if they were internal testing), or  
av_log if useful to others.

> [...]
>>         case _prepend_length_2_bytes:  //prepend packet with a 2  
>> byte length
>>             assert(len >= 0 && len <= 0xffff);
>>             {
>>                 uint16_t length = len;
>>                 *((uint16_t *) buffer) = BE_16(&length);
> segfault due to lack of alignement on non x86 architectures
> one way to implement this page of messy and broken code is:
> for(i=0; i<bytes; i++)
>     buffer[i]>>(8*(bytes-i-1));

no problem. i might just get rid of prepending anything other than 4  
bytes, as the others are untested...

> [...]
>> static uint8_t hex_chars_to_byte(char *buff)
>> {
>>     uint8_t result = 0;
>>     int ii;
>>     int multiplier = 16;
>>     for (ii = 0; ii < 2; ii++) {
>>         if (buff[ii] >= 'A' && buff[ii] <= 'F') {
>>             result += ((buff[ii] - 'A') + 10) * multiplier;
>>         } else if (buff[ii] >= 'a' && buff[ii] <= 'f') {
>>             result += ((buff[ii] - 'a') + 10) * multiplier;
>>         } else if (buff[ii] >= '0' && buff[ii] <= '9') {
>>             result += ((buff[ii] - '0')) * multiplier;
>>         }
>>         multiplier /= 16;
>>     }
>>     return result;
>> }
> strtol()

surprisingly, never used strol before.  Thanks.

I'm also having to get the rtsp rtcp packet (receiver reports)  
working, because that's why the server is cutting me off after two  

Finally, I can give a Blocksize parameter to the server on RTSP  
initial handshake.  With this, I could specify a maximum packet size,  
which would allow me to preallocate all of my internal packets (using  
around 2k buffers each, since MTU is 1500 by default).  This would  
basically mean that there would be initial memory allocation for  
packets until I had enough in the packet pool, and I could retire  
them to another linked list and pull them from there.  So there would  
be no malloc's after an initial startup period.  I think this would  
be a good thing, but I'm not sure how the rest of FFMPEG feels about  
memory allocation.  Is this a better approach?

More information about the ffmpeg-devel mailing list