[FFmpeg-user] DTSes & PTSes

Mark Filipak markfilipak.imdb at gmail.com
Thu May 9 22:14:12 EEST 2024


On 09/05/2024 06.07, Yann Cainjo wrote:
> Hi Mark

Hey Yann
> 
>> DTSes & PTSes only exist in MPEG PESes for I-frames
> 
> 
> Where did you get this information ?

By parsing actual presentation streams. Start reading 24 lines down in this message. Note that email 
will wrap long lines, but it's easy to fix them if you look.

> I was reading this article but it seems to me that P and B frames have PTS
> and DTS...
> 
> Understanding Timelines within MPEG Standards
> https://ir.cwi.nl/pub/23650/23650B.pdf

Thank you for that, but it does not resolve the issues.

> Best regards
> Yann
> 
> 
> @@@
> 
> Yann CAINJO
> Photographie, photojournalisme, vidéo, prise de son, podcast & vidéo 360 VR
> Formateur en photographie et audiovisuel
> 06 73 23 73 31
> yann at yanncainjo.com
> http://www.yanncainjo.com


Do any of these 'sound' familiar?
A CFR source makes a target that's slightly VFR.
Non-monotonous DTSes.
Concatenations produce a playback glitch or fail altogether.
Use cases that used to work don't work anymore because someone patched the code to cover a different use case.

I believe they all have the same cause: open GOPs in which open B-frames are mishandled. Here are two 
pertinent facts:

1, The exact same video can be in any of four physical packet orders:
..BPBPBBPIBPBP.. //closed GOP, physical stream in PTS order
..PBPBPBBIPBPB.. //closed GOP, physical stream in DTS order
..BPBPBBIBPBP..  //open GOP, physical stream in PTS order
..PBPBIBBPBPB..  //open GOP, physical stream in DTS order
I have seen _all_four_ in _professional_ videos (presumably because MPEG does not specify what an MPEG video's physical packet order should be).
Note that I've only seen physical packet orders that are either PTS order or DTS order, but I imagine anything is possible.
DTS order makes the most sense for minimizing buffers but, strangely, most videos are in PTS order.

2, MPEG has tags (from H.262 §6.2.2.6 & §6.3.8) in the GOP header to handle open GOPs:
'closed_gop' = '1/0'
'broken_link' = '1/0' //'1': GOP has been edited and B-frames have lost their reference frames.
These flags are in an extension created to force players to ignore the open B-frames.
FFmpeg does _nothing_ with these tags when you cut an open GOP.
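
As a rough illustration of fact 1: the DTS-order and PTS-order strings of each GOP are related by the usual MPEG-2 reordering rule, in which a B-frame is presented as soon as it is decoded while an I- or P-frame is held back until the next I- or P-frame arrives. The sketch below is only a toy under that assumption (frames are just letters, and the name `dts_to_pts_order` is made up for this example), but it does turn the two DTS-order strings listed above into the two PTS-order strings.

```python
def dts_to_pts_order(decode_order: str) -> str:
    """Reorder frame-type letters from decode (DTS) order to presentation (PTS) order."""
    presented = []
    held_ref = None  # the most recent I/P frame, decoded but not yet presented
    for frame in decode_order:
        if frame == "B":
            # B-frames are presented immediately after decoding.
            presented.append(frame)
        else:
            # A new reference frame releases the one that was being held.
            if held_ref is not None:
                presented.append(held_ref)
            held_ref = frame
    if held_ref is not None:
        presented.append(held_ref)  # flush the last held reference frame
    return "".join(presented)


print(dts_to_pts_order("PBPBPBBIPBPB"))  # closed GOP, DTS order in
print(dts_to_pts_order("PBPBIBBPBPB"))   # open GOP, DTS order in
```

The only difference between the closed and the open case is where the B-frames sit relative to the I-frame in decode order; the reordering rule itself is the same.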


It's helpful to show DTS order and PTS order as though they are two separate (imaginary) streams.
They're not two separate streams of course. There's only one stream and DTS & PTS are just numbers.


A closed GOP with physical order: ..BPBPBBPIBPBP..
           time ———>
PTSes     .. B  P  B  P  B  B  P  I  B  P  B  P .. <== physical order
             _¦_/  _¦_/  _¦__¦_/  /  _¦_/  _¦_/
            / ¦   / ¦   / ¦  ¦   /  / ¦   / ¦
DTSes     P  B  P  B  P  B  B  I  P  B  P  B
                               /
                          vdcut

A closed GOP with physical order: ..PBPBPBBIPBPB..
          time ———>
PTSes        B  P  B  P  B  B  P  I  B  P  B  P
             _¦_/  _¦_/  _¦__¦_/  /  _¦_/  _¦_/
            / ¦   / ¦   / ¦  ¦   /  / ¦   / ¦
DTSes  .. P  B  P  B  P  B  B  I  P  B  P  B .. <== physical order
                               /
                          vdcut

An open GOP with physical order: ..BPBPBBIBPBP..
                                  vpcut
           time ———>             /
PTSes     .. B  P  B  P  B  B  I  B  P  B  P .. <== physical order
             _¦_/  _¦_/  _¦__¦_/  _¦_/  _¦_/
            / ¦   / ¦   / ¦  ¦   / ¦   / ¦
DTSes     P  B  P  B  I  B  B  P  B  P  B
                      / \______/
                 vdcut   drop open Bs found in vdcut..vpcut

An open GOP with physical order: ..PBPBIBBPBPB..
                                  vpcut
           time ———>             /
PTSes        B  P  B  P  B  B  I  B  P  B  P
             _¦_/  _¦_/  _¦__¦_/  _¦_/  _¦_/
            / ¦   / ¦   / ¦  ¦   / ¦   / ¦
DTSes  .. P  B  P  B  I  B  B  P  B  P  B .. <== physical order
                      / \______/
                 vdcut   drop open Bs found in vdcut..vpcut
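
The "drop open Bs found in vdcut..vpcut" step in the two open-GOP diagrams can be sketched as follows. This is a minimal illustration on frame-type letters in DTS order, not anything FFmpeg does; `drop_open_b_frames` and its `closed_gop` parameter are hypothetical names, with `closed_gop` standing in for the GOP-header flag of the same name.

```python
def drop_open_b_frames(decode_order: str, closed_gop: bool) -> str:
    """Given frames in DTS order starting at the cut (vdcut), drop the open B-frames.

    In an open GOP, the B-frames that immediately follow the I-frame in decode
    order are presented before it and reference the previous GOP, which the
    cut has removed.
    """
    if not decode_order.startswith("I"):
        raise ValueError("the cut must start on an I-frame")
    if closed_gop:
        return decode_order  # nothing references the previous GOP
    i = 1
    while i < len(decode_order) and decode_order[i] == "B":
        i += 1  # skip the run of B-frames right after the I-frame
    return "I" + decode_order[i:]


# Open GOP from the diagrams, DTS order from vdcut onward: I B B P B P B
print(drop_open_b_frames("IBBPBPB", closed_gop=False))
# Closed GOP from the diagrams, DTS order from vdcut onward: I P B P B
print(drop_open_b_frames("IPBPB", closed_gop=True))
```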


The following are parses of actual, professional videos, with references to H.222 & H.262 (where the 
format documentation can be found). There's an I-frame, followed by a P-frame, followed by a B-frame, 
followed by a GOP header. Note 'PTS_DTS_flags', 'picture_coding_type', 'closed_gop' & 'broken_link'.


Here is an I-frame
------------------------------|
Re: H.222 §2.4.3.6            | PES_HEADER________________
PES video stream 0            | 00 00 01 E0 == == == == ==
PES_packet_length             | == == == == 07 EC == == ==   //= 2028 bytes
PES Header data content flags | == == == == == == 81 C0 ==
                               |            .——————'   '——————.
marker bits                   |            10-- ---- ---- ----
PES_scrambling_control        |            --00 ---- ---- ----
PES_priority                  |            ---- 0--- ---- ----
data_alignment_indicator      |            ---- -0-- ---- ----
copyright                     |            ---- --0- ---- ----   //1 = copyrighted
original_or_copy              |            ---- ---1 ---- ----   //1 = original; 0 = copy
PTS_DTS_flags                 |            ---- ---- 11-- ----   //00 = PTS DTS don't exist, 01 = illegal, 10 = PTS only, 11 = PTS & DTS
ESCR_flag                     |            ---- ---- --0- ----   //elementary stream clock reference
ES_rate_flag                  |            ---- ---- ---0 ----
DSM_trick_mode_flag           |            ---- ---- ---- 0---   //Digital Storage Media trick mode
additional_copy_info_flag     |            ---- ---- ---- -0--
PES_CRC_flag                  |            ---- ---- ---- --0-
PES_extension_flag            |            ---- ---- ---- ---0   //<-- error; '1': pes extension present
PES_header_data_length        | == == == == == == == == 0A       //= 10 bytes
------------------------------|
Re: H.262 §6.2.3              | PICTURE_HEADER_________
picture_start_code            | 00 00 01 00 == == == ==
                               | == == == == 00 8F FF F8
                               |   .—————————'         '—————————————————.
temporal_reference            |   0000 0000 10-- ---- ---- ---- ---- ----
picture_coding_type           |   ---- ---- --00 1--- ---- ---- ---- ----   //'001': I-'picture'; '010': P-'picture'; '011': B-'picture'
vbv_delay                     |   ---- ---- ---- -111 1111 1111 1111 1---   //the number of 90 kHz clocks from picture_start_code to decoder start (or '111 1111 1111 1111 1' for no delay)
extra_bit_picture             |   ---- ---- ---- ---- ---- ---- ---- -0--   //'1': followed by 8 bits of content_description_data followed by another extra_bit_picture
padding_bits                  |   ---- ---- ---- ---- ---- ---- ---- --00
------------------------------|
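
For readers who want to repeat this kind of parse, here is a minimal sketch that pulls the same PES header fields out of raw bytes. It assumes the fixed 9-byte layout shown in the table (H.222 §2.4.3.6); the function name `parse_pes_header` is made up for this example, and the two test inputs are the byte values shown in this message (flags 81 C0 with length 0A for the I-frame, flags 81 00 with length 00 for the P- and B-frames).

```python
def parse_pes_header(data: bytes) -> dict:
    """Decode the first 9 bytes of a PES packet header."""
    if len(data) < 9 or data[:3] != b"\x00\x00\x01":
        raise ValueError("expected packet_start_code_prefix plus 6 more bytes")
    flags1, flags2 = data[6], data[7]
    return {
        "stream_id": data[3],                                   # 0xE0 = video stream 0
        "PES_packet_length": int.from_bytes(data[4:6], "big"),
        "original_or_copy": flags1 & 0x01,
        "PTS_DTS_flags": (flags2 >> 6) & 0x03,                  # 0b11 = PTS & DTS, 0b00 = neither
        "PES_extension_flag": flags2 & 0x01,
        "PES_header_data_length": data[8],
    }


i_frame = parse_pes_header(bytes.fromhex("000001E0 07EC 81C0 0A"))
p_frame = parse_pes_header(bytes.fromhex("000001E0 07EC 8100 00"))
print(i_frame["PTS_DTS_flags"], i_frame["PES_header_data_length"])
print(p_frame["PTS_DTS_flags"], p_frame["PES_header_data_length"])
```

A PES_header_data_length of 10 is consistent with one 5-byte PTS plus one 5-byte DTS; a length of 0 leaves no room for either.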


Here is a P-frame
------------------------------|
stnsoft.com/DVD/pes-hdr.html  | PES_HEADER________________
video_stream_0                | 00 00 01 E0 == == == == ==
PES_packet_length             | == == == == 07 EC == == ==   //= 2028 bytes (i.e. [0014..0800])
PES Header data content flags | == == == == == == 81 00 ==
                               |            .——————'   '——————.
marker bits                   |            10-- ---- ---- ----
PES_scrambling_control        |            --00 ---- ---- ----
PES_priority                  |            ---- 0--- ---- ----
data_alignment_indicator      |            ---- -0-- ---- ----
copyright                     |            ---- --0- ---- ----
original_or_copy              |            ---- ---1 ---- ----
PTS_DTS_flags                 |            ---- ---- 00-- ----
ESCR_flag                     |            ---- ---- --0- ----   //elementary stream clock reference
ES_rate_flag                  |            ---- ---- ---0 ----
DSM_trick_mode_flag           |            ---- ---- ---- 0---
additional_copy_info_flag     |            ---- ---- ---- -0--
PES_CRC_flag                  |            ---- ---- ---- --0-
PES_extension_flag            |            ---- ---- ---- ---0   //1 = pes extension present
PES_header_data_length        | == == == == == == == == 00       //= zero bytes
------------------------------|
   Re: H.262 §6.2.3            | PICTURE_HEADER_____________________
picture_start_code            | 00 00 01 00 == == == == == == == ==
                               | == == == == 00 57 FF FB 80 == == ==
                               |  .——————————'            '———————————————————————.
temporal_reference            |  0000 0000 01-- ---- ---- ---- ---- ---- ---- ----
picture_coding_type           |  ---- ---- --01 0--- ---- ---- ---- ---- ---- ----   //'001': I-picture; '010': P-picture; '011': B-picture
vbv_delay                     |  ---- ---- ---- -111 1111 1111 1111 1--- ---- ----   //= no delay; else, the number of 90 kHz clocks from picture_start_code to decoder start
full_pel_forward_vector       |  ---- ---- ---- ---- ---- ---- ---- -0-- ---- ----   //ignored
forward_f_code                |  ---- ---- ---- ---- ---- ---- ---- --11 1--- ----   //ignored
extra_bit_picture             |  ---- ---- ---- ---- ---- ---- ---- ---- -0-- ----   //= no content_description_data; '1' = followed by 8 bits of content_description_data followed by another extra_bit_picture
padding_bits                  |  ---- ---- ---- ---- ---- ---- ---- ---- --00 0000   //P-pictures have no backward vector fields (H.262 §6.2.3); the rest of the byte is zero stuffing
padding bytes                 | == == == == == == == == == 00 00 00
------------------------------|


Here is a B-frame
------------------------------|
stnsoft.com/DVD/pes-hdr.html  | PES_HEADER________________
video_stream_0                | 00 00 01 E0 == == == == ==
PES_packet_length             | == == == == 07 EC == == ==   //= 2028 bytes (i.e. [0014..0800])
PES Header data content flags | == == == == == == 81 00 ==
                               |            .——————'   '——————.
marker bits                   |            10-- ---- ---- ----
PES_scrambling_control        |            --00 ---- ---- ----
PES_priority                  |            ---- 0--- ---- ----
data_alignment_indicator      |            ---- -0-- ---- ----
copyright                     |            ---- --0- ---- ----
original_or_copy              |            ---- ---1 ---- ----
PTS_DTS_flags                 |            ---- ---- 00-- ----
ESCR_flag                     |            ---- ---- --0- ----   //elementary stream clock reference
ES_rate_flag                  |            ---- ---- ---0 ----
DSM_trick_mode_flag           |            ---- ---- ---- 0---
additional_copy_info_flag     |            ---- ---- ---- -0--
PES_CRC_flag                  |            ---- ---- ---- --0-
PES_extension_flag            |            ---- ---- ---- ---0   //1 = pes extension present
PES_header_data_length        | == == == == == == == == 00       //= zero bytes
------------------------------|
   Re: H.262 §6.2.3            | PICTURE_HEADER_____________________
                               | 00 00 01 00 00 9F FF FB B8 00 00 00
picture_start_code            | 00 00 01 00 == == == == == == == ==
                               | == == == == 00 9F FF FB B8 == == ==
                               |  .——————————'            '———————————————————————.
temporal_reference            |  0000 0000 10-- ---- ---- ---- ---- ---- ---- ----
picture_coding_type           |  ---- ---- --01 1--- ---- ---- ---- ---- ---- ----   //'001': I-picture; '010': P-picture; '011': B-picture
vbv_delay                     |  ---- ---- ---- -111 1111 1111 1111 1--- ---- ----   //= no delay; else, the number of 90 kHz clocks from picture_start_code to decoder start
full_pel_forward_vector       |  ---- ---- ---- ---- ---- ---- ---- -0-- ---- ----   //ignored
forward_f_code                |  ---- ---- ---- ---- ---- ---- ---- --11 1--- ----   //ignored
full_pel_backward_vector      |  ---- ---- ---- ---- ---- ---- ---- ---- -0-- ----   //ignored
backward_f_code               |  ---- ---- ---- ---- ---- ---- ---- ---- --11 1---   //ignored
extra_bit_picture             |  ---- ---- ---- ---- ---- ---- ---- ---- ---- -0--   //= no content_description_data; '1' = followed by 8 bits of content_description_data followed by another extra_bit_picture
padding_bits                  |  ---- ---- ---- ---- ---- ---- ---- ---- ---- --00
padding bytes                 | == == == == == == == == == 00 00 00
------------------------------|
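
The three picture headers above can be decoded with a few shifts and masks. The following is a minimal sketch assuming the H.262 §6.2.3 layout used in the tables (10-bit temporal_reference, 3-bit picture_coding_type, 16-bit vbv_delay); `parse_picture_header` is a made-up name, and the inputs are the byte values shown in the I-, P- and B-frame parses.

```python
PICTURE_TYPES = {1: "I", 2: "P", 3: "B"}


def parse_picture_header(data: bytes) -> dict:
    """Decode the first 29 bits that follow picture_start_code (00 00 01 00)."""
    if len(data) < 8 or data[:4] != b"\x00\x00\x01\x00":
        raise ValueError("expected picture_start_code plus 4 more bytes")
    bits = int.from_bytes(data[4:8], "big")  # the 32 bits after the start code
    coding_type = (bits >> 19) & 0x7
    return {
        "temporal_reference": bits >> 22,                 # top 10 bits
        "picture_coding_type": PICTURE_TYPES.get(coding_type, "?"),
        "vbv_delay": (bits >> 3) & 0xFFFF,                # next 16 bits
    }


for raw in ("00000100 008FFFF8", "00000100 0057FFFB 80", "00000100 009FFFFB B8"):
    print(parse_picture_header(bytes.fromhex(raw)))
```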


Here is a GOP header with 'closed_gop' & 'broken_link'
------------------------------|
Re: H.262 §6.2.2.6 & §6.3.8   | GOP_HEADER_____________
group_start_code              | 00 00 01 B8 == == == ==
time_code                     | == == == == 00 08 00 00   //Re: H.262 §6.3.8
                               |   .—————————'         '—————————————————.
drop_frame_flag               |   0--- ---- ---- ---- ---- ---- ---- ----
time_code_hours               |   -000 00-- ---- ---- ---- ---- ---- ----   //= 0
time_code_minutes             |   ---- --00 0000 ---- ---- ---- ---- ----   //= 0
marker_bit                    |   ---- ---- ---- 1--- ---- ---- ---- ----
time_code_seconds             |   ---- ---- ---- -000 000- ---- ---- ----   //= 0
time_code_pictures            |   ---- ---- ---- ---- ---0 0000 0--- ----   //= 0; time code = 00:00:00.00
closed_gop                    |   ---- ---- ---- ---- ---- ---- -0-- ----
broken_link                   |   ---- ---- ---- ---- ---- ---- --0- ----   //'1': GOP has been edited and B-frames have lost their reference frames
padding_bits                  |   ---- ---- ---- ---- ---- ---- ---0 0000   //zero stuffing up to the next byte boundary
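

Reading 'closed_gop' & 'broken_link' programmatically takes only a few lines. The sketch below assumes the GOP header layout from the table (H.262 §6.2.2.6); `parse_gop_header` is a made-up name. The first input is the header parsed above; the second is a synthetic value, invented here, with only the closed_gop bit changed to '1'.

```python
def parse_gop_header(data: bytes) -> dict:
    """Decode the 27 bits that follow group_start_code (00 00 01 B8)."""
    if len(data) < 8 or data[:4] != b"\x00\x00\x01\xb8":
        raise ValueError("expected group_start_code plus 4 more bytes")
    bits = int.from_bytes(data[4:8], "big")
    hours = (bits >> 26) & 0x1F
    minutes = (bits >> 20) & 0x3F
    seconds = (bits >> 13) & 0x3F
    pictures = (bits >> 7) & 0x3F
    return {
        "drop_frame_flag": (bits >> 31) & 1,
        "time_code": f"{hours:02d}:{minutes:02d}:{seconds:02d}.{pictures:02d}",
        "marker_bit": (bits >> 19) & 1,
        "closed_gop": (bits >> 6) & 1,
        "broken_link": (bits >> 5) & 1,
    }


parsed = parse_gop_header(bytes.fromhex("000001B8 00080000"))     # from the table above
synthetic = parse_gop_header(bytes.fromhex("000001B8 00080040"))  # made-up: closed_gop set
print(parsed)
print(synthetic)
```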


FFmpeg appears to be making up DTSes & PTSes for P- & B-frames. I can see why it wants to do that, 
but the logic of how it makes up DTSes & PTSes is unknown and is suspect. Also, why FFmpeg is not 
using the 'closed_gop' & 'broken_link' flags to fix open B-frames is unknown.
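
To make concrete what "making up" timestamps could involve: when only the I-frame's PES packet carries a PTS and DTS, the other frames' values have to be derived from something. The sketch below is NOT FFmpeg's logic, which this message says is unknown. It is one plausible approach under loudly stated assumptions: a constant frame duration (3003 ticks of the 90 kHz clock, i.e. 30000/1001 fps), no repeat_first_field, and all frames in the same GOP as the anchor so that temporal_reference does not wrap. All names and the anchor values are hypothetical.

```python
TICKS_PER_FRAME = 3003  # assumption: 90 kHz clock at 30000/1001 frames per second


def estimate_pts(anchor_pts: int, anchor_tr: int, frame_tr: int,
                 ticks_per_frame: int = TICKS_PER_FRAME) -> int:
    """Estimate a PTS from the anchor I-frame's PTS and the temporal_reference gap."""
    return anchor_pts + (frame_tr - anchor_tr) * ticks_per_frame


def estimate_dts(anchor_dts: int, frames_after_anchor: int,
                 ticks_per_frame: int = TICKS_PER_FRAME) -> int:
    """Estimate a DTS by counting frames in decode order after the anchor."""
    return anchor_dts + frames_after_anchor * ticks_per_frame


# Made-up anchor: an I-frame with PTS 90000 and temporal_reference 2.
# In an open GOP, the two B-frames decoded after it have temporal_reference 0 and 1.
print(estimate_pts(90000, anchor_tr=2, frame_tr=0))
print(estimate_pts(90000, anchor_tr=2, frame_tr=1))
print(estimate_pts(90000, anchor_tr=2, frame_tr=5))
```

Under these assumptions the open B-frames get PTSes earlier than the I-frame's, which is exactly where a cut at the I-frame leaves them without their reference frames.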

--Mark.


