[FFmpeg-devel] [PATCH 4/4] lavf/mov: Add support for edit list parsing.

Thu Sep 15 23:08:40 EEST 2016

On Wed, Sep 14, 2016 at 1:57 PM, Michael Niedermayer <michael at niedermayer.cc
> wrote:

> On Wed, Sep 14, 2016 at 12:20:52AM -0700, Sasi Inguva wrote:
> > On Tue, Sep 13, 2016 at 4:39 PM, Sasi Inguva <isasi at google.com> wrote:
> >
> > > Sorry for the very late reply. I was busy with other things.
> > >
> > > On Sat, Sep 3, 2016 at 4:48 PM, Michael Niedermayer <
> > > michael at niedermayer.cc> wrote:
> > >
> > >> On Sat, Sep 03, 2016 at 12:06:39PM -0700, Sasi Inguva wrote:
> > >> > Hi Michael,
> > >> >
> > >> > In fact from audacity I see that out-ingu.wav out-mp3.wav are almost
> > >> > equivalent,
> > >>
> > >> They do not match. (and that is alot more vissible if you scale the
> > >> time axis up a bit)
> > >>
> > >> You also dont use the existing API for handling initial padding/skip
> > >> And you didnt reply to this concern
> > >> its probably not that hard to fix that ...
> > >> instead of droping just at packet granularity you would set the stuff
> > >> for droping at sample granularity (too)
> > >>
> > >
> > > Yes. Looking at it more closely now, they don't matrch exactly and
> this is
> > > because as you said, number of samples to drop is not exactly multiple
> of
> > > number of packets. In that case we need to partially discard some
> samples
> > > of a packet. This can be done by using AV_PKT_SIDE_DATA_SKIP_SAMPLES.
> I
> > > can change the code to use this packet side data instead of discard
> packet
> > > flag, if it is ok.
>
> ok
>
> I have put my foot in my mouth here. There is no way I can add
AV_PKT_SIDE_DATA_SKIP_SAMPLES to each packet, unless I add a new field
"skip_samples" in AVIndexEntry and populate it in mov_fix_index function.
Adding another entry to AVIndexEntry seems contentious because we want to
keep the AVIndexEntry as compact as possible, I presume. So what I wil do
for now is just set st->skip_samples field in mov_fix_index function for
the first audio edit list. So at least single edit list audio cases ,will
be correct. In multiple edit list case, the the first edit will be
correctly output but the second audio edit might have some samples extra.
Also note that, there is no mechanism for doing "discard_padding" too, in
MOV demuxer. If we have skip_samples and discard_padding fields in
AVIndexEntry, then we can implement both of them correctly.

With the attached patch the test that you have showed, comparing
out-ingu.wav and out-mp3.wav - are  exactly equivalent.

>
> > >
> > >
> > >>
> > >> also maybe you missed my question about your oppinion on the correct
> > >> timestamp output:
> > >> "My first question is, entirely independant of the implementation from
> > >>  the patches. What is the correct output ? (my primary focus is on
> > >>  the timestamps)"
> > >>
> > >> Iam curious because to me the timestamps of the dropped packets look
> > >> wrong and id like to know your oppinion about this. Is this a
> > >> implementation issue (if so please explain) or is there some reason
> > >> independant of the implementation  (if so please explain too)
> > >>
> > >
> > > The correct output according to me should be - we should show exactly
> the
> > > same presentation timestmap indicated by the MOV stts, ctts atoms, for
> the
> > > samples that fall inside the edits. As long as the PTS is according to
> what
> > > the edit and stts,ctts atoms say. I don't really have to worry about
> what
> > > the decoding timestamp for those packets should be (DTS might as well
> be
> > > AV_NOPTS_VALUE for all packets) .
> > >
> > > In the current implementation, we directly assign the timestamp in the
> > > AVIndex to  pkt->dts . http://git.videolan.org/?p=ffmpeg.git;a=blob;f=
> > > libavformat/mov.c;h=6e80b93271a4f998af6ba1af795d7d
> 7c5d67f5a1;hb=HEAD#l5332
> > > . The implementation of the edit list code is such that, it needs to be
> > > filled with "linear monotonically increasing timestamps for the
> > > non-discarded packets " to have the samples denoted by the editlists to
> > > have linear monotonically increasing timestamps, in essence, to avoid
> a big
> > > gap between the presentation of the end of first edit list and the
> start of
> > > next edit list because of the discarded packets.
> > >
> > Hence while parsing a new edit list we bump down the index entry
> timestamp
> > > to (frame_duration + last non-discarded packet from previous edit list
> ) .
> > > And that's why we get non-monotonic timestamps as a whole in the
> > > AVIndexEntry .
> > >
> > >  I think my wording is confusing here. Just clarifying . For one
> editlist/
> > within one editlist, the AVIndexEntry samples pertaining to that edit,
> need
> > to be filled with "monotonic timestamps which increase in steps of their
> > corresponding 'stts' atom entries ", to achieve the correct presentation
> > timestamp for those samples.
> >
> >  When we start parsing the next edit list entry, and start adding index
> > entries to AVIndexEntry we need to avoid a big gap between the
> presentation
> > of the end of first edit and the start of next edit, that might happen
> > because of the DISCARD packets added to the AVIndexEntry while parsing
> the
> > first edit. Hence we start adding index entries from a  bumped down value
> > corresponding to (frame_duration + last non-discarded packet from
> previous
> > edit ) . And that's why we get non-monotonic timestamps as a whole in the
> > AVIndexEntry .
> >
> >
> > > Let us take an example of a file with two edit lists. First edit list
> > > ending on a B frame. This is what ffprobe -show_packet -pf compact
> looks
> > > like . Where 'D' stands for discarded frame. ( I've attached a 6th
> patch to
> > > ffprobe to achieve this formatting ).
> > > ./ffprobe -show_packets -print_format compact
> mov-2elist-bpyramid-1elist-
> > > ends-on-bref.mov
> > > .....
> > > packet|codec_type=video|stream_index=0|pts=10752|pts_
> > > time=0.875000|dts=8192|dts_time=0.666667|duration=512|
> > > duration_time=0.041667|convergence_duration=N/A|
> > > convergence_duration_time=N/A|size=1281|pos=39546|flags=__ - Pframe
> > > packet|codec_type=video|stream_index=0|pts=9728|pts_
> > > time=0.791667|dts=8704|dts_time=0.708333|duration=512|
> > > duration_time=0.041667|convergence_duration=N/A|
> > > convergence_duration_time=N/A|size=749|pos=40827|flags=__ - B
> pyramidal
> > > ref
> > > packet|codec_type=video|stream_index=0|pts=9216|pts_
> > > time=0.750000|dts=9216|dts_time=0.750000|duration=512|
> > > duration_time=0.041667|convergence_duration=N/A|
> > > convergence_duration_time=N/A|size=487|pos=41576|flags=__ - B
> > > packet|codec_type=video|stream_index=0|pts=10240|pts_
> > > time=0.833333|dts=9728|dts_time=0.791667|duration=512|
> > > duration_time=0.041667|convergence_duration=N/A|
> > > convergence_duration_time=N/A|size=486|pos=42063|flags=__ - B
> > > packet|codec_type=video|stream_index=0|pts=12800|pts_
> > > time=1.041667|dts=10240|dts_time=0.833333|duration=512|
> > > duration_time=0.041667|convergence_duration=N/A|
> > > convergence_duration_time=N/A|size=3455|pos=42549|flags=_D - P
> > > packet|codec_type=video|stream_index=0|pts=11776|pts_
> > > time=0.958333|dts=10752|dts_time=0.875000|duration=512|
> > > duration_time=0.041667|convergence_duration=N/A|
> > > convergence_duration_time=N/A|size=848|pos=46004|flags=_D - B
> pyramidal
> > > ref
> > > packet|codec_type=video|stream_index=0|pts=11264|pts_
> > > time=0.916667|dts=11264|dts_time=0.916667|duration=512|
> > > duration_time=0.041667|convergence_duration=N/A|
> > > convergence_duration_time=N/A|size=655|pos=46852|flags=__ - B
> > > packet|codec_type=video|stream_index=0|pts=12288|pts_
> > > time=1.000000|dts=11776|dts_time=0.958333|duration=512|
> > > duration_time=0.041667|convergence_duration=N/A|
> > > convergence_duration_time=N/A|size=901|pos=47507|flags=_D - B
> > > packet|codec_type=video|stream_index=0|pts=13312|pts_
> > > time=1.083333|dts=12288|dts_time=1.000000|duration=512|
> > > duration_time=0.041667|convergence_duration=N/A|
> > > convergence_duration_time=N/A|size=5112|pos=48408|flags=KD  - end 1st
> edit
> > > packet|codec_type=video|stream_index=0|pts=11772|pts_
> > > time=0.958008|dts=10748|dts_time=0.874674|duration=512|
> > > duration_time=0.041667|convergence_duration=N/A|
> > > convergence_duration_time=N/A|size=5112|pos=48408|flags=K_ - start 2nd
> > > edit
> > > packet|codec_type=video|stream_index=0|pts=13820|pts_
> > > time=1.124674|dts=11260|dts_time=0.916341|duration=512|
> > > duration_time=0.041667|convergence_duration=N/A|
> > > convergence_duration_time=N/A|size=1179|pos=53520|flags=__
> > > .....
> > >
> > > The implementation basically increases dts by the amounts that stts
> atom
> > > indicates, and as you can see after the end of the first edit list and
> > > start of the next edit list, it bumps down the dts from 12288 to 10650
> to
> > > make the pts linearly increasing from the last non-discarded frame
> 11264 ->
> > > 11772 .
> > >
> > > As I understand from DTS semantics in ffmpeg we have two constraints on
> > > DTS.
> > > i) DTS should be monotonically increasing  ii) For every packet PTS >=
> DTS
> > > should be true.
> > >
> > >  Just reordering PTS to DTS doesn't work either. For example, here is
> the
> > > DTS  I get from "ffprobe -show_packets " when I do "pkt->dts =
> > > AV_NOPTS_VALUE" for all packets in mov_read_packet function
> > > http://git.videolan.org/?p=ffmpeg.git;a=blob;f=libavformat/mov.c;h=
> > > 6e80b93271a4f998af6ba1af795d7d7c5d67f5a1;hb=HEAD#l5332  . Ffmpeg
> should
> > > use update_dts_from_pts function to reoder pts to dts -
> > > http://git.videolan.org/?p=ffmpeg.git;a=blob;f=libavformat/utils.c;h=
> > > 76cbff4ef67937499a7c113dcaacb88389e6015e;hb=HEAD#l1005 .
> > >
> > > ./ffprobe -show_packets -print_format compact
> mov-2elist-bpyramid-1elist-
> > > ends-on-bref.mov
> > > packet|codec_type=video|stream_index=0|pts=0|pts_time=
> > > 0.000000|dts=N/A|dts_time=N/A|duration=512|duration_time=0.
> > > 041667|convergence_duration=N/A|convergence_duration_time=N/
> > > A|size=5683|pos=36|flags=K_
> > > packet|codec_type=video|stream_index=0|pts=2048|pts_
> > > time=0.166667|dts=N/A|dts_time=N/A|duration=512|
> duration_time=0.041667|
> > > convergence_duration=N/A|convergence_duration_time=N/A|
> > > size=4260|pos=5719|flags=__
> > > packet|codec_type=video|stream_index=0|pts=1024|pts_
> > > time=0.083333|dts=0|dts_time=0.000000|duration=512|
> duration_time=0.041667|
> > > convergence_duration=N/A|convergence_duration_time=N/A|
> > > size=3098|pos=9979|flags=__
> > > packet|codec_type=video|stream_index=0|pts=512|pts_
> > > time=0.041667|dts=512|dts_time=0.041667|duration=512|
> > > duration_time=0.041667|convergence_duration=N/A|
> > > convergence_duration_time=N/A|size=786|pos=13077|flags=__
> > > ....
> > > packet|codec_type=video|stream_index=0|pts=10752|pts_
> > > time=0.875000|dts=8192|dts_time=0.666667|duration=512|
> > > duration_time=0.041667|convergence_duration=N/A|
> > > convergence_duration_time=N/A|size=1281|pos=39546|flags=__
> > > packet|codec_type=video|stream_index=0|pts=9728|pts_
> > > time=0.791667|dts=8704|dts_time=0.708333|duration=512|
> > > duration_time=0.041667|convergence_duration=N/A|
> > > convergence_duration_time=N/A|size=749|pos=40827|flags=__
> > > packet|codec_type=video|stream_index=0|pts=9216|pts_
> > > time=0.750000|dts=9216|dts_time=0.750000|duration=512|
> > > duration_time=0.041667|convergence_duration=N/A|
> > > convergence_duration_time=N/A|size=487|pos=41576|flags=__
> > > packet|codec_type=video|stream_index=0|pts=10240|pts_
> > > time=0.833333|dts=9728|dts_time=0.791667|duration=512|
> > > duration_time=0.041667|convergence_duration=N/A|
> > > convergence_duration_time=N/A|size=486|pos=42063|flags=__
> > > packet|codec_type=video|stream_index=0|pts=12800|pts_
> > > time=1.041667|dts=10240|dts_time=0.833333|duration=512|
> > > duration_time=0.041667|convergence_duration=N/A|
> > > convergence_duration_time=N/A|size=3455|pos=42549|flags=_D
> > > packet|codec_type=video|stream_index=0|pts=11776|pts_
> > > time=0.958333|dts=10752|dts_time=0.875000|duration=512|
> > > duration_time=0.041667|convergence_duration=N/A|
> > > convergence_duration_time=N/A|size=848|pos=46004|flags=_D
> > > packet|codec_type=video|stream_index=0|pts=11264|pts_
> > > time=0.916667|dts=11264|dts_time=0.916667|duration=512|
> > > duration_time=0.041667|convergence_duration=N/A|
> > > convergence_duration_time=N/A|size=655|pos=46852|flags=__
> > > packet|codec_type=video|stream_index=0|pts=12288|pts_
> > > time=1.000000|dts=11776|dts_time=0.958333|duration=512|
> > > duration_time=0.041667|convergence_duration=N/A|
> > > convergence_duration_time=N/A|size=901|pos=47507|flags=_D
> > > packet|codec_type=video|stream_index=0|pts=13312|pts_
> > > time=1.083333|dts=12288|dts_time=1.000000|duration=512|
> > > duration_time=0.041667|convergence_duration=N/A|
> > > convergence_duration_time=N/A|size=5112|pos=48408|flags=KD
> > > packet|codec_type=video|stream_index=0|pts=11772|pts_
> > > time=0.958008|dts=11772|dts_time=0.958008|duration=512|
> > > duration_time=0.041667|convergence_duration=N/A|
> > > convergence_duration_time=N/A|size=5112|pos=48408|flags=K_
> > > packet|codec_type=video|stream_index=0|pts=13820|pts_
> > > time=1.124674|dts=12800|dts_time=1.041667|duration=512|
> > > duration_time=0.041667|convergence_duration=N/A|
> > > convergence_duration_time=N/A|size=1179|pos=53520|flags=__
> > > ....
> > > As we can see the dts are still non monotonic.
> > >
> > > The problem here is that in case of edit lists, we are trying to make
> the
> > > PTS "not grow" when the packets are being discarded. If we look at the
> > > non-discarded packets in presentation order, in the above example the
> PTS
> > > is approx. linearly growing 10752, 11264, 11772, with a difference of
> > > frame_duration=512 approx. However there are 3 discarded frames that
> we are
> > > outputting in between. So, to grow DTS monotonically linearly we need
> to
> > > grow DTS at a higher frame rate i.e. smaller frame duration.
> > >
>
> > > If we want to make DTS  satisfy the two constraints described above , I
> > > can recompute the DTS based on an assumed constant frame rate, and
> shift
> > > the DTS so that PTS > DTS always, but this solution wont work for
> videos
> > > with variable frame rates.
> > > So in short , these are the options available I see for the DTS
> > > i) Keep the change as it is , with non monotonic DTS
> > > ii) Mark DTS=AV_NOPTS_VALUE for discarded frames. Maybe it is right
> > > semantic to have DTS=N/A for some packets. I don't know.
> > > iii) Recompute the DTS as stated above.
>
> I guess i or ii are the least evil
> staying with i and changing if it causes issues is probably least work
>
> Thanks. I will stay with (i) and if there are any problems consider making
the DTS to AV_NOPTS for discarded packets.

>
> > >
> > > Attaching all the 6 patches and the test file.
>
> the mov patch doesnt apply cleanly, 1 patch per mail is preferred as
> well
>
> the ffprobe patch breaks fate
>  Strings Metadata|8
> -video|0|37|1.480000|34|1.360000|1|0.040000|N/A|N/A|24786|211456|K|1
> +video|0|37|1.480000|34|1.360000|1|0.040000|N/A|N/A|24786|211456|K_|1
>  Strings Metadata|8
>
> Fixed a lot of ffprobe fate tests for the 6th patch.

>
> also the mov patch breaks fate:
>
> --- ./tests/ref/fate/copy-trac236       2016-09-14 15:56:10.845886847 +0200
> +++ tests/data/fate/copy-trac236        2016-09-14 22:44:44.494403284 +0200
> @@ -1,5 +1,5 @@
> -9b95afdb39b426a33bc962889f820aed *tests/data/fate/copy-trac236.mov
> -630802 tests/data/fate/copy-trac236.mov
> +d6e3d97b522ce881ed29c5da74cc7e63 *tests/data/fate/copy-trac236.mov
> +630810 tests/data/fate/copy-trac236.mov
>  #tb 0: 100/2997
>  #media_type 0: video
>  #codec_id 0: rawvideo
> Test copy-trac236 failed. Look at tests/data/fate/copy-trac236.err for
> details.
> make: *** [fate-copy-trac236] Error 1
>
> either way, new fate samples uploaded
>
> The copy-trac236 test is essentially doing this.
./ffmpeg -f mov -i  fate-suite/mov/fcp_export8-236.mov -codec copy  -map 0
-f mov -y copy-trac236.mov
echo "$(md5 copy-trac236.mov) $(framecrc copy-trac236.mov ) "

If we compare the output mov produced by head and the patch, with patch we
are producing output of correct duration (because we are parsing edit list
in fcp_export8-236.mov and correctly truncating the last packet duration )
. without patch we produce a file 0.03 sec longer.
./ffmpeg_head -f mov -i  fate-suite/mov/fcp_export8-236.mov -codec copy
 -map 0 -f mov -y copy-trac236-head.mov
./ffmpeg_patch -f mov -i  fate-suite/mov/fcp_export8-236.mov -codec copy
 -map 0 -f mov -y copy-trac236-patch.mov

and just compare the input durations
./ffprobe_head fate-suite/mov/fcp_export8-236.mov - 0.14
./ffprobe_head copy-trac236-patch.mov - 0.14
./ffprobe_head copy-trac236-head.mov - 0.17

This is because we correctly truncate the last video packet duration when
parsing edit lists,
diff <(./ffprobe -show_packets -print_format compact fcp_export8-236.mov )
 <(./ffprobe_head -show_packets -print_format compact fcp_export8-236.mov |
sed 's|$|_|g' )
12c12
<
packet|codec_type=video|stream_index=0|pts=400|pts_time=0.133467|dts=400|dts_time=0.133467|duration=31|duration_time=0.010344|convergence_duration=N/A|convergence_duration_time=N/A|size=120000|pos=512836|flags=K_
---
>
packet|codec_type=video|stream_index=0|pts=400|pts_time=0.133467|dts=400|dts_time=0.133467|duration=100|duration_time=0.033367|convergence_duration=N/A|convergence_duration_time=N/A|size=120000|pos=512836|flags=K_

Attaching the files and the patches. I'll mail the patches separately too.

Thanks.

> Thanks
>
> [...]
>
> --
> Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
>
> No human being will ever know the Truth, for even if they happen to say it
> by chance, they would not even known they had done so. -- Xenophanes
>
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel at ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-avutil-frame-Add-a-flag-to-discard-frame-after-decod.patch
Type: text/x-patch
Size: 1670 bytes
Desc: not available
URL: <http://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20160915/f0f11379/attachment.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0004-lavf-mov-Add-support-for-edit-list-parsing.patch
Type: text/x-patch
Size: 118955 bytes
Desc: not available
URL: <http://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20160915/f0f11379/attachment-0001.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0003-lavc-Add-a-flag-in-AVPacket-to-discard-packet-after-.patch
Type: text/x-patch
Size: 2549 bytes
Desc: not available
URL: <http://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20160915/f0f11379/attachment-0002.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0002-avformat-avframe.h-Add-a-flag-in-AVIndexEntry-to-dis.patch
Type: text/x-patch
Size: 1213 bytes
Desc: not available
URL: <http://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20160915/f0f11379/attachment-0003.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0006-ffprobe.c-Indicate-decode-but-discard-packets-when-d.patch
Type: text/x-patch
Size: 96472 bytes
Desc: not available
URL: <http://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20160915/f0f11379/attachment-0004.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0005-lavf-utils-Support-av_index_search_timestamp-in-case.patch
Type: text/x-patch
Size: 1113 bytes
Desc: not available
URL: <http://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20160915/f0f11379/attachment-0005.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: copy-trac236-head.mov
Type: video/quicktime
Size: 630835 bytes
Desc: not available
URL: <http://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20160915/f0f11379/attachment.mov>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: copy-trac236-patch.mov
Type: video/quicktime
Size: 630810 bytes
Desc: not available
URL: <http://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20160915/f0f11379/attachment-0001.mov>