[FFmpeg-devel] [PATCH v2] doc/filters.texi: complete rewrite of fps filter doc, v2.

Sun May 3 01:32:52 EEST 2020

On 2020-05-01 02:52, Carl Eugen Hoyos wrote:
> Am Fr., 1. Mai 2020 um 06:05 Uhr schrieb <list+ffmpeg-dev at jdlh.com>:
>> From: Jim DeLaHunt <from+ffmpeg-dev at jdlh.com>
>>
>> Fix unclear wording and spelling mistakes based on review.
>> Reduce overall word count by 11%.
> This is heavily misleading and definitely not ok.

Oh, I get it! Are you saying that you expect the commit message to be a 
summary of the change between the patch and the current state of the 
master branch it modifies? That's a reasonable expectation, but it 
wasn't what I had in mind when I wrote the commit message.

Yes, I can certainly write a better commit message.

>> Ready for patch review.
>>
>> No other docs, and no executable code, are changed.
>>
>> Signed-off-by: Jim DeLaHunt <from+ffmpeg-dev at jdlh.com>
>> ---
>>   doc/filters.texi | 157 +++++++++++++++++++++++++++++++++++++++++------
>>   1 file changed, 138 insertions(+), 19 deletions(-)
>>
>> diff --git a/doc/filters.texi b/doc/filters.texi
>> index d19fd346ae..0d7d15c448 100644
>> --- a/doc/filters.texi
>> +++ b/doc/filters.texi
>> @@ -11194,25 +11194,30 @@ format=pix_fmts=yuv420p|yuv444p|yuv410p
>>   @anchor{fps}
>>   @section fps
>> -Convert the video to specified constant frame rate by duplicating or dropping
>> -frames as necessary.
>> +Make a new video from the frames and presentation time stamps (PTSs) of the
> While I am not a native speaker, the idea that the fps filter "makes a
> new video"
> is absurd imo (independently of what I may have written in the past, the
> documentation has higher standards than the user mailing list).

Well, I don't feel strongly about the wording "make a new video", even 
though I think it is accurate. If you are really attached to the word 
"convert", how about:

    Convert the video by giving it a specified constant frame rate, by
    keeping frames generally — though repeating or dropping some frames
    as necessary — and by giving it new PTSs. You can choose the method
    for rounding from input PTS to output PTS. This affects which frames
    @var{fps} keeps, repeats, or drops.

(The philosophical question is: is a copy with modifications the same 
thing as the original, or a new thing? Are the buildings of the Ise 
Grand Shrine 
<https://blog.longnow.org/02013/10/03/alexander-rose-visits-ise-shrine-reconstruction-ceremony/>, 
which are rebuilt every 20 years, 2000 years old, or 7 years old?)

>> +input. The new video has a specified constant frame rate, and new PTSs. It
>> +generally keeps frames from the old video, but might repeat or drop some frames.
>> +You can choose the method for rounding from input PTS to output PTS. This
>> +affects which frames @var{fps} keeps, repeats, or drops.
>>
>>   It accepts the following parameters:
>>   @table @option
>>
>>   @item fps
>> -The desired output frame rate. The default is @code{25}.
>> +The output frame rate, in frames per second. May be an integer, real, or
>> +rational number, or an abbreviation. The default is @code{25}.
> I seriously wonder who really profits from this change but it may be ok.

Who profits from this change is someone who who wants to use FFmpeg to 
change the frame rate of a video, and what value they can plug into the 
*fps* parameter, without having to become an FFmpeg developer, or to 
read the source code.

Remember, most users of FFmpeg do not know as much about FFmpeg as you 
core developers do. They aren't stupid or lazy, they have just not 
chosen to study FFmpeg's source code as extensively as you have.

>>   @item start_time
>> -Assume the first PTS should be the given value, in seconds. This allows for
>> -padding/trimming at the start of stream. By default, no assumption is made
>> -about the first frame's expected PTS, so no padding or trimming is done.
>> -For example, this could be set to 0 to pad the beginning with duplicates of
>> -the first frame if a video stream starts after the audio stream or to trim any
>> -frames with a negative PTS.
>> +A time, in seconds from the start of the input stream
> I don't think this is correct, somebody has to confirm this before the change
> is acceptable.

I think this is one of the two high-level objections I am hearing to 
this patch. The new doc text says something different than the old text 
did. I claim that the new text describes the behaviour of the source 
code correctly, and that the old text was wrong to some extent. The only 
way I can see to resolve this is for someone you trust to read the 
source code, and evaluate which of the old text or the new text is more 
accurate.

Anyone volunteer to read the vf_fps.c and referee this objection?

> Since it contradicts the text above, this should be explicitely mentioned in
> the commit message (if correct).
Agreed. I understand now what you expect from the commit message, and I 
can include a something about each old paragraph which I replace with a 
correction.
>> , which @var{fps} converts
>> +to an input starting PTS and an output starting PTS. If set,
>> + at var{fps} drops input frames which have PTSs less than the input starting
>> +PTS. If not set, the input and output starting PTSs are zero, but
>> + at var{fps} drops no input frames based on PTS. (See details below.)
>>
>>   @item round
>> -Timestamp (PTS) rounding method.
>> +Rounding method to use when calculating output PTSs from input PTSs.
>> +If the calculated output PTS is not exactly an integer, then the value
>> +determines which neighbouring integer value @var{fps} selects.
>>
>>   Possible values are:
>>   @table @option
>> @@ -11225,43 +11230,157 @@ round towards -infinity
>>   @item up
>>   round towards +infinity
>>   @item near
>> -round to nearest
>> +round to nearest (midpoints round away from 0)
>>   @end table
>>   The default is @code{near}.
>>
>>   @item eof_action
>> -Action performed when reading the last frame.
>> +Action which @var{fps} takes with the final input frame. The input video passes
>> +in a final input PTS, which @var{fps} converts to an output PTS limit.
>> + at var{fps} drops any input frames with a PTS at or after this limit.
> This sounds wrong to me: In which situation are frames dropped?

As I said before, take a look: ffmpeg/libavfilter/vf_fps.c:234-245 
<https://github.com/FFmpeg/FFmpeg/blob/1128aa875367f66ac11adc30364d5652919a2591/libavfilter/vf_fps.c#L234-L245>, 
within write_frame(). Here is the code:

     /* There are two conditions where we want to drop a frame:

      * - If we have two buffered frames and the second frame is acceptable

      *   as the next output frame, then drop the first buffered frame.

      * - If we have status (EOF) set, drop frames when we hit the

      *   status timestamp. */

     if ((s->frames_count == 2 && s->frames[1]->pts <= s->next_pts) ||

         (s->status            && s->status_pts     <= s->next_pts)) {

         frame = shift_frame(ctx, s);

         av_frame_free(&frame);

         *again = 1;

         return 0;

When the input pad declares end of stream, it provides an input PTS 
which is the ending PTS, and a non-zero status code for *s->status*. The 
*fps* filter sets *s->status_pts* to an output PTS value converted from 
the ending input PTS based on the rules of /eof_action/ and /start_time/ 
. As a loop invariant, *s->next_pts* is the output PTS value which the 
*fps* filter intends to assign to the next frame it outputs. As it 
outputs the frame, the *fps* filter increments *s->next_pts*. See 
ffmpeg/libavfilter/vf_fps.c:254 
<https://github.com/FFmpeg/FFmpeg/blob/1128aa875367f66ac11adc30364d5652919a2591/libavfilter/vf_fps.c#L254>

frame->pts  = s->next_pts++;

Does this explain to you the situations in which frames are dropped?  
Especially for the next doc text, "@var{fps} drops any input frames with 
a PTS at or after this limit."?

>>   Possible values are:
>>   @table @option
>>   @item round
>> -Use same timestamp rounding method as used for other frames.
>> +Use same rounding method as for other frames.
>>   @item pass
>> -Pass through last frame if input duration has not been reached yet.
>> +Round the ending input PTS using @code{up}. This might make @ref{fps} include
>> +one last input frame.
>>   @end table
> This seems to contradict the original description. Is this intended?

Yes, intended. The code I included above is what "might make @ref{fps} 
include one last input frame." I find that code very hard to understand, 
however. It doesn't clarify how many times the app calls *activate()* to 
cycle the final frames through the filter. It doesn't clarify the values 
of *s->status_pts* and *s->next_pts* during those calls to *activate()*. 
It really does not explain how it decides "if input duration has not 
been reached yet".

So, I'm pretty confident that the old text is not helpful as an 
explanation of how the code behaves. I am confident the new text is more 
helpful. I am less confident that the new text is completely correct.  I 
believe that even incremental improvements are helpful, even if they 
aren't perfect.

>> +
>>   The default is @code{round}.
>>
>>   @end table
>>
>> -Alternatively, the options can be specified as a flat string:
>> +Alternatively, the options may be specified as a flat string:
>>   @var{fps}[:@var{start_time}[:@var{round}]].
>>
>> -See also the @ref{setpts} filter.
>> + at var{fps} makes an output video with consecutive integer PTSs, and with a
>> +time base set to the inverse of the given frame rate. @var{fps} keeps, repeats,
>> +or drops input frames, in sequence, to the output video. It does so according
>> +to their input PTSs, as converted to seconds (via the input time base),
> Why would the fps filter convert the PTS to seconds?
> And how is this relevant for users of the filter?

Well, firstly, the new doc text explains what the code does. The code 
does multiply an input PTS by input time_base, which gives a value in 
seconds. Then it divides by output time_base (which is set to 1/fps 
parameter), giving an output PTS. See 
ffmpeg/libavfilter/vf_fps.c:199-201 
<https://github.com/FFmpeg/FFmpeg/blob/1128aa875367f66ac11adc30364d5652919a2591/libavfilter/vf_fps.c#L199-L201> 
<https://github.com/FFmpeg/FFmpeg/blob/1128aa875367f66ac11adc30364d5652919a2591/libavfilter/vf_fps.c#L199-L201>:

     frame->pts = s->out_pts_off + av_rescale_q_rnd(in_pts - s->in_pts_off,
                                                    inlink->time_base, outlink->time_base,
                                                    s->rounding | AV_ROUND_PASS_MINMAX);

And this is relevant for any user who is trying to figure out which 
frames the *fps* filter will repeat or drop, especially when a) the 
frame rate in the *fps* parameter is different from the input frame 
rate, or b) the input PTSs do not advance by constant 1/frame_rate 
increments.

This paragraph is new. The old doc text was simply silent about this 
part of how the filter behaves. I want to fill in that gap in the 
documentation.

>> +then rounded to output PTSs.
> In which situation does fps round?

fps does not round. The seconds, resulting from converting the input 
PTS, round as they convert to output PTSs.

The rounding happens in any situation where the rational number 
(/input_time_base/ / /output_time_base/) is not an integer, so in any 
case where /input_time_base/ is different from /k/ * /output_time_base/ 
for some integer /k/. See the expression from lines 199-201 (above). In 
these cases, the raw computation /input_pts/ * (/input_time_base/ / 
/output_time_base/) will give a real result with a non-zero fractional 
part, and that will require rounding the /output_pts/.

>> + at var{fps} sets output PTSs in terms of a timeline which starts at
>> +zero. For any output frame, the integer PTS multiplied by the time base
>> +gives a value in seconds on that timeline. If the @var{start_time}
>> +parameter is not set, or is zero, the first output frame's PTS is zero.
>> +Otherwise, the first PTS is the output starting PTS calculated
>> +from the @var{start_time} parameter.
>> +
>> + at var{fps} interprets input PTSs in terms of the same timeline. It
>> +multiplies each input frame's PTS by the input time base, to get a value
>> +in seconds on the timeline. It rounds that value to an integer output PTS.
>> +For example, if the input video has a frame rate of 30 fps, a time base
>> +of 1/30 seconds, and a first frame with a PTS of 300, then @var{fps} treats that
>> +first frame as occurring 10 seconds (= 300 * 1/30) after the start of the video,
>> +even though it is the first frame.
>> +
>> +Setting a @code{start_time} value allows for padding/trimming at the
>> +start of the input. For example, you can set @code{start_time} to 0, to pad the
>> +beginning with repeats of the first frame if a video stream starts after
>> +the audio stream, or to trim any frames with a negative PTS. When
>> + at code{start_time} is not set, the @var{fps} filter does not pad or trim
>> +starting frames, as long as they contain PTSs.
>> +
>> +See also the @ref{setpts} and @var{settb} filters.
>> +
>> + at subsection Details
>> +
>> + at var{fps} outputs exactly one frame for each output PTS. If there is
>> +exactly one input frame with an input PTS which converts to the current output
>> +PTS, @var{fps} keeps (outputs) that frame. If there are multiple frames which
>> +convert to the same output PTS, @var{fps} outputs the final frame of that
>> +group, and drops the previous frames.
>> +
>> +If the input frame PTS converts to an output PTS later than the current
>> +output PTS, @var{fps} repeats the previously output frame as the current frame.
>> +When this happens for the first input frame,  @var{fps} "pads" — outputs
>> +repetitions of — that first frame until the output PTS reaches the value
>> +converted from that first frame's input PTS.
>> +
>> + at var{fps} always drops input frames which have no PTS set, regardless
>> +of the @var{start_time} parameter.
>> +
>> +The @var{frame rate} value must be zero or greater. It may be provided in a
>> +variety of forms. Each form is converted into a rational number, with an
>> +integer numerator and denominator.
>> +
>> + at itemize
>> +
>> + at item
>> +An integer number, e.g. @code{25}. This converts to the rational number
>> + at code{25/1}.
>> +
>> + at item
>> +A real number, e.g. @code{3.14145926}. This converts to a rational number,
>> +e.g. @code{954708/303893}
>> +
>> + at item
>> +A rational number. The numerator and denominator may be either integers or real
>> +numbers. e.g. @code{30/1.001} or @code{-30000/-1001}, which both convert to
>> + at code{30000/1001}. The denominator must be non-zero.
>> +
>> + at item
>> +An abbreviation. e.g @code{ntsc} as @code{30000/1001},
>> + at code{ntsc-film} as @code{24000/1001}. See the complete list at
>> + at ref{Video rate,,the "Video rate" section in the ffmpeg-utils(1) manual,ffmpeg-utils}.
>> +
>> + at end itemize
>> +
>> + at var{fps} defines a sync point on the timeline, where one input PTS and one
>> +output PTS occur at the same moment. It calculates other PTSs as time offsets
>> +from this sync point. This affects the details of rounding. If @var{start_time}
>> +is set, then @var{fps} uses it to calculate input and output PTSs, and makes
>> +them the sync point. Otherwise, input and output PTS of zero are the sync point.
>> +
>> +Note that @var{fps} does not assume that input frames are separated by exactly
>> +1/frame_rate seconds. It takes the input PTSs literally. If the increment
>> +of PTS between frames varies along the video, @var{fps} treats those frames as
>> +happening at varying time intervals.
>> +
>> +An input video with PTSs starting past zero might yield unexpected results.
>> +Suppose the input PTSs start at 300, and say that converts to 10 seconds.
>> +Then @var{fps} repeats the first frame to fill the first 10 seconds of the
>> +output video. (However, @command{ffmpeg} may suppress those repeated frames,
>> +depending on the @option{-vsync} setting.) If you set @var{start_time} to 10
>> +seconds, then @var{fps} sets the sync point to the PTSs converted from
>> +10 seconds on the timeline. It no longer repeats the first frame. And, it starts
>> +the output PTS at a value corresponding to 10 seconds, instead of zero.
>>
>>   @subsection Examples
>>
>>   @itemize
>>   @item
>> -A typical usage in order to set the fps to 25:
>> +A typical usage to make a video with a frame rate of 25 frames per
>> +second, from an input video with any frame rate:
>>   @example
>>   fps=fps=25
>>   @end example
>> -
>> +The output frames have PTSs of 0, 1, 2, etc. The frame rate is @code{25/1}. The
>> +time base is @code{1/25}.
>>   @item
>> -Sets the fps to 24, using abbreviation and rounding method to round to nearest:
>> +Make a video with a frame rate of 24 frames per second, using an abbreviation,
>> +and rounding method to round to nearest:
>>   @example
>>   fps=fps=film:round=near
>>   @end example
>> +
>> + at item
>> +Clean up a video with varying time between frames, and dropped frames.
>> +The input video is supposed to have an NTSC standard frame rate of 29.97 frames
>> +per second, and the time base is @code{3003/90000}, but the PTSs increment
>> +variably at slightly more and less than that rate. The recorder dropped some
>> +frames, but the PTSs still reflect when the remaining frames were captured.
>> + at example
>> +fps=fps=30/1.001:round=near
>> + at end example
>> +The output PTSs are 0, 1001, 2002, etc. The time between frames is exact.
>> +The output frame rate is @code{30000/1001}. The time base is @code{1001/30000}.
>> +Where frames were dropped by the recorder, @var{fps} repeated frames to fill
>> +the gaps.
> This sections are still far too long.

And I think this is second of the two high-level objections I am hearing 
to this patch. I am hearing that you think the cost of too many words in 
the filter doc is greater than the benefit of having the filter clearly 
explained.

All the paragraphs you skipped over, except for the first two examples, 
are new explanations which were missing from the old doc text. I am not 
being paid by the word; I included those new explanations because they 
were important things which are necessary to understand what the *fps* 
filter actually does.

If you want doc which explains FFmpeg properly, to users who aren't 
already FFmpeg core developers, I think you will inevitably have more 
words in the doc. So I interpret this second object to be, it's not 
important to explain FFmpeg**properly except to people who also read the 
ffmpeg/ code top to bottom. I think that is disappointing.

> The patch is not ok.

Carl Eugen, I watch you on the -devel and the -users lists. You are 
tireless in responding to message after message. You have read through 
both my patches to the *fps* doc, and you have given me more detailed 
feedback than anyone else on the list. I appreciate that. I can't 
imagine that you opened those patch emails with joy. I hope that your 
extensive work on these lists at least gives you satisfaction, if not joy.

Nevertheless, I disagree with you about the two high-level objections to 
this patch. I don't know how to move forward. Maybe it will take others 
to weigh in on the value of documentation for regular users, and to 
referee reading the code of *vf_fps.c *against the old and new doc text.

Your point that the commit message should be rewritten, I completely 
accept. Your other comments about wording, like "make" versus "convert" 
a video, I think are completely fixable.

If I can see a way to get past the two high-level objections, I am happy 
to work on a revised patch correcting the other problems.

> Carl Eugen
Best regards,

      —Jim DeLaHunt, software engineer, Vancouver, Canada