[FFmpeg-user] Why are PTS values different from what's expected?

Fri Apr 2 04:13:09 EEST 2021

Mark Filipak (ffmpeg) wrote
> On 2021-04-01 13:40, pdr0 wrote:
>> 
>> This zip file example has the original 24000/1001, weighted frame
>> blending
>> to 120000/1001, and decimation to 60000/1001 - is this something close to
>> what you had in mind ?
>> https://www.mediafire.com/file/qj819m3vctx4o4q/blends_example.zip/file
> 
> Thanks for that (...I wish I knew how you are making those...).
> convertfps_119.88(blends).mp4 actually looks to be the better choice for
> my 60Hz TV -- the TV is 
> interpolating well -- but I think the weighting could be tweaked (which is
> something I planned to do 
> once my filter complex was working properly).

I'm using AviSynth because it has preset functions for everything. For example,
ConvertFPS does blended frame rate conversions in one line, so you don't
have to break it down into select, merge and interleave steps. I'll post a
VapourSynth template so you can reproduce it and experiment with
different weights. You've used VapourSynth before, and there you won't be
bothered by PTS issues or frames out of order.

Given the way they were made, those two files should look identical on a
60Hz display. The 60Hz display only shows every 2nd frame of the
120000/1001 fps video.

Mark Filipak (ffmpeg) wrote
> On 2021-04-01 13:40, pdr0 wrote:
>> Mark Filipak (ffmpeg) wrote
>>> What I'm trying to do is make a 120000/1001fps cfr in which each frame
>>> is
>>> a proportionally weighted
>>> pixel mix of the 24 picture-per-second original:
>>> AAAAA AAAAB AAABB AABBB ABBBB.
>>> I'm sure it would be way better than standard telecine -- zero judder --
>>> and I'm pretty sure it
>>> would be so close to motion vector interpolation that any difference
>>> would
>>> be imperceptible. I'm
>>> also sure that it would be a much faster process than mvinterpolate. The
>>> only question would be
>>> resulting file size (though I think they would be very close).
>> 
>> Is this 120000/1001 CFR intended for a 60Hz display? ...
> 
> Yes, or a 120Hz display.
> 
> Please, correct me if I'm wrong.
> 
> The 120fps frames are conveying 24 pictures-per-second, i.e. 5 discrete
> frames per picture with the 
> 1st of each set of 5 being an identical duplicate of the original (e.g.
> AAAAA), so down converting 
> to 60fps is not a simple case of dropping alternate frames (i.e. 5/2 is
> not an integer).
> The lines below are: 24FPS / 120FPS / 60FPS (BY ALTERNATE FRAME DROP).
> AAAAAAAAAAAAAAAAAAAAAAAAAAAAA.BBBBBBBBBBBBBBBBBBBBBBBBBBBBB.CCCCCCCCCCCCCCCCCCCCCCCCCCCCC.DDDDD...
> AAAAA.AAAAB.AAABB.AABBB.ABBBB.BBBBB.BBBBC.BBBCC.BBCCC.BCCCC.CCCCC.CCCCD.CCCDD.CCDDD.CDDDD.DDDDD...
> AAAAA.AAABB.ABBBB.BBBBC.BBCCC.CCCCC.CCCDD.CDDDD.DDDDE.DDEEE.EEEEE.EEEFF.EFFFF.FFFFG.FFGGG.GGGGG...
> 
> Note how, in the 60FPS stream, there is no BBBBB or DDDDD or FFFFF frames.
> That continues and any 
> loss of 'pure' frames would probably be noticeable. ...UPDATE: The loss is
> noticeable (as flicker).

Looks correct - that's the same thing going on with the
59.94(blends,decimation) file - every 2nd frame is selected (selecteven)

That's also what's going on when you watch the 120000/1001 file on most 60Hz
displays.
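
As a rough illustration of the selecteven case, here is a sketch in plain
Python (not part of any VapourSynth or ffmpeg script; blend_label is a
made-up helper name). It labels each 120000/1001 frame with its blend and
then keeps every 2nd frame:

```python
from string import ascii_uppercase

def blend_label(n, cycle=5):
    """Label for 120 fps frame n, e.g. 'AAABB' = 3 parts A, 2 parts B."""
    picture, step = divmod(n, cycle)
    a = ascii_uppercase[picture % 26]
    b = ascii_uppercase[(picture + 1) % 26]
    return a * (cycle - step) + b * step

frames_120 = [blend_label(n) for n in range(20)]
frames_60 = frames_120[::2]  # every 2nd frame, like selecteven

# frames made of a single source picture only
pure_60 = [f for f in frames_60 if len(set(f)) == 1]

print(".".join(frames_120))
print(".".join(frames_60))
print(pure_60)
```

The pure B and D frames sit on odd frame numbers (5 and 15), so they never
appear in the every-2nd-frame list.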

> My cheap 60Hz TV accepts 120fps and it apparently interpolates during down
> conversion to 60fps. I 
> assume that's true of all 60Hz TVs because, given that the frames are sent
> to the TV as raw frames 
> via HDMI, doing pixel interpolation in real time within the TV is a snap,
> so all 60Hz TVs probably 
> do it fine.

In what way does it "interpolate"?

If it's doing anything other than dropping frames, there must be additional
processing in your TV set - and "additional processing" is definitely not
standard for "cheap" sets.  The majority of 60Hz displays will only display
every 2nd frame, no interpolation. 

>  
> 
>> "proportionally weighted pixel mix" it sounds like a standard frame
>> blended
>> conversion . eg. You drop a 23.976p asset on a 120000/1001fps timeline in
>> a
>> NLE and enable frame blending. Or in avisynth it would be
>> ConvertFPS(120000,1001)
> 
> Well, I don't know what "NLE" means, so you'll need to enlighten me. 

NLE stands for "non-linear editor", software used to edit video. Blend
conversion is one standard type of frame rate conversion. For example, many
TV stations used to perform this sort of blend conversion between different
frame rates (some still do). It's frowned upon in many circles, but it has
its pros and cons.

> First, let me redefine AAAAA.AAAAB.AAABB.AABBB.ABBBB.BBBBB as
> 5A0B.4A1B.3A2B.2A3B.1A4B.0A5B.

i.e. 5-frame cycles with linear interpolation of the weights. This is the
same pattern as in the zip file that was posted.

Here is a VapourSynth template that does (almost) the same thing, which you
can use until you get the PTS values sorted out in ffmpeg. The one difference
is that an extra frame is added (for programmatic reasons, the 1st frame is
spliced onto the repeating interleave pattern), but the end result is
otherwise identical, e.g. frame 0 matches frame 0, frame 100 matches frame
100, etc.

import vapoursynth as vs
core = vs.core  #vs.get_core() is deprecated in newer VapourSynth releases
clip = core.lsmas.LibavSMASHSource(r'23.976p.mp4')

#create clip offset by 1 frame (trim off first frame)
offset1 = core.std.Trim(clip, first=1)

#weighting
Aweight = core.std.Merge(clip, offset1,  weight=0) #not used in this example
Bweight = core.std.Merge(clip, offset1,  weight=0.2)
Cweight = core.std.Merge(clip, offset1,  weight=0.4)
Dweight = core.std.Merge(clip, offset1,  weight=0.6)
Eweight = core.std.Merge(clip, offset1,  weight=0.8)
Fweight = core.std.Merge(clip, offset1,  weight=1)

#splice first frame plus interleave pattern
#(parentheses let the expression continue onto the next line)
interleaved = (core.std.Trim(clip, first=0, last=0) +
    core.std.Interleave(clips=[Bweight,Cweight,Dweight,Eweight,Fweight]))

#assume frame rate (and timestamps)
interleaved = core.std.AssumeFPS(interleaved, fpsnum=120000, fpsden=1001)

selecteven = interleaved[::2] #selecteven

interleaved.set_output()
#selecteven.set_output()
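
For reference when checking PTS values later in ffmpeg, the timestamps a
constant 120000/1001 fps stream should carry can be worked out directly.
This is only arithmetic in plain Python, assuming the first frame starts at
0. The name expected_pts is a hypothetical helper, not a VapourSynth or
ffmpeg function:

```python
from fractions import Fraction

def expected_pts(n, fps_num, fps_den):
    """Presentation time in seconds of frame n in a constant-frame-rate stream."""
    return n * Fraction(fps_den, fps_num)

# 120000/1001 fps: one frame lasts 1001/120000 s
pts_120 = [expected_pts(n, 120000, 1001) for n in range(6)]

# every 2nd frame kept and played at 60000/1001 fps
pts_60 = [expected_pts(n, 60000, 1001) for n in range(3)]

for n, t in enumerate(pts_120):
    print(n, t, float(t))
```

Frame 5 lands at 1001/24000 s, which is exactly one frame period of the
24000/1001 original. In a 1/120000 time base, consecutive frames are 1001
ticks apart.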

> Then, using that notation, how about these?
> 5A1B.4A2B.3A3B.3A3B.2A5B.1A5B [note 1]
> 9A1B.5A5B.7A3B.3A7B.5A5B.1A9B [note 2]
> 
> [note 1] Might produce some additional fuzzing (but less noticeable
> flicker) for fast motion, but 
> less fuzzing as A & B approach identity (i.e. slower motion).
> 
> [note 2] Might produce a small, 48fps judder in exchange for doubling the
> flicker rate (which might 
> be less noticeable).

You can test them in that VapourSynth template as a proof of concept, at
least until you get the PTS sorted out in ffmpeg.
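
To plug those patterns into the template, each "xAyB" term has to become a
Merge weight. A minimal sketch in plain Python, assuming the weight is simply
y/(x+y) (the weight in std.Merge is the share of the second clip). The name
merge_weights is a made-up helper, not part of VapourSynth:

```python
from fractions import Fraction
import re

def merge_weights(pattern):
    """Turn a pattern like '4A1B.3A2B' into the weight of B for each term."""
    weights = []
    for token in pattern.split("."):
        m = re.fullmatch(r"(\d+)A(\d+)B", token)
        if m is None:
            raise ValueError(f"unrecognised term: {token!r}")
        a, b = int(m.group(1)), int(m.group(2))
        weights.append(Fraction(b, a + b))  # share of the second clip
    return weights

linear = merge_weights("5A0B.4A1B.3A2B.2A3B.1A4B.0A5B")
note2 = merge_weights("9A1B.5A5B.7A3B.3A7B.5A5B.1A9B")

print([float(w) for w in linear])
print([float(w) for w in note2])
```

The first pattern gives 0, 0.2, 0.4, 0.6, 0.8, 1, which are the weights
already in the template. Each term is normalized on its own, so how a
6-term pattern maps onto the 5-frame cycle is still something you would
have to decide.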

> Another thing I've discovered in the past is that using a checkerboard mix
> in lieu of a blended mix 
> looks much sharper -- perhaps this is a "trick of the eye", human
> perception thing.

Yes, I recall that from your other thread. Personally, I didn't like it.

> I'd like to try all of the above, but first I need to get my filter
> complex working (which means 
> solving its PTS errors (which means discovering the effect on PTS of
> various filters (which means 
> more experimenting because the documentation is silent on the issue))). 

I think the setts bitstream filter (bsf) might help; it looks promising.
