[FFmpeg-user] An object oriented video notation, 1 & 2 of 11

Mark Filipak (ffmpeg) markfilipak at bog.us
Mon Jan 31 21:39:55 EET 2022

On 2022-01-31 03:38, Michael Koch wrote:
> Am 31.01.2022 um 05:03 schrieb Mark Filipak (ffmpeg):
>> During the past year, I've developed a set of true stream primitives that are object oriented 
>> (frames, pictures, ...), plus a notation to describe encodings and "mechanical" manipulations that 
>> are useful and that are fairly easy to read and understand. Can the notation be used as a frontend 
>> to FFmpeg? I don't know ...maybe.
>> I have written 11 HTML documents to introduce the notation and to present some novel use-cases.
>>  1, Preface (8 KiB)
>>  2, Teasers (20 KiB)
>>  3, Enhanced Terminology (25 KiB)
>>  4, Reference (19 KiB)
>>  5, Encoding Of DVD & Bluray Content (18 KiB)
>>  6, About Audio (9 KiB)
>>  7, Recovering The Camera Shots (3 KiB)
>>  8, Basic Primitives (21 KiB)
>>  9, Pulldown Primitives (14 KiB)
>> 10, Advanced Interpolations (12 KiB)
>> 11, Seen In The Wild (11 KiB)
>> Writing the docs took a long time and a lot of work. Is there any appetite here for them?
> I'd like to read it.
> Michael

Docs 1 & 2 of 11 are attached. Reading time: about half an hour. "Teasers" in particular is 
probably all you will need to read, but I'll follow it with the rest in the next few minutes.

Motivation: I found that not even the pros on the doom9 forum could adequately describe video 
structures in discussions. They related what they did to videos in terms of the DaVinci and Final 
Cut controls they used -- unknown to me -- but they could not explain what the controls actually did 
or how the video structures were actually manipulated.

I saw a need for a structure notation that is concise, unambiguous, and easy to read. That expanded 
from a notation of what a video's structure is to further notations -- notations as operators -- that 
turn "what is" into "what is desired". The emphasis is on in-stream processing (not splits), on 
concatenating primitive operations, and on treating structures as objects. Whether some future tool 
will interpret the notations as 'instructions' is a question that will have to be left to the 
future.

My Method: Start with a notation of what came out of the camera and is encoded on disc, "Encoding Of 
DVD & Bluray Content", and proceed from there to "Recovering The Camera Shots" and beyond.

Tip: Pay attention to "(look)(firstseen)(seen)(finalseen)(unseen)" notations and the factors x1.001, 
x/1.001, x0.96, and x/0.96.
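
For readers who want to see the arithmetic behind those four factors, here is a minimal sketch. It rests on an assumption this message does not state: that 1.001 is the familiar NTSC ratio 1001/1000 and that 0.96 is the film-to-PAL ratio 24/25. The names NTSC, PAL, and scale are hypothetical and are not part of the notation described in the docs.

```python
from fractions import Fraction

# Assumed meanings (not stated in this message):
#   1.001 = 1001/1000, the NTSC rate ratio (e.g. 24 fps -> 24000/1001 fps)
#   0.96  = 24/25, the film-to-PAL ratio (24 fps material run at 25 fps)
NTSC = Fraction(1001, 1000)
PAL = Fraction(24, 25)


def scale(rate, factor, divide=False):
    """Return rate x factor, or rate / factor when divide is True."""
    rate = Fraction(rate)
    return rate / factor if divide else rate * factor


film = Fraction(24)
ntsc_film = scale(film, NTSC, divide=True)    # x/1.001: 24 -> 24000/1001
back_to_film = scale(ntsc_film, NTSC)         # x1.001:  24000/1001 -> 24
pal_speedup = scale(film, PAL, divide=True)   # x/0.96:  24 -> 25
restored = scale(pal_speedup, PAL)            # x0.96:   25 -> 24

print(ntsc_film, back_to_film, pal_speedup, restored)
```

Exact fractions are used instead of floats so that applying a factor and then its inverse returns the original rate with no rounding drift.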

Thanks for reading my stuff. I eagerly look forward to your thoughts.

Donald Trump: Good King or Red Queen?
