[FFmpeg-devel] [PATCH v4 1/4] doc: Explain what "context" means

Wed May 22 12:31:52 EEST 2024

Sorry for the slow reply.

On date Wednesday 2024-05-15 16:54:19 +0100, Andrew Sayers wrote:
> Derived from detailed explanations kindly provided by Stefano Sabatini:
> https://ffmpeg.org/pipermail/ffmpeg-devel/2024-April/325903.html
> ---
>  doc/context.md | 394 +++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 394 insertions(+)
>  create mode 100644 doc/context.md
> 
> diff --git a/doc/context.md b/doc/context.md
> new file mode 100644
> index 0000000000..fb85b3f366
> --- /dev/null
> +++ b/doc/context.md
> @@ -0,0 +1,394 @@
> +# Introduction to contexts
> +
> +“%Context”

Is this style of quoting needed? In particular, I'd avoid special markup
to keep the unrendered text easy to read (which is the point of markdown,
after all).

> is a name for a widely-used programming idiom.

> +This document explains the general idiom and the conventions FFmpeg has built around it.
> +
> +This document uses object-oriented analogies to help readers familiar with
> +[object-oriented programming](https://en.wikipedia.org/wiki/Object-oriented_programming)
> +learn about contexts.  But contexts can also be used outside of OOP,
> +and even in situations where OOP isn't helpful.  So these analogies
> +should only be used as a first step towards understanding contexts.
> +
> +## “Context” as a way to think about code
> +
> +A context is any data structure that is passed to several functions
> +(or several instances of the same function) that all operate on the same entity.
> +For example, [object-oriented programming](https://en.wikipedia.org/wiki/Object-oriented_programming)
> +languages usually provide member functions with a `this` or `self` value:
> +

> +```cpp
> +class my_cxx_class {
> +  void my_member_function() {
> +    // the implicit object parameter provides context for the member function:
> +    std::cout << this;
> +  }
> +};
> +```

I'm not convinced this is really useful: if you know C++ this is
redundant; if you don't, it is confusing and doesn't add much information.

> +
> +Contexts are a fundamental building block of OOP, but can also be used in procedural code.

I'd drop this line, and drop the anchor on OOP at the same time, since
it doesn't add much information.

> +For example, most callback functions can be understood to use contexts:

> +
> +```c
> +struct MyStruct {
> +  int counter;
> +};
> +
> +void my_callback( void *my_var_ ) {
> +  // my_var provides context for the callback function:
> +  struct MyStruct *my_var = (struct MyStruct *)my_var_;
> +  printf("Called %d time(s)", ++my_var->counter);
> +}
> +
> +void init() {
> +  struct MyStruct my_var;
> +  my_var.counter = 0;
> +  register_callback( my_callback, &my_var );

style: fun(my_callback, ...) (so no spaces inside the parentheses), here
and below

> +}
> +```
> +
> +In the broadest sense, “context” is just a way to think about code.
> +You can even use it to think about code written by people who have never
> +heard the term, or who would disagree with you about what it means.
> +
> +## “Context” as a tool of communication
> +
> +“%Context” can just be a word to understand code in your own head,
> +but it can also be a term you use to explain your interfaces.
> +Here is a version of the callback example that makes the context explicit:
> +
> +```c
> +struct CallbackContext {
> +  int counter;
> +};
> +
> +void my_callback( void *ctx_ ) {
> +  // ctx provides context for the callback function:
> +  struct CallbackContext *ctx = (struct CallbackContext *)ctx_;
> +  printf("Called %d time(s)", ++ctx->counter);
> +}
> +
> +void init() {
> +  struct CallbackContext ctx;
> +  ctx.counter = 0;
> +  register_callback( my_callback, &ctx );
> +}
> +```
> +
> +The difference here is subtle, but important.  If a piece of code
> +*appears compatible with contexts*, then you are *allowed to think
> +that way*, but if a piece of code *explicitly states it uses
> +contexts*, then you are *required to follow that approach*.
> +

> +For example, imagine someone modified `MyStruct` in the earlier example
> +to count several unrelated events across the whole program.  That would mean
> +it contained information about multiple entities, so was not a context.
> +But nobody ever *said* it was a context, so that isn't necessarily wrong.
> +However, proposing the same change to the `CallbackContext` in the later example
> +would violate a guarantee, and should be pointed out in a code review.
> +

I'm not very convinced by the callback example. The use of contexts in
the FFmpeg API is much simpler: they keep track of configuration and
state (that is, they track the "object" to operate on), so the callback
example here is a bit misleading.

Callbacks are used in the internals to implement different elements
(codecs, protocols, filters, etc.) behind a common API, but in
this case the relation with "contexts" is less straightforward.
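
To illustrate what I mean, here is a minimal sketch of a context that
only tracks configuration and state. GainContext and its functions are
hypothetical names made up for this mail, not part of any FFmpeg API:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical context, not an FFmpeg type. */
typedef struct GainContext {
    int    gain;          /* configuration: chosen by the caller */
    size_t samples_seen;  /* state: updated by the functions below */
} GainContext;

/* Every function takes the context first and operates on that one entity. */
static void gain_init(GainContext *ctx, int gain)
{
    assert(ctx != NULL);
    ctx->gain         = gain;
    ctx->samples_seen = 0;
}

/* Scale one sample and record that it was processed. */
static int gain_apply(GainContext *ctx, int sample)
{
    assert(ctx != NULL);
    ctx->samples_seen++;
    return sample * ctx->gain;
}
```

Two GainContext instances can be used side by side without interfering
with each other, which is all that "context" needs to mean at this level.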

> +@warning Guaranteeing to use contexts does not mean guaranteeing to use
> +object-oriented programming.  For example, FFmpeg creates its contexts
> +procedurally instead of with constructors.

I'm afraid this is more confusing than helpful, since the FFmpeg API
is not OOP. I'd drop this sentence.

> +
> +## Contexts in the real world
> +
> +To understand how contexts are used in the real world, it might be
> +useful to compare [curl's MD5 hash context](https://github.com/curl/curl/blob/bbeeccdea8507ff50efca70a0b33d28aef720267/lib/curl_md5.h#L48)
> +with @ref AVMD5 "FFmpeg's equivalent context".
> +

> +The [MD5 algorithm](https://en.wikipedia.org/wiki/MD5) produces
> +a fixed-length digest from arbitrary-length data.  It does this by calculating
> +the digest for a prefix of the data, then loading the next part and adding it
> +to the previous digest, and so on.  Projects that use MD5 generally use some
> +kind of context, so comparing them can reveal differences between projects.
> +
> +```c
> +// Curl's MD5 context looks like this:
> +struct MD5_context {
> +  const struct MD5_params *md5_hash;    /* Hash function definition */
> +  void                  *md5_hashctx;   /* Hash function context */
> +};
> +
> +// FFmpeg's MD5 context looks like this:
> +typedef struct AVMD5 {
> +    uint64_t len;
> +    uint8_t  block[64];
> +    uint32_t ABCD[4];
> +} AVMD5;
> +```
> +
> +Curl's struct name ends with `_context`, guaranteeing contexts are the correct
> +interpretation.  FFmpeg's struct does not explicitly say it's a context, but
> +@ref libavutil/md5.c "its functions do" so we can reasonably assume
> +it's the intended interpretation.
> +
> +Curl's struct uses `void *md5_hashctx` to avoid guaranteeing
> +implementation details in the public interface, whereas FFmpeg makes
> +everything accessible.  This kind of data hiding is an advanced context-oriented
> +convention, and is discussed below.  Using it in this case has strengths and
> +weaknesses.  On one hand, it means changing the layout in a future version
> +of curl won't break downstream programs that used that data.  On the other hand,
> +the MD5 algorithm has been stable for 30 years, so it's arguably more important
> +to let people dig in when debugging their own code.
> +
> +Curl's struct is declared as `struct <type> { ... }`, whereas FFmpeg uses
> +`typedef struct <type> { ... } <type>`.  These conventions are used with both
> +context and non-context structs, so don't say anything about contexts as such.
> +Specifically, FFmpeg's convention is a workaround for an issue with C grammar:
> +
> +```c
> +void my_function( ... ) {
> +  int                my_var;        // good
> +  MD5_context        my_curl_ctx;   // error: C needs you to explicitly say "struct"
> +  struct MD5_context my_curl_ctx;   // good: added "struct"
> +  AVMD5              my_ffmpeg_ctx; // good: typedef's avoid the need for "struct"
> +}
> +```
> +
> +Both MD5 implementations are long-tested, widely-used examples of contexts
> +in the real world.  They show how contexts can solve the same problem
> +in different ways.

I'm concerned that this adds more information than is really
needed. In particular, comparing with the internals of curl means that the
docs now need to be kept in sync with curl's API as well, so
they will become outdated very soon. I'd rather drop the curl comparison
altogether.

> +
> +## FFmpeg's advanced context-oriented conventions
> +
> +Projects that make heavy use of contexts tend to develop conventions
> +to make them more useful.  This section discusses conventions used in FFmpeg.
> +Some are shared with other projects, while others are unique to FFmpeg.
> +
> +### Naming: “Context” and “ctx”
> +
> +```c
> +// Context struct names usually end with `Context`:
> +struct AVSomeContext {
> +  ...
> +};
> +
> +// Functions are usually named after their context,
> +// context parameters usually come first and are often called `ctx`:
> +void av_some_function( AVSomeContext *ctx, ... );
> +```
> +
> +If an FFmpeg struct is intended for use as a context, its name usually
> +makes that clear.  Exceptions to this rule include AVMD5 (discussed above),
> +which is only identified as a context by the functions that call it.
> +
> +If a function is associated with a context, its name usually
> +begins with some variant of the context name (e.g. av_md5_alloc()
> +or avcodec_alloc_context3()).  Exceptions to this rule include
> +@ref avformat.h "AVFormatContext's functions", many of which
> +begin with just `av_`.
> +
> +If a function has a context parameter, it usually comes first and its name
> +often contains `ctx`.  Exceptions include av_bsf_alloc(), which puts the
> +context argument second to emphasise it's an out variable.
> +
> +### Data hiding: private contexts
> +
> +```c
> +// Context structs often hide private context:
> +struct AVSomeContext {
> +  void *priv_data; // sometimes just called "internal"
> +};
> +```
> +
> +Contexts usually present a public interface, so changing a context's members
> +forces everyone that uses the library to at least recompile their program,
> +if not rewrite it to remain compatible.  Hiding information in a private context
> +ensures it can be modified without affecting downstream software.
> +
> +Object-oriented programmers may be tempted to compare private contexts to
> +*private class members*.  That's often accurate, but for example it can also
> +be used like a *virtual function table* - a list of functions that are
> +guaranteed to exist, but may be implemented differently for different
> +sub-classes.  When thinking about private contexts, remember that FFmpeg
> +isn't *large enough* to need some common OOP techniques, even though it's
> +solving a problem that's *complex enough* to benefit from some rarer techniques.
> +
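
A rough sketch of the idea, in case it helps. SomeContext, SomePrivate
and the some_context_*() functions are hypothetical, and both structs are
shown in one block only to keep the example self-contained; in a real
library the private struct would be defined in the .c file only:

```c
#include <assert.h>
#include <stdlib.h>

/* Public part: what a header would expose (hypothetical). */
typedef struct SomeContext {
    int   width;      /* public member, part of the interface */
    void *priv_data;  /* private context, layout hidden from callers */
} SomeContext;

/* Private part: free to change without breaking callers. */
typedef struct SomePrivate {
    int calls;
} SomePrivate;

/* Allocate the public struct together with its zeroed private context. */
static SomeContext *some_context_alloc(void)
{
    SomeContext *ctx = calloc(1, sizeof(*ctx));
    if (!ctx)
        return NULL;
    ctx->priv_data = calloc(1, sizeof(SomePrivate));
    if (!ctx->priv_data) {
        free(ctx);
        return NULL;
    }
    return ctx;
}

/* Only the implementation converts priv_data back to its real type. */
static int some_context_process(SomeContext *ctx)
{
    SomePrivate *priv;
    assert(ctx != NULL);
    priv = ctx->priv_data;
    return ++priv->calls;
}

/* Release both parts and clear the caller's pointer. */
static void some_context_free(SomeContext **pctx)
{
    if (!pctx || !*pctx)
        return;
    free((*pctx)->priv_data);
    free(*pctx);
    *pctx = NULL;
}
```

Callers only ever see `width` and an opaque pointer, so SomePrivate can
gain or lose members without requiring them to recompile.
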
> +### Manage lifetime: allocate, initialize and free
> +
> +```c
> +void my_function( ... ) {
> +
> +    // Context structs are allocated then initialized with associated functions:
> +
> +    AVSomeContext *ctx = av_some_context_alloc( ... );
> +
> +    // ... configure ctx ...
> +
> +    av_some_context_init( ctx, ... );
> +
> +    // ... use ctx ...
> +
> +    // Context structs are freed with associated functions:
> +
> +    av_some_context_free( ctx );
> +
> +}
> +```
> +
> +FFmpeg contexts go through the following stages of life:
> +
> +1. allocation (often a function that ends with `_alloc`)
> +   * a range of memory is allocated for use by the structure
> +   * memory is allocated on boundaries that improve caching
> +   * memory is reset to zeroes, some internal structures may be initialized
> +2. configuration (implemented by setting values directly on the object)
> +   * no function for this - calling code populates the structure directly
> +   * memory is populated with useful values
> +   * simple contexts can skip this stage
> +3. initialization (often a function that ends with `_init`)
> +   * setup actions are performed based on the configuration (e.g. opening files)
> +4. normal usage
> +   * most functions are called in this stage
> +   * documentation implies some members are now read-only (or not used at all)
> +   * some contexts allow re-initialization
> +5. closing (often a function that ends with `_close()`)
> +   * teardown actions are performed (e.g. closing files)
> +6. deallocation (often a function that ends with `_free()`)
> +   * memory is returned to the pool of available memory
> +
> +This can mislead object-oriented programmers, who expect something more like:
> +
> +1. allocation (usually a `new` keyword)
> +   * a range of memory is allocated for use by the structure
> +   * memory *may* be reset (e.g. for security reasons)
> +2. initialization (usually a constructor)
> +   * memory is populated with useful values
> +   * related setup actions are performed based on arguments (e.g. opening files)
> +3. normal usage
> +   * most functions are called in this stage
> +   * compiler enforces that some members are read-only (or private)
> +   * no going back to the previous stage
> +4. finalization (usually a destructor)
> +   * teardown actions are performed (e.g. closing files)
> +5. deallocation (usually a `delete` keyword)
> +   * memory is returned to the pool of available memory
> +
> +FFmpeg's allocation stage is broadly similar to OOP, but can do some higher-level
> +operations.  For example, AVOptions-enabled structs (discussed below) contain an
> +AVClass member that is set during allocation.
> +
> +FFmpeg's "configuration" and "initialization" stages combine to resemble OOP's
> +"initialization" stage.  This can mislead object-oriented developers,
> +who are used to doing both at once.  This means FFmpeg contexts don't have
> +a direct equivalent of OOP constructors, as they would be doing
> +two jobs in one function.
> +
> +FFmpeg's three-stage creation process is useful for complicated structures.
> +For example, AVCodecContext contains many members that *can* be set before
> +initialization, but in practice most programs set few if any of them.
> +Implementing this with a constructor would involve a function with a list
> +of arguments that was extremely long and changed whenever the struct was
> +updated.  For contexts that don't need the extra flexibility, FFmpeg usually
> +provides a combined allocator and initializer function.  For historical reasons,
> +suffixes like `_alloc`, `_init`, `_alloc_context` and even `_open` can indicate
> +the function does any combination of allocation and initialization.
> +
> +FFmpeg's "closing" stage is broadly similar to OOP's "finalization" stage,
> +but some contexts allow re-initialization after finalization.  For example,
> +SwrContext lets you call swr_close() then swr_init() to reuse a context.
> +
> +FFmpeg's "deallocation" stage is broadly similar to OOP, but can perform some
> +higher-level functions (similar to the allocation stage).
> +
> +Very few contexts need the flexibility of separate "closing" and
> +"deallocation" stages, so these are usually combined into a single function.
> +Closing functions usually end with "_close", while deallocation
> +functions usually end with "_free".
> +
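
The stages described in this section could be sketched as follows.
BufContext and the buf_context_*() functions are hypothetical and only
mimic the naming conventions discussed here; the error handling is
reduced to 0 / -1 for brevity:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical context, not an FFmpeg type. */
typedef struct BufContext {
    size_t         size;  /* configuration: set directly by the caller */
    unsigned char *buf;   /* state: created by init, released by close */
} BufContext;

/* allocation: zeroed memory, nothing configured yet */
static BufContext *buf_context_alloc(void)
{
    return calloc(1, sizeof(BufContext));
}

/* initialization: act on the configuration; 0 on success, -1 on error */
static int buf_context_init(BufContext *ctx)
{
    assert(ctx != NULL);
    if (ctx->buf || ctx->size == 0)
        return -1;
    ctx->buf = calloc(ctx->size, 1);
    return ctx->buf ? 0 : -1;
}

/* closing: undo init but keep the configuration, so init can run again */
static void buf_context_close(BufContext *ctx)
{
    assert(ctx != NULL);
    free(ctx->buf);
    ctx->buf = NULL;
}

/* deallocation: close if needed, then release the struct itself */
static void buf_context_free(BufContext **pctx)
{
    if (!pctx || !*pctx)
        return;
    buf_context_close(*pctx);
    free(*pctx);
    *pctx = NULL;
}
```

The configuration stage has no function of its own: the caller simply
assigns `ctx->size` between buf_context_alloc() and buf_context_init().
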
> +### Reflection: AVOptions-enabled structs
> +
> +Object-oriented programming puts more focus on data hiding than FFmpeg needs,
> +but it also puts less focus on
> +[reflection](https://en.wikipedia.org/wiki/Reflection_(computer_programming)).
> +
> +To understand FFmpeg's reflection requirements, run `ffmpeg -h full` on the
> +command-line, then ask yourself how you would implement all those options
> +with the C standard [`getopt` function](https://en.wikipedia.org/wiki/Getopt).
> +You can also ask the same question for any other programming languages you know.
> +[Python's argparse module](https://docs.python.org/3/library/argparse.html)
> +is a good example - its approach works well with far more complex programs
> +than `getopt`, but would you like to maintain an argparse implementation
> +with 15,000 options and growing?
> +
> +Most solutions assume you can just put all options in a single block,
> +which is unworkable at FFmpeg's scale.  Instead, we split configuration
> +across many *AVOptions-enabled structs*, which use the @ref avoptions
> +"AVOptions API" to reflect information about their user-configurable members,
> +including members in private contexts.
> +
> +An *AVOptions-enabled struct* is a struct that contains an AVClass element as
> +its first member, and uses that element to provide access to instances of
> +AVOption, each of which provides information about a single option.
> +The AVClass can also include more @ref AVClass "AVClasses" for private contexts,
> +making it possible to set options through the API that aren't
> +accessible directly.
> +
> +AVOptions-accessible members of a context should be accessed through the
> +AVOptions API whenever possible, even if they're not hidden away in a private
> +context.  That ensures values are validated as they're set, and means you won't
> +have to do as much work if a future version of FFmpeg changes the layout.
> +
> +AVClass was created very early in FFmpeg's history, long before AVOptions.
> +Its name suggests some kind of relationship to an OOP
> +base [class](https://en.wikipedia.org/wiki/Class_(computer_programming)),
> +but the name has become less accurate as FFmpeg evolved, to the point where
> +AVClass and AVOption are largely synonymous in modern usage.  The difference
> +might still matter if you need to support old versions of FFmpeg,
> +where you might find *AVClass context structures* (contain an AVClass element
> +as their first member) that are not *AVOptions-enabled* (don't use that element
> +to provide access to instances of AVOption).
> +
> +Object-oriented programmers may be tempted to compare @ref avoptions "AVOptions"
> +to OOP getters and setters.  There is some overlap in functionality, but OOP
> +getters and setters are usually specific to a single member and don't provide
> +metadata about the member; whereas AVOptions has a single API that covers
> +every option, and provides help text etc. as well.
> +
> +Object-oriented programmers may be tempted to compare AVOptions-accessible
> +members of a public context to protected members of a class.  Both provide
> +global access through an API, and unrestricted access for trusted friends.
> +But this is just a happy accident, not a guarantee.

This part looks fine, although there is too much OOP jargon for my
taste: it would make reading harder than needed for programmers not
familiar with OOP, since they will miss many of the references.
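
For readers who have not met reflection before, the underlying idea can
be shown in plain C without any OOP vocabulary. The sketch below is a
simplification under stated assumptions: Option, EncContext and
set_int_option() are hypothetical names, and the option table is passed
explicitly rather than being found through an AVClass first member as
the real AVOptions API does:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical option descriptor, loosely modelled on the idea of AVOption. */
typedef struct Option {
    const char *name;
    const char *help;
    size_t      offset;  /* where the member lives inside the struct */
    int         min;     /* smallest accepted value */
    int         max;     /* largest accepted value */
} Option;

typedef struct EncContext {
    int bitrate;
    int threads;
} EncContext;

/* One table describes every user-configurable member, help text included. */
static const Option enc_options[] = {
    { "bitrate", "target bitrate in bits/s", offsetof(EncContext, bitrate), 1, 100000000 },
    { "threads", "number of worker threads", offsetof(EncContext, threads), 1, 64 },
    { NULL, NULL, 0, 0, 0 }
};

/* Set an int option by name; 0 on success, -1 if unknown or out of range. */
static int set_int_option(void *obj, const Option *opts, const char *name, int value)
{
    assert(obj != NULL && opts != NULL && name != NULL);
    for (; opts->name; opts++) {
        if (strcmp(opts->name, name) != 0)
            continue;
        if (value < opts->min || value > opts->max)
            return -1;
        *(int *)((char *)obj + opts->offset) = value;
        return 0;
    }
    return -1;
}
```

A single setter covers every option in the table and validates the value
as it is stored, which is the property the quoted text attributes to
AVOptions.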

> +
> +## Final example: context for a codec
> +
> +AVCodecContext is an AVOptions-enabled struct that contains information
> +about encoding or decoding one stream of data (e.g. the video in a movie).
> +It's a good example of many of the issues above.
> +
> +The name "AVCodecContext" tells us this is a context.  Many of
> +@ref libavcodec/avcodec.h "its functions" start with an `avctx` parameter,
> +indicating this object provides context for that function.
> +
> +AVCodecContext::internal contains the private context.  For example,
> +codec-specific information might be stored here.
> +
> +AVCodecContext is allocated with avcodec_alloc_context3(), initialized with
> +avcodec_open2(), and freed with avcodec_free_context().  Most of its members
> +are configured with the @ref avoptions "AVOptions API", but for example you
> +can set AVCodecContext::opaque or AVCodecContext::draw_horiz_band() if your
> +program happens to need them.
> +
> +AVCodecContext provides an abstract interface to many different *codecs*.
> +Options supported by many codecs (e.g. "bitrate") are kept in AVCodecContext
> +and reflected as AVOptions.  Options that are specific to one codec are
> +stored in the internal context, and reflected from there.
> +
> +To support a specific codec, AVCodecContext's private context is set to
> +an encoder-specific data type.  For example, the video codec
> +[H.264](https://en.wikipedia.org/wiki/Advanced_Video_Coding) is supported via
> +[the x264 library](https://www.videolan.org/developers/x264.html), and
> +implemented in X264Context.

> Although included in the documentation, X264Context is not part of the public API.

Why is it included in the doc? It is a private struct and therefore
should not be included in the doxy.

> +[...] Whereas there are strict rules about
> +changing AVCodecContext, a version of FFmpeg could modify X264Context or
> +replace it with another type altogether.  An adverse legal ruling or security
> +problem could even force us to switch to a completely different library
> +without a major version bump.
> +
> +The design of AVCodecContext provides several important guarantees:
> +
> +- lets you use the same interface for any codec
> +- supports common encoder options like "bitrate" without duplicating code
> +- supports encoder-specific options like "profile" without bulking out the public interface
> +- reflects both types of options to users, with help text and detection of missing options
> +- hides implementation details (e.g. its encoding buffer)
> -- 
> 2.43.0
> 