[FFmpeg-devel] [PATCH v5 1/4] doc: Explain what "context" means

Thu May 23 23:00:40 EEST 2024

Derived from explanations kindly provided by Stefano Sabatini and others:
https://ffmpeg.org/pipermail/ffmpeg-devel/2024-April/325903.html
---
 doc/context.md | 439 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 439 insertions(+)
 create mode 100644 doc/context.md

diff --git a/doc/context.md b/doc/context.md
new file mode 100644
index 0000000000..21469a6e58
--- /dev/null
+++ b/doc/context.md
@@ -0,0 +1,439 @@
+@page Context Introduction to contexts
+
+@tableofcontents
+
+“Context” is a name for a widely-used programming idiom.
+This document explains the general idiom and some conventions used by FFmpeg.
+
+This document uses object-oriented analogies for readers familiar with
+[object-oriented programming](https://en.wikipedia.org/wiki/Object-oriented_programming).
+But contexts can also be used outside of OOP, and even in situations where OOP
+isn't helpful.  So these analogies should only be used as a rough guide.
+
+@section Context_general “Context” as a general concept
+
+Many projects use some kind of “context” idiom.  You can safely skip this
+section if you have used contexts in another project.  You might also prefer to
+read @ref Context_comparison before continuing with the rest of the document.
+
+@subsection Context_think “Context” as a way to think about code
+
+A context is any data structure that is passed to several functions
+(or several instances of the same function) that all operate on the same entity.
+For example, [object-oriented programming](https://en.wikipedia.org/wiki/Object-oriented_programming)
+languages usually provide member functions with a `this` or `self` value:
+
+```python
+# Python methods (functions within classes) must start with an object argument,
+# which does a similar job to a context:
+class MyClass:
+    def my_func(self):
+        ...
+```
+
+Contexts can also be used in C-style procedural code.  If you have ever written
+a callback function, you have probably used a context:
+
+```c
+struct FileReader {
+    FILE* file;
+};
+
+int my_callback(void *my_var_, uint8_t* buf, int buf_size) {
+
+    // my_var provides context for the callback function:
+    struct FileReader *my_var = (struct FileReader *)my_var_;
+
+    return (int)fread(buf, sizeof(*buf), buf_size, my_var->file);
+}
+
+void init() {
+
+    struct FileReader my_var;
+    my_var.file = fopen("my-file", "rb");
+
+    register_callback(my_callback, &my_var);
+
+    ...
+
+    fclose( my_var.file );
+
+}
+```
+
+In the broadest sense, a context is just a way to think about some code.
+You can even use it to think about code written by people who have never
+heard the term, or who would disagree with you about what it means.
+But when FFmpeg developers say “context”, they're usually talking about
+a more specific set of conventions.
+
+@subsection Context_communication “Context” as a tool of communication
+
+“Context” can just be a word to understand code in your own head,
+but it can also be a term you use to explain your interfaces.
+Here is a version of the callback example that makes the context explicit:
+
+```c
+struct FileReaderContext {
+    FILE *file;
+};
+
+int my_callback(void *ctx_, uint8_t *buf, int buf_size) {
+
+    // ctx provides context for the callback function:
+    struct FileReaderContext *ctx = (struct FileReaderContext *)ctx_;
+
+    return (int)fread(buf, sizeof(*buf), buf_size, ctx->file);
+}
+
+void init() {
+
+    struct FileReaderContext ctx;
+    ctx.file = fopen("my-file", "rb");
+
+    register_callback(my_callback, &ctx);
+
+    ...
+
+    fclose( ctx.file );
+
+}
+```
+
+The difference here is subtle, but important.  If a piece of code
+*appears compatible with contexts*, then you are *allowed to think
+that way*, but if a piece of code *explicitly states it uses
+contexts*, then you are *required to follow that approach*.
+
+For example, take a look at avio_alloc_context().
+The function name and return value both state it uses contexts,
+so failing to follow that approach is a bug you can report.
+But its arguments are a set of callbacks that merely appear compatible with
+contexts, so it's fine to write a `read_packet` function that just reads
+from standard input.
+
+When a programmer says their code is "a context", they're guaranteeing
+to follow a set of conventions enforced by their community - for example,
+the FFmpeg community enforces that contexts have separate allocation,
+configuration, and initialization steps.  That's different from saying
+their code is "an object", which normally guarantees to follow conventions
+enforced by their programming language (e.g. using a constructor function).
+
+@section Context_ffmpeg FFmpeg contexts
+
+This section discusses specific context-related conventions used in FFmpeg.
+Some of these are used in other projects; others are unique to this project.
+
+@subsection Context_naming Naming: “Context” and “ctx”
+
+```c
+// Context struct names usually end with `Context`:
+typedef struct AVSomeContext {
+  ...
+} AVSomeContext;
+
+// Functions are usually named after their context,
+// context parameters usually come first and are often called `ctx`:
+void av_some_function(AVSomeContext *ctx, ...);
+```
+
+If an FFmpeg struct is intended for use as a context, its name usually
+makes that clear.  Exceptions to this rule include AVMD5, which is only
+identified as a context by @ref libavutil/md5.c "the functions that use it".
+
+If a function is associated with a context, its name usually
+begins with some variant of the context name (e.g. av_md5_alloc()
+or avcodec_alloc_context3()).  Exceptions to this rule include
+@ref avformat.h "AVFormatContext's functions", many of which
+begin with just `av_`.
+
+If a function has a context parameter, it usually comes first and its name
+often contains `ctx`.  Exceptions include av_bsf_alloc(), which puts the
+context argument second to emphasise it's an out variable.
+
+Some functions fit awkwardly within FFmpeg's context idiom.  For example,
+av_ambient_viewing_environment_create_side_data() creates an
+AVAmbientViewingEnvironment context, then adds it to the side-data of an
+AVFrame context.  If you find contexts a useful metaphor in these cases,
+you might prefer to think of these functions as "receiving" and "producing"
+contexts.
+
+@subsection Context_data_hiding Data hiding: private contexts
+
+```c
+// Context structs often hide private context:
+struct AVSomeContext {
+  void *priv_data; // sometimes just called "internal"
+};
+```
+
+Contexts present a public interface, so changing a context's members forces
+everyone that uses the library to at least recompile their program,
+if not rewrite it to remain compatible.  Many contexts reduce this problem
+by including a private context with a type that is not exposed in the public
+interface.  Hiding information this way ensures it can be modified without
+affecting downstream software.
+
+Private contexts often store variables users aren't supposed to see
+(similar to an OOP private block), but can also store information shared between
+some but not all instances of a context (e.g. codec-specific functionality),
+and @ref Context_avoptions "AVOptions-enabled structs" can include options
+that are accessible through the @ref avoptions "AVOptions API".
+Object-oriented programmers thinking about private contexts should remember
+that FFmpeg isn't *large enough* to need some common object-oriented techniques,
+even though it's solving a problem *complex enough* to benefit from
+some rarer techniques.
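+
+The private-context idea can be sketched in a few lines of plain C.  This is
+only an illustration under stated assumptions: `ToyFilterContext`,
+`ToyFilterPrivate` and the `toy_filter_*` functions are hypothetical names
+invented here, not part of FFmpeg.  In a real library, the private struct
+would be defined in a `.c` file that users never see.
+
+```c
+#include <assert.h>
+#include <stdlib.h>
+
+/* Public part: what would live in an installed header. */
+typedef struct ToyFilterContext {
+    int   width;     /* public member, set directly by users */
+    void *priv_data; /* private context: layout is not part of the interface */
+} ToyFilterContext;
+
+/* Private part (hypothetical): can change freely between versions. */
+typedef struct ToyFilterPrivate {
+    int frames_seen;
+} ToyFilterPrivate;
+
+/* Allocates the public struct and its private context together. */
+ToyFilterContext *toy_filter_alloc(void) {
+    ToyFilterContext *ctx = calloc(1, sizeof(*ctx));
+    if (!ctx)
+        return NULL;
+    ctx->priv_data = calloc(1, sizeof(ToyFilterPrivate));
+    if (!ctx->priv_data) {
+        free(ctx);
+        return NULL;
+    }
+    return ctx;
+}
+
+/* Library code converts priv_data back to the private type before use. */
+int toy_filter_process(ToyFilterContext *ctx) {
+    assert(ctx && ctx->priv_data);
+    ToyFilterPrivate *priv = ctx->priv_data;
+    return ++priv->frames_seen;
+}
+
+/* Frees both parts and clears the caller's pointer. */
+void toy_filter_free(ToyFilterContext **pctx) {
+    if (!pctx || !*pctx)
+        return;
+    free((*pctx)->priv_data);
+    free(*pctx);
+    *pctx = NULL;
+}
+```
+
+Because callers only ever see `void *priv_data`, adding or removing members
+of the private struct does not change the size or layout of anything they
+compiled against.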
+
+@subsection Context_lifetime Manage lifetime: allocate, initialize and free
+
+```c
+void my_function( ... ) {
+
+    // Context structs are allocated then initialized with associated functions:
+
+    AVSomeContext *ctx = av_some_context_alloc(...);
+
+    // ... configure ctx ...
+
+    av_some_context_init(ctx, ...);
+
+    // ... use ctx ...
+
+    // Context structs are freed with associated functions:
+
+    av_some_context_close(ctx);
+    av_some_context_free(ctx);
+
+}
+```
+
+FFmpeg contexts go through the following stages of life:
+
+1. allocation (often a function that ends with `_alloc`)
+   * a range of memory is allocated for use by the structure
+   * memory is allocated on boundaries that improve caching
+   * memory is reset to zeroes, some internal structures may be initialized
+2. configuration (implemented by setting values directly on the context)
+   * no function for this - calling code populates the structure directly
+   * memory is populated with useful values
+   * simple contexts can skip this stage
+3. initialization (often a function that ends with `_init`)
+   * setup actions are performed based on the configuration (e.g. opening files)
+4. normal usage
+   * most functions are called in this stage
+   * documentation implies some members are now read-only (or not used at all)
+   * some contexts allow re-initialization
+5. closing (often a function that ends with `_close`)
+   * teardown actions are performed (e.g. closing files)
+6. deallocation (often a function that ends with `_free`)
+   * memory is returned to the pool of available memory
+
+This can mislead object-oriented programmers, who expect something more like:
+
+1. allocation (usually a `new` keyword)
+   * a range of memory is allocated for use by the structure
+   * memory *may* be reset (e.g. for security reasons)
+2. initialization (usually a constructor)
+   * memory is populated with useful values
+   * related setup actions are performed based on arguments (e.g. opening files)
+3. normal usage
+   * most functions are called in this stage
+   * compiler enforces that some members are read-only (or private)
+   * no going back to the previous stage
+4. finalization (usually a destructor)
+   * teardown actions are performed (e.g. closing files)
+5. deallocation (usually a `delete` keyword)
+   * memory is returned to the pool of available memory
+
+FFmpeg's allocation stage is broadly similar to the OOP stage of the same name.
+Both set aside some memory for use by a new entity, but FFmpeg's stage can also
+do some higher-level operations.  For example, @ref Context_avoptions
+"AVOptions-enabled structs" set their AVClass member during allocation.
+
+FFmpeg's configuration stage involves setting any variables you want to before
+you start using the context.  Complicated FFmpeg structures like AVCodecContext
+tend to have many members you *could* set, but in practice most programs set
+few if any of them.  The freeform configuration stage works better than bundling
+these into the initialization stage, which would lead to functions with
+impractically many parameters, and would mean each new option was an
+incompatible change to the API.
+
+FFmpeg's initialization stage involves calling a function that sets the context
+up based on your configuration.
+
+FFmpeg's first three stages do the same job as OOP's first two stages.
+This can mislead object-oriented developers, who expect to do less work in the
+allocation stage, and more work in the initialization stage.  To simplify this,
+most FFmpeg contexts provide a combined allocator and initializer function.
+For historical reasons, suffixes like `_alloc`, `_init`, `_alloc_context` and
+even `_open` can indicate the function does any combination of allocation and
+initialization.
+
+FFmpeg's "closing" stage is broadly similar to OOP's "finalization" stage,
+but some contexts allow re-initialization after finalization.  For example,
+SwrContext lets you call swr_close() then swr_init() to reuse a context.
+Be aware that some FFmpeg functions happen to use the word "finalize" in a way
+that has nothing to do with the OOP stage (e.g. av_bsf_list_finalize()).
+
+FFmpeg's "deallocation" stage is broadly similar to OOP, but can perform some
+higher-level functions (similar to the allocation stage).
+
+Closing functions usually end with "_close", while deallocation
+functions usually end with "_free".  Very few contexts need the flexibility of
+separate "closing" and "deallocation" stages, so many "_free" functions
+implicitly close the context first.
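+
+The stages described in this section can be sketched with a toy context.
+This is a minimal illustration, not FFmpeg code: `ToyBufferContext` and the
+`toy_buffer_*` functions are hypothetical, and allocating a buffer stands in
+for heavier setup actions such as opening files.
+
+```c
+#include <assert.h>
+#include <stdlib.h>
+#include <string.h>
+
+typedef struct ToyBufferContext {
+    size_t         capacity; /* configuration: set directly by the caller */
+    unsigned char *data;     /* state: created by init, released by close */
+    size_t         used;
+} ToyBufferContext;
+
+/* Allocation: zeroed memory, nothing set up yet. */
+ToyBufferContext *toy_buffer_alloc(void) {
+    return calloc(1, sizeof(ToyBufferContext));
+}
+
+/* Initialization: act on the configuration.  Returns 0, or -1 on failure. */
+int toy_buffer_init(ToyBufferContext *ctx) {
+    assert(ctx);
+    if (ctx->capacity == 0 || ctx->data)
+        return -1; /* not configured, or already initialized */
+    ctx->data = malloc(ctx->capacity);
+    ctx->used = 0;
+    return ctx->data ? 0 : -1;
+}
+
+/* Normal usage: only succeeds between init and close. */
+int toy_buffer_append(ToyBufferContext *ctx, const void *src, size_t len) {
+    assert(ctx && src);
+    if (!ctx->data || len > ctx->capacity - ctx->used)
+        return -1;
+    memcpy(ctx->data + ctx->used, src, len);
+    ctx->used += len;
+    return 0;
+}
+
+/* Closing: undo init but keep the configuration, so init can run again. */
+void toy_buffer_close(ToyBufferContext *ctx) {
+    assert(ctx);
+    free(ctx->data);
+    ctx->data = NULL;
+    ctx->used = 0;
+}
+
+/* Deallocation: implicitly closes, then clears the caller's pointer. */
+void toy_buffer_free(ToyBufferContext **pctx) {
+    if (!pctx || !*pctx)
+        return;
+    toy_buffer_close(*pctx);
+    free(*pctx);
+    *pctx = NULL;
+}
+```
+
+A caller would allocate the context, set `capacity` directly, call
+`toy_buffer_init()`, append data, and finally call `toy_buffer_free()`.
+Calling `toy_buffer_close()` followed by `toy_buffer_init()` reuses the same
+context, in the spirit of the swr_close() and swr_init() example above.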
+
+@subsection Context_avoptions Reflection: AVOptions-enabled structs
+
+Object-oriented programming puts more focus on data hiding than FFmpeg needs,
+but it also puts less focus on
+[reflection](https://en.wikipedia.org/wiki/Reflection_(computer_programming)).
+
+To understand FFmpeg's reflection requirements, run `ffmpeg -h full` on the
+command-line, then ask yourself how you would implement all those options
+with the POSIX [`getopt` function](https://en.wikipedia.org/wiki/Getopt).
+You can also ask the same question for any other programming languages you know.
+[Python's argparse module](https://docs.python.org/3/library/argparse.html)
+is a good example - its approach works well with far more complex programs
+than `getopt`, but would you like to maintain an argparse implementation
+with 15,000 options and growing?
+
+Most solutions assume you can just put all options in a single block,
+which is unworkable at FFmpeg's scale.  Instead, we split configuration
+across many *AVOptions-enabled structs*, which use the @ref avoptions
+"AVOptions API" to reflect information about their user-configurable members,
+including members in private contexts.
+
+AVOptions-accessible members of a context should be accessed through the
+@ref avoptions "AVOptions API" whenever possible, even if they're not hidden
+in a private context.  That ensures values are validated as they're set, and
+means you won't have to do as much work if a future version of FFmpeg changes
+the allowed values.  This is broadly similar to the way object-oriented programs
+recommend getters and setters over direct access.
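+
+To see why setting values through a reflection API allows validation, here is
+a toy sketch of the general technique.  It is an assumption-laden
+illustration, not how the AVOptions API is implemented: `ToyScaleContext`,
+`ToyOption`, `toy_scale_options` and `toy_opt_set_int()` are hypothetical
+names, and only `int` members are supported.
+
+```c
+#include <assert.h>
+#include <stddef.h>
+#include <string.h>
+
+typedef struct ToyScaleContext {
+    int width;
+    int height;
+} ToyScaleContext;
+
+/* One row per user-configurable member: name, help, location, range. */
+typedef struct ToyOption {
+    const char *name;
+    const char *help;
+    size_t      offset;
+    int         min, max;
+} ToyOption;
+
+/* The table is terminated by an entry whose name is NULL. */
+static const ToyOption toy_scale_options[] = {
+    { "width",  "output width in pixels",
+      offsetof(ToyScaleContext, width),  1, 4096 },
+    { "height", "output height in pixels",
+      offsetof(ToyScaleContext, height), 1, 4096 },
+    { NULL, NULL, 0, 0, 0 }
+};
+
+/* Looks the option up by name, validates the value, then writes it through
+ * the recorded offset.  Returns 0 on success, -1 for an unknown name,
+ * and -2 for an out-of-range value. */
+int toy_opt_set_int(void *ctx, const ToyOption *options,
+                    const char *name, int value) {
+    assert(ctx && options && name);
+    for (const ToyOption *o = options; o->name; o++) {
+        if (strcmp(o->name, name) != 0)
+            continue;
+        if (value < o->min || value > o->max)
+            return -2;
+        *(int *)((char *)ctx + o->offset) = value;
+        return 0;
+    }
+    return -1;
+}
+```
+
+Writing `ctx.height = 0` directly would silently store a bad value, whereas
+`toy_opt_set_int(&ctx, toy_scale_options, "height", 0)` rejects it.  The same
+table could also drive generated help text, which is the "reflection" part.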
+
+Object-oriented programmers may be tempted to compare AVOptions-accessible
+members of a public context to protected members of a class.  Both provide
+global access through an API, and unrestricted access for trusted friends.
+But this is just a happy accident, not a guarantee.
+
+@subsection Context_logging Logging: AVClass context structures
+
+FFmpeg's @ref lavu_log "logging facility" needs to be simple to use,
+but flexible enough to let people debug problems.  And much like reflection,
+it needs to work the same across a wide variety of unrelated structs.
+
+FFmpeg structs that support the logging framework are called *@ref AVClass
+context structures*.  The name @ref AVClass was chosen early in FFmpeg's
+development, but in practice it only came to store information about
+logging, and about introspection.
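+
+One way for a single logging function to work across unrelated structs is to
+require that each struct starts with a pointer to some shared metadata.
+The sketch below shows that general technique with hypothetical names
+(`ToyClass`, `ToyDecoderContext`, `toy_log_format()`); it is a simplified
+illustration, not the actual definition of AVClass or of FFmpeg's logging
+functions.
+
+```c
+#include <assert.h>
+#include <stdio.h>
+
+/* Shared metadata: here, just a name to print in log messages. */
+typedef struct ToyClass {
+    const char *class_name;
+} ToyClass;
+
+/* Any struct that wants logging puts a ToyClass pointer first. */
+typedef struct ToyDecoderContext {
+    const ToyClass *toy_class;
+    int             frame_number;
+} ToyDecoderContext;
+
+static const ToyClass toy_decoder_class = { "ToyDecoder" };
+
+/* Formats "[ClassName] message" into buf and returns snprintf()'s result.
+ * ctx may point to any struct whose first member is a `const ToyClass *`. */
+int toy_log_format(char *buf, size_t size, void *ctx, const char *message) {
+    assert(buf && ctx && message);
+    const ToyClass *cls = *(const ToyClass **)ctx;
+    return snprintf(buf, size, "[%s] %s", cls->class_name, message);
+}
+```
+
+The logging function never needs to know about `ToyDecoderContext` itself,
+so new context types get consistent log output just by following the
+first-member convention.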
+
+@section Context_further Further information about contexts
+
+So far, this document has provided a theoretical guide to FFmpeg contexts.
+This final section provides some alternative approaches to the topic,
+which may help round out your understanding.
+
+@subsection Context_example Learning by example: context for a codec
+
+It can help to learn contexts by doing a deep dive into a specific struct.
+This section will discuss AVCodecContext - an AVOptions-enabled struct
+that contains information about encoding or decoding one stream of data
+(e.g. the video in a movie).
+
+The name "AVCodecContext" tells us this is a context.  Many of
+@ref libavcodec/avcodec.h "its functions" start with an `avctx` parameter,
+indicating this parameter provides context for that function.
+
+AVCodecContext::priv_data contains the codec-specific private context, while
+AVCodecContext::internal holds private data used by libavcodec itself.
+
+AVCodecContext is allocated with avcodec_alloc_context3(), initialized with
+avcodec_open2(), and freed with avcodec_free_context().  Most of its members
+are configured with the @ref avoptions "AVOptions API", but some (such as
+AVCodecContext::opaque or AVCodecContext::draw_horiz_band()) are set directly
+if your program happens to need them.
+
+AVCodecContext provides an abstract interface to many different *codecs*.
+Options supported by many codecs (e.g. "bitrate") are kept in AVCodecContext
+and reflected as AVOptions.  Options that are specific to one codec are
+stored in the private context, and reflected from there.
+
+AVCodecContext::av_class contains logging metadata to ensure all codec-related
+error messages look the same, plus implementation details about options.
+
+To support a specific codec, AVCodecContext's private context is set to
+an encoder-specific data type.  For example, the video codec
+[H.264](https://en.wikipedia.org/wiki/Advanced_Video_Coding) is supported via
+[the x264 library](https://www.videolan.org/developers/x264.html), and
+implemented in X264Context.  Although included in the documentation, X264Context
+is not part of the public API.  That means FFmpeg's @ref ffmpeg_versioning
+"strict rules about changing public structs" aren't as important here, so a
+version of FFmpeg could modify X264Context or replace it with another type
+altogether.  An adverse legal ruling or security problem could even force us to
+switch to a completely different library without a major version bump.
+
+The design of AVCodecContext provides several important guarantees.  It:
+
+- lets you use the same interface for any codec
+- supports common encoder options like "bitrate" without duplicating code
+- supports encoder-specific options like "profile" without bulking out the public interface
+- reflects both types of options to users, with help text and detection of missing options
+- provides uniform logging output
+- hides implementation details (e.g. its encoding buffer)
+
+@subsection Context_comparison Learning by comparison: FFmpeg vs. Curl contexts
+
+It can help to learn contexts by comparing how different projects tackle
+similar problems.  This section will compare @ref AVMD5 "FFmpeg's MD5 context"
+with [curl 8.8.0's equivalent](https://github.com/curl/curl/blob/curl-8_8_0/lib/md5.c#L48).
+
+The [MD5 algorithm](https://en.wikipedia.org/wiki/MD5) produces
+a fixed-length digest from arbitrary-length data.  It does this by calculating
+the digest for a prefix of the data, then loading the next part and adding it
+to the previous digest, and so on.
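+
+The incremental pattern described above is what makes a context necessary:
+the running state has to live somewhere between calls.  The sketch below
+shows that pattern with a deliberately simple stand-in.  It uses the 32-bit
+FNV-1a hash, which is NOT MD5 and is not cryptographically secure, and
+`ToyHashContext` and the `toy_hash_*` functions are hypothetical names
+chosen only to echo the init/update/final shape of av_md5_init(),
+av_md5_update() and av_md5_final().
+
+```c
+#include <assert.h>
+#include <stddef.h>
+#include <stdint.h>
+
+/* The context holds everything that must survive between updates. */
+typedef struct ToyHashContext {
+    uint32_t state;
+} ToyHashContext;
+
+/* Resets the running state to the FNV-1a 32-bit offset basis. */
+void toy_hash_init(ToyHashContext *ctx) {
+    assert(ctx);
+    ctx->state = 2166136261u;
+}
+
+/* Folds the next chunk of data into the running state. */
+void toy_hash_update(ToyHashContext *ctx, const uint8_t *src, size_t len) {
+    assert(ctx && (src || len == 0));
+    for (size_t i = 0; i < len; i++) {
+        ctx->state ^= src[i];
+        ctx->state *= 16777619u; /* FNV 32-bit prime */
+    }
+}
+
+/* Returns the digest of everything passed to toy_hash_update() so far. */
+uint32_t toy_hash_final(const ToyHashContext *ctx) {
+    assert(ctx);
+    return ctx->state;
+}
+```
+
+Feeding `"hello world"` in one call, or `"hello "` followed by `"world"` in
+two calls, leaves the context in the same state, so the caller never has to
+hold the whole input in memory at once.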
+
+```c
+// FFmpeg's MD5 context looks like this:
+typedef struct AVMD5 {
+    uint64_t len;
+    uint8_t  block[64];
+    uint32_t ABCD[4];
+} AVMD5;
+
+// Curl 8.8.0's MD5 context looks like this:
+struct MD5_context {
+  const struct MD5_params *md5_hash;    /* Hash function definition */
+  void                  *md5_hashctx;   /* Hash function context */
+};
+```
+
+Curl's struct name ends with `_context`, guaranteeing contexts are the correct
+interpretation.  FFmpeg's struct does not explicitly say it's a context, but
+@ref libavutil/md5.c "its functions do" so we can reasonably assume
+it's the intended interpretation.
+
+Curl's struct uses `void *md5_hashctx` to avoid guaranteeing
+implementation details in the public interface, whereas FFmpeg makes
+everything accessible.  This disagreement about data hiding is a good example
+of how contexts can be used differently.  Hiding the data means changing the
+layout in a future version of curl won't break downstream programs that used
+that data.  But the MD5 algorithm has been stable for 30 years, and making the
+data public makes it easier for people to follow a bug in their own code.
+
+Curl's struct is declared as `struct <type> { ... }`, whereas FFmpeg uses
+`typedef struct <type> { ... } <type>`.  These conventions are used with both
+context and non-context structs, so don't say anything about contexts as such.
+Specifically, FFmpeg's convention is a workaround for an issue with C grammar:
+
+```c
+void my_function( ... ) {
+  int                my_var;        // good
+  MD5_context        my_curl_ctx;   // error: C needs you to explicitly say "struct"
+  struct MD5_context my_curl_ctx;   // good: added "struct"
+  AVMD5              my_ffmpeg_ctx; // good: typedefs avoid the need for "struct"
+}
+```
+
+Both MD5 implementations are long-tested, widely-used examples of contexts
+in the real world.  They show how contexts can solve the same problem
+in different ways.
-- 
2.43.0