[FFmpeg-devel] [PATCH 1/3] lavc: introduce a new decoding/encoding API with decoupled input/output

Wed Apr 20 17:08:04 CEST 2016

On Wed, 20 Apr 2016 16:54:30 +0200
Michael Niedermayer <michael at niedermayer.cc> wrote:

> On Tue, Apr 19, 2016 at 11:49:11AM +0200, wm4 wrote:
> > Until now, the decoding API was restricted to outputting 0 or 1 frames
> > per input packet. It also enforces a somewhat rigid dataflow in general.
> > 
> > This new API seeks to relax these restrictions by decoupling input and
> > output. Instead of doing a single call on each decode step, which may
> > consume the packet and may produce output, the new API requires the user
> > to send input first, and then ask for output.
> > 
> > For now, there are no codecs supporting this API. The API can work with
> > codecs using the old API, and most code added here is to make them
> > interoperate. The reverse is not possible, although for audio it might.
> > 
> > From Libav commit 05f66706d182eb0c36af54d72614bf4c33e957a9.
> > 
> > Signed-off-by: Anton Khirnov <anton at khirnov.net>
> > ---
> > This commit was skipped when merging from Libav.
> > ---
> >  doc/APIchanges         |   5 +
> >  libavcodec/avcodec.h   | 241 ++++++++++++++++++++++++++++++++++++++++-
> >  libavcodec/internal.h  |  13 +++
> >  libavcodec/utils.c     | 286 ++++++++++++++++++++++++++++++++++++++++++++++++-
> >  libavcodec/version.h   |   2 +-
> >  libavformat/avformat.h |   4 +-
> >  6 files changed, 543 insertions(+), 8 deletions(-)
> > 
> > diff --git a/doc/APIchanges b/doc/APIchanges
> > index 8a14e77..ef69e98 100644
> > --- a/doc/APIchanges
> > +++ b/doc/APIchanges
> > @@ -15,6 +15,11 @@ libavutil:     2015-08-28
> >  
> >  API changes, most recent first:
> >  
> > +2016-xx-xx - xxxxxxx - lavc 57.36.0 - avcodec.h
> > +  Add a new audio/video encoding and decoding API with decoupled input
> > +  and output -- avcodec_send_packet(), avcodec_receive_frame(),
> > +  avcodec_send_frame() and avcodec_receive_packet().
> > +
> >  2016-xx-xx - xxxxxxx - lavc 57.15.0 - avcodec.h
> >    Add a new bitstream filtering API working with AVPackets.
> >    Deprecate the old bistream filtering API.
> > diff --git a/libavcodec/avcodec.h b/libavcodec/avcodec.h
> > index 9a8a0f0..61de80e 100644
> > --- a/libavcodec/avcodec.h
> > +++ b/libavcodec/avcodec.h
> > @@ -73,6 +73,95 @@
> >   */
> >  
> >  /**
> > + * @ingroup libavc
> > + * @defgroup lavc_encdec send/receive encoding and decoding API overview
> > + * @{
> > + *
> > + * The avcodec_send_packet()/avcodec_receive_frame()/avcodec_send_frame()/
> > + * avcodec_receive_packet() functions provide an encode/decode API, which
> > + * decouples input and output.
> > + *
> > + * The API is very similar for encoding/decoding and audio/video, and works as
> > + * follows:
> > + * - Set up and open the AVCodecContext as usual.
> > + * - Send valid input:
> > + *   - For decoding, call avcodec_send_packet() to give the decoder raw
> > + *     compressed data in an AVPacket.
> > + *   - For encoding, call avcodec_send_frame() to give the encoder an AVFrame
> > + *     containing uncompressed audio or video.
> > + *   In both cases, it is recommended that AVPackets and AVFrames are
> > + *   refcounted, or libavcodec might have to copy the input data. (libavformat
> > + *   always returns refcounted AVPackets, and av_frame_get_buffer() allocates
> > + *   refcounted AVFrames.)
> > + * - Receive output in a loop. Periodically call one of the avcodec_receive_*()
> > + *   functions and process their output:
> > + *   - For decoding, call avcodec_receive_frame(). On success, it will return
> > + *     an AVFrame containing uncompressed audio or video data.
> > + *   - For encoding, call avcodec_receive_packet(). On success, it will return
> > + *     an AVPacket with a compressed frame.
> > + *   Repeat this call until it returns AVERROR(EAGAIN) or an error. The
> > + *   AVERROR(EAGAIN) return value means that new input data is required to
> > + *   return new output. In this case, continue with sending input. For each
> > + *   input frame/packet, the codec will typically return 1 output frame/packet,
> > + *   but it can also be 0 or more than 1.
> > + *
> > + * At the beginning of decoding or encoding, the codec might accept multiple
> > + * input frames/packets without returning a frame, until its internal buffers
> > + * are filled. This situation is handled transparently if you follow the steps
> > + * outlined above.
> > + *
> > + * End of stream situations. These require "flushing" (aka draining) the codec,
> > + * as the codec might buffer multiple frames or packets internally for
> > + * performance or out of necessity (consider B-frames).
> > + * This is handled as follows:
> > + * - Instead of valid input, send NULL to the avcodec_send_packet() (decoding)
> > + *   or avcodec_send_frame() (encoding) functions. This will enter draining
> > + *   mode.
> > + * - Call avcodec_receive_frame() (decoding) or avcodec_receive_packet()
> > + *   (encoding) in a loop until AVERROR_EOF is returned. The functions will
> > + *   not return AVERROR(EAGAIN), unless you forgot to enter draining mode.
> > + * - Before decoding can be resumed again, the codec has to be reset with
> > + *   avcodec_flush_buffers().
> > + *
> > + * Using the API as outlined above is highly recommended. But it is also
> > + * possible to call functions outside of this rigid schema. For example, you can
> > + * call avcodec_send_packet() repeatedly without calling
> > + * avcodec_receive_frame(). In this case, avcodec_send_packet() will succeed
> > + * until the codec's internal buffer has been filled up (which is typically of
> > + * size 1 per output frame, after initial input), and then reject input with
> > + * AVERROR(EAGAIN). Once it starts rejecting input, you have no choice but to
> > + * read at least some output.
> > + *
> > + * Not all codecs will follow a rigid and predictable dataflow; the only
> > + * guarantee is that an AVERROR(EAGAIN) return value on a send/receive call on
> > + * one end implies that a receive/send call on the other end will succeed. In
> > + * general, no codec will permit unlimited buffering of input or output.
> > + *
> > + * This API replaces the following legacy functions:
> > + * - avcodec_decode_video2() and avcodec_decode_audio4():
> > + *   Use avcodec_send_packet() to feed input to the decoder, then use
> > + *   avcodec_receive_frame() to receive decoded frames after each packet.
> > + *   Unlike with the old video decoding API, multiple frames might result from
> > + *   a packet. For audio, splitting the input packet into frames by partially
> > + *   decoding packets becomes transparent to the API user. You never need to
> > + *   feed an AVPacket to the API twice.
> > + *   Additionally, sending a flush/draining packet is required only once.
> > + * - avcodec_encode_video2()/avcodec_encode_audio2():
> > + *   Use avcodec_send_frame() to feed input to the encoder, then use
> > + *   avcodec_receive_packet() to receive encoded packets.
> > + *   Providing user-allocated buffers for avcodec_receive_packet() is not
> > + *   possible.
> > + * - The new API does not handle subtitles yet.
> > + *
> > + * Mixing new and old function calls on the same AVCodecContext is not allowed,  
> 
> > + * and will result in arbitrary behavior.  
>                         ^^^^^^^^^^
> probably "undefined" is the better word but it's fine as is too

OK, changed it locally.

> 
> [...]
> > @@ -3522,6 +3613,21 @@ typedef struct AVCodec {
> >      int (*decode)(AVCodecContext *, void *outdata, int *outdata_size, AVPacket *avpkt);
> >      int (*close)(AVCodecContext *);
> >      /**
> > +     * Decode/encode API with decoupled packet/frame dataflow. The API is the
> > +     * same as the avcodec_ prefixed APIs (avcodec_send_frame() etc.), except
> > +     * that:
> > +     * - never called if the codec is closed or the wrong type,
> > +     * - AVPacket parameter change side data is applied right before calling
> > +     *   AVCodec->send_packet,
> > +     * - if AV_CODEC_CAP_DELAY is not set, drain packets or frames are never sent,
> > +     * - only one drain packet is ever passed down (until the next flush()),
> > +     * - a drain AVPacket is always NULL (no need to check for avpkt->size).
> > +     */
> > +    int (*send_frame)(AVCodecContext *avctx, const AVFrame *frame);
> > +    int (*send_packet)(AVCodecContext *avctx, const AVPacket *avpkt);
> > +    int (*receive_frame)(AVCodecContext *avctx, AVFrame *frame);
> > +    int (*receive_packet)(AVCodecContext *avctx, AVPacket *avpkt);
> > +    /**
> >       * Flush buffers.
> >       * Will be called when seeking
> >       */  
> 
> This breaks ABI
> the functions must be added after caps_internal or a major bump is
> needed
> caps_internal is accessed in libavformat/utils.c it seems

Well, these fields are clearly below this comment:

    /*****************************************************************
     * No fields below this line are part of the public API. They
     * may not be used outside of libavcodec and can be changed and
     * removed at will.
     * New public fields should be added right above.
     *****************************************************************
     */

So if libavformat does this, it violates the ABI.
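
To make the layout concern concrete, here is a minimal sketch of what happens to a trailing field when callbacks are inserted above it. The ToyCodecV1/ToyCodecV2 structs and caps_internal_shift() are hypothetical stand-ins for illustration, not the real AVCodec definition; only the relative position of caps_internal is meant to mirror the discussion.

```c
#include <stddef.h>

/* Hypothetical stand-in for a codec descriptor before the patch.
 * This is NOT the real AVCodec, only the shape relevant to the argument. */
typedef struct ToyCodecV1 {
    const char *name;             /* public part */
    int (*close)(void *ctx);      /* private part starts here */
    int caps_internal;
} ToyCodecV1;

/* The same descriptor after inserting four callbacks above caps_internal. */
typedef struct ToyCodecV2 {
    const char *name;
    int (*close)(void *ctx);
    int (*send_frame)(void *ctx, const void *frame);
    int (*send_packet)(void *ctx, const void *pkt);
    int (*receive_frame)(void *ctx, void *frame);
    int (*receive_packet)(void *ctx, void *pkt);
    int caps_internal;
} ToyCodecV2;

/* Number of bytes by which caps_internal moved between the two layouts.
 * Code compiled against V1 that reads caps_internal from a V2 object
 * would read from the old offset, i.e. from one of the new callbacks. */
static size_t caps_internal_shift(void)
{
    return offsetof(ToyCodecV2, caps_internal) -
           offsetof(ToyCodecV1, caps_internal);
}
```

A non-zero shift only matters to code outside the library that reads the field directly, which is exactly the access the comment above forbids.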

> 
> 
> [...]
> > @@ -2655,6 +2692,243 @@ void avsubtitle_free(AVSubtitle *sub)
> >      memset(sub, 0, sizeof(AVSubtitle));
> >  }
> >  
> > +static int do_decode(AVCodecContext *avctx, AVPacket *pkt)
> > +{
> > +    int got_frame;
> > +    int ret;
> > +
> > +    av_assert0(!avctx->internal->buffer_frame->buf[0]);
> > +
> > +    if (!pkt)
> > +        pkt = avctx->internal->buffer_pkt;
> > +
> > +    // This is the lesser evil. The field is for compatibility with legacy users
> > +    // of the legacy API, and users using the new API should not be forced to
> > +    // even know about this field.
> > +    avctx->refcounted_frames = 1;
> > +  
> 
> > +    // Some codecs (at least wma lossless) will crash when feeding drain packets
> > +    // after EOF was signaled.
> > +    if (avctx->internal->draining_done)
> > +        return AVERROR_EOF;  
> 
> maybe decoders can be fixed to avoid this, can you add a TODO note or
> something in the code maybe something like
> // TODO: check if decoders can be changed to avoid this check and the draining_done field
> 
> that is unless this is needed for something else

I looked at the wma code (at the time when I wrote the patch), and it
didn't look simple. I concluded it's better not to change this, as
strictly speaking you really shouldn't feed a decoder more drain
packets after it's done. (Other decoders tend to exhibit weird behavior
too.)

The new API enforces more robustness, and the draining_done field is
internal, so I'm fine with this for now.
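
As a rough illustration of the intended behavior, the following self-contained sketch models the send/receive dataflow with a latched draining_done flag. Everything prefixed toy_ or TOY_ is hypothetical and only mimics the semantics described in the patch (TOY_EAGAIN and TOY_EOF stand in for AVERROR(EAGAIN) and AVERROR_EOF); it is not libavcodec code, and the one-frame buffer is an assumption made to keep the example small.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for AVERROR(EAGAIN) and AVERROR_EOF. */
#define TOY_EAGAIN (-EAGAIN)
#define TOY_EOF    (-0x10000)

typedef struct ToyDecoder {
    int queued;         /* frames buffered internally (capacity 1 here) */
    int draining;       /* set once NULL input was sent */
    int draining_done;  /* latched once EOF has been returned */
} ToyDecoder;

/* pkt == NULL enters draining mode; only the first drain request is
 * passed on, later ones are answered with EOF. */
static int toy_send_packet(ToyDecoder *d, const int *pkt)
{
    if (d->draining_done || (d->draining && pkt))
        return TOY_EOF;
    if (!pkt) {
        d->draining = 1;
        return 0;
    }
    if (d->queued >= 1)
        return TOY_EAGAIN;      /* caller must read output first */
    d->queued++;
    return 0;
}

static int toy_receive_frame(ToyDecoder *d, int *frame)
{
    if (d->draining_done)
        return TOY_EOF;         /* the codec is never touched after EOF */
    if (d->queued > 0) {
        d->queued--;
        *frame = 1;
        return 0;
    }
    if (d->draining) {
        d->draining_done = 1;   /* latch: all later calls return EOF */
        return TOY_EOF;
    }
    return TOY_EAGAIN;          /* more input is needed */
}

/* Feed n_packets packets, then drain. Returns the number of frames
 * received, or -1 if sending failed. */
static int toy_decode_all(ToyDecoder *d, int n_packets)
{
    int frames = 0, frame = 0, pkt = 0;

    for (int i = 0; i <= n_packets; i++) {
        /* The final iteration sends NULL to enter draining mode. */
        if (toy_send_packet(d, i < n_packets ? &pkt : NULL) < 0)
            return -1;
        /* Read until the decoder asks for input (EAGAIN) or reports EOF. */
        while (toy_receive_frame(d, &frame) == 0)
            frames++;
    }
    return frames;
}
```

The point of the latch is that a caller who keeps asking for output, or keeps sending drain requests, after EOF gets a consistent error back and the underlying decoder is never invoked again.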

> rest should be ok
> 
> thx
> 
> [...]