[FFmpeg-devel] [PATCH] avcodec/bsf: restructure the internal implementation of the bistream filter API

Sat Apr 18 00:56:16 EEST 2020

James Almer:
> On 4/17/2020 5:40 PM, Andreas Rheinhardt wrote:
>> James Almer:
>>> On 4/17/2020 3:49 PM, Andreas Rheinhardt wrote:
>>>> James Almer:
>>>>> Process input data as soon as it's fed to av_bsf_send_packet(), instead of
>>>>> storing a single packet and expecting the user to call av_bsf_receive_packet()
>>>>> in order to trigger the decoding process before they are allowed to feed more
>>>>> data.
>>>>>
>>>>> This puts the bsf API more in line with the decoupled decode API, without
>>>>> breaking existing code using it.
>>>>>
>>>>> Signed-off-by: James Almer <jamrial at gmail.com>
>>>>> ---
>>>>> Benefits from this change include less constrains to how to use the API (Now
>>>>> you can feed as many packets as the filter will accept, including the flush
>>>>> call, before attempting to fetch the first output packet), and actually
>>>>> honoring the part of the doxy for av_bsf_receive_packet() that says that pkt is
>>>>> not touched on failure.
>>>>
>>>> Given that the packet is supposed to be blank on input, its content will
>>>> be preserved even on error. Granted, memcmp() might show same
>>>> differences, but I'd rather see this (non-)issue eliminated by stating
>>>> that the packet will be blank on error instead of untouched.
>>>> For this patch this means that one can directly return
>>>> ctx->filter->filter(ctx, pkt) in av_bsf_receive_packet().
>>>>>
>>>>> Drawback from this change is that now all bsfs will always receive non-writable
>>>>> packets, so filters like noise and mpeg4_unpack_bframes will not be able to
>>>>> do their work in-place anymore. This is because av_bsf_send_packet() will now
>>>>> trigger the filtering process, and by passing the input packet reference to the
>>>>> underlying bsf, any alteration in case of failure would go against the doxy
>>>>> statement that pkt is untouched on such scenarios. So a new reference must be
>>>>> used instead.
>>>>>
>>>> This IMO underestimates the consequences of this change for bsfs that
>>>> don't care about the writability of the packets. It makes a bsf thought
>>>> to be a simple pass-through bsf considerably more expensive. This
>>>> applies to e.g. vp9_superframe which is auto-inserted (without checking
>>>> the actual content) for muxing VP9 and it would also apply to the
>>>> proposed pgs_frame_merge bsf.
>>>>
>>>> Of course, the null bsf is another one of those supposedly free bsfs and
>>>> that's why it's added in ff_decode_bsfs_init() to every decoder if there
>>>> is no other bsf to add.
>>>>
>>>> Furthermore, this would also apply to callers that have no use for the
>>>> packet besides sending it to a bsf (and that would simply unref it on
>>>> failure). Given that (on success) ownership passes to the bsf this
>>>> probably includes most current callers.
>>>
>>> Yeah, the extra reference is definitely annoying and could be costly.
>>> It's one of the reasons i was not too sure if i should have bothered to
>>> send this patch.
>>>
>>>>
>>>> In other words: I don't like this patch.
>>>>
>>>> I see two solutions for this. The first is outlined at the end of this
>>>> mail, but I don't think it is acceptable because it probably amounts to
>>>> an API break. The second is an idea off the top of my head. It might be
>>>> bullshit.
>>>>
>>>> The second solution mitigates this by adding a new function
>>>> av_set_ownership_status() or so. It allows one to send updated
>>>> information to the bsf to override the behaviour currently documented in
>>>> av_bsf_send/receive_packet() (the default behaviour would of course stay
>>>> unchanged). With av_set_ownership_status() one can indicate whether
>>>> a) the bsf should not take ownership of the reference at all, but always
>>>> make a copy,
>>>> b) the bsf should leave the packet untouched in case of an error, but
>>>> retain ownership of the reference sent to it in case of success.
>>>> c) the bsf should take take ownership of the packet even on error (and
>>>> unref the packet sent), unless the error is caused by the bsf already
>>>> being full (i.e. if the error would be AVERROR(EAGAIN))
>>>> d) the bsf takes ownership of the packet it received in the moment it
>>>> consumes the packet, i.e. buffers it internally (your proposed code
>>>> would try to return an unmodified packet to the caller if
>>>> ctx->filter->filter() fails which is problematic (see the discussion at
>>>> the end)).
>>>
>>> This overcomplicates the API too much. For this kind of behavioral
>>> change, it would IMO be better to implement it as a new init() (or even
>>> send/receive) function that takes flags as an argument, something that
>>> is also extensible for other purposes.
>>>
>> Yes, but I did not want to deprecate the whole API just for this.
> 
> You don't need to deprecate any function. A second, more feature-rich
> init() function can coexist just fine with the existing one. See the
> AVBufferPool API for an example.
> You add a new function that takes a flags parameter to implement
> specific behavior in the filtering process. The original init() function
>  then in practice behaves the same as calling the new one with flags ==
> 0, or however the default/historical behavior should be signaled with it.
> 
> Implementing it as new send/receive functions however would most likely
> need to replace the existing ones, true, so that should be avoided.
> 
I might try a new init function, but not right now; yet my original
approach would be cleaner. And it would also directly be applicable to
bsf-lists whereas one would need a new init function for bsf-lists
right, too.
Furthermore, if one always wants to add a certain bsf for muxing a
certain codec, one might just as well do it in the init function and not
later when a packet is received. But at this point it is not known yet
whether one uses the interleaved codepath or not and so one doesn't know
how to set the bsf.
(Adding bsf in the init function actually makes much sense if the
decision whether to add a bsf is only based upon the stream's codecpar.
This e.g. applies to the decision to add the *_mp4toannexb bsf. If one
added them during init, one could make it so that the muxer already gets
the filtered codecpar during write_header(). If one relegated the
annexb->mp4 conversion to a bsf, one could simply write the extradata as
CodecPrivate in Matroska and similarly mp4.)
>>
>>>>
>>>> The new status would apply to all future calls to av_bsf_send_packet().
>>>>
>>>> The same could also be done for avcodec_send_packet() and
>>>> avcodec_send_frame() (in which case the function would be allowed to
>>>> cast the const from the AVPacket/AVFrame away). One could even use one
>>>> function for all three -- one can distinguish AVBSFContext and
>>>> AVCodecContext via the AVClass.
>>>>
>>>> This would also allow an easy way to solve the problem that in the
>>>> noninterleaved codepath in libavformat/mux.c the input packet is
>>>> destroyed when a bsf is autoinserted despite the function not taking
>>>> ownership of the packet: In the noninterleaved codepath, we would choose
>>>> option a) (this should be done when the bsf is added); in the
>>>> interleaved codepath, we would choose option c).
>>>>
>>>>>  libavcodec/avcodec.h |  6 +---
>>>>>  libavcodec/bsf.c     | 68 +++++++++++++++++++++++++++-----------------
>>>>>  2 files changed, 43 insertions(+), 31 deletions(-)
>>>>>
>>>>> diff --git a/libavcodec/avcodec.h b/libavcodec/avcodec.h
>>>>> index 55151a0b71..e60f2ac1ce 100644
>>>>> --- a/libavcodec/avcodec.h
>>>>> +++ b/libavcodec/avcodec.h
>>>>> @@ -4720,10 +4720,6 @@ int av_bsf_init(AVBSFContext *ctx);
>>>>>  /**
>>>>>   * Submit a packet for filtering.
>>>>>   *
>>>>> - * After sending each packet, the filter must be completely drained by calling
>>>>> - * av_bsf_receive_packet() repeatedly until it returns AVERROR(EAGAIN) or
>>>>> - * AVERROR_EOF.
>>>>> - *
>>>>>   * @param pkt the packet to filter. The bitstream filter will take ownership of
>>>>>   * the packet and reset the contents of pkt. pkt is not touched if an error occurs.
>>>>>   * If pkt is empty (i.e. NULL, or pkt->data is NULL and pkt->side_data_elems zero),
>>>>
>>>> In your changes to av_bsf_send_packet(), you explicitly filter the
>>>> return value AVERROR(EAGAIN) out among the return values of the filter,
>>>> so that if the bsf has been completely drained before this call, it does
>>>> not output AVERROR(EAGAIN). Yet you do not document this whereas
>>>> avcodec_send_packet() does:
>>>>
>>>> AVERROR(EAGAIN):   input is not accepted in the current state - user
>>>>                    must read output with avcodec_receive_frame() (once
>>>>                    all output is read, the packet should be resent, and
>>>>                    the call will not fail with EAGAIN).
>>>
>>> You missed the doxy change i sent in a separate patch that i haven't yet
>>> pushed because i'm waiting for another patch by Anton to go in first.
>>> See http://lists.ffmpeg.org/pipermail/ffmpeg-devel/2020-April/260194.html
>>>
>>> That aside, the EAGAIN it filters out is the return value of
>>> ctx->filter->filter, which is not relevant to send_packet() and will be
>>> returned by receive_packet() instead.
>>> EAGAIN here is only meant to be returned when the input packet isn't
>>> accepted. The exact same behavior as avcodec_send_packet().
>>>
>>>>
>>>>
>>>>> @@ -4755,7 +4751,7 @@ int av_bsf_send_packet(AVBSFContext *ctx, AVPacket *pkt);
>>>>>   * an error occurs.
>>>>>   *
>>>>>   * @note one input packet may result in several output packets, so after sending
>>>>> - * a packet with av_bsf_send_packet(), this function needs to be called
>>>>> + * a packet with av_bsf_send_packet(), this function may need to be called
>>>>>   * repeatedly until it stops returning 0. It is also possible for a filter to
>>>>>   * output fewer packets than were sent to it, so this function may return
>>>>>   * AVERROR(EAGAIN) immediately after a successful av_bsf_send_packet() call.
>>>>> diff --git a/libavcodec/bsf.c b/libavcodec/bsf.c
>>>>> index 7b96183e64..97d86beb6f 100644
>>>>> --- a/libavcodec/bsf.c
>>>>> +++ b/libavcodec/bsf.c
>>>>> @@ -28,7 +28,8 @@
>>>>>  #include "bsf.h"
>>>>>  
>>>>>  struct AVBSFInternal {
>>>>> -    AVPacket *buffer_pkt;
>>>>> +    AVPacket *in_pkt;
>>>>> +    AVPacket *out_pkt;
>>>>>      int eof;
>>>>>  };
>>>>>  
>>>>> @@ -45,8 +46,10 @@ void av_bsf_free(AVBSFContext **pctx)
>>>>>      if (ctx->filter->priv_class && ctx->priv_data)
>>>>>          av_opt_free(ctx->priv_data);
>>>>>  
>>>>> -    if (ctx->internal)
>>>>> -        av_packet_free(&ctx->internal->buffer_pkt);
>>>>> +    if (ctx->internal) {
>>>>> +        av_packet_free(&ctx->internal->in_pkt);
>>>>> +        av_packet_free(&ctx->internal->out_pkt);
>>>>> +    }
>>>>>      av_freep(&ctx->internal);
>>>>>      av_freep(&ctx->priv_data);
>>>>>  
>>>>> @@ -110,8 +113,9 @@ int av_bsf_alloc(const AVBitStreamFilter *filter, AVBSFContext **pctx)
>>>>>      }
>>>>>      ctx->internal = bsfi;
>>>>>  
>>>>> -    bsfi->buffer_pkt = av_packet_alloc();
>>>>> -    if (!bsfi->buffer_pkt) {
>>>>> +    bsfi->in_pkt = av_packet_alloc();
>>>>> +    bsfi->out_pkt = av_packet_alloc();
>>>>> +    if (!bsfi->in_pkt || !bsfi->out_pkt) {
>>>>>          ret = AVERROR(ENOMEM);
>>>>>          goto fail;
>>>>>      }
>>>>> @@ -183,7 +187,8 @@ void av_bsf_flush(AVBSFContext *ctx)
>>>>>  
>>>>>      bsfi->eof = 0;
>>>>>  
>>>>> -    av_packet_unref(bsfi->buffer_pkt);
>>>>> +    av_packet_unref(bsfi->in_pkt);
>>>>> +    av_packet_unref(bsfi->out_pkt);
>>>>>  
>>>>>      if (ctx->filter->flush)
>>>>>          ctx->filter->flush(ctx);
>>>>> @@ -204,21 +209,38 @@ int av_bsf_send_packet(AVBSFContext *ctx, AVPacket *pkt)
>>>>>          return AVERROR(EINVAL);
>>>>>      }
>>>>>  
>>>>> -    if (bsfi->buffer_pkt->data ||
>>>>> -        bsfi->buffer_pkt->side_data_elems)
>>>>> +    if (bsfi->in_pkt->data ||
>>>>> +        bsfi->in_pkt->side_data_elems)
>>>>>          return AVERROR(EAGAIN);
>>>>>  
>>>>> -    ret = av_packet_make_refcounted(pkt);
>>>>> +    ret = av_packet_ref(bsfi->in_pkt, pkt);
>>>>>      if (ret < 0)
>>>>>          return ret;
>>>>> -    av_packet_move_ref(bsfi->buffer_pkt, pkt);
>>>>> +
>>>>> +    if (!bsfi->out_pkt->data && !bsfi->out_pkt->side_data_elems) {
>>>>> +        ret = ctx->filter->filter(ctx, bsfi->out_pkt);
>>>>> +        if (ret < 0 && ret != AVERROR(EAGAIN) && ret != AVERROR_EOF)
>>>>> +            return ret;
>>>>> +    }
>>>>> +
>>>>> +    av_packet_unref(pkt);
>>>>
>>>> Consider the following scenario: You send a packet to a bsf and in the
>>>> first call to ctx->filter->filter (happening here above) the bsf take
>>>> bsf->in_pkt; let's presume this call is successfull and generates an
>>>> out_pkt. Then the caller collects his packet with
>>>> av_bsf_receive_packet(). In the current API, the caller would now have
>>>> to call av_bsf_receive_packet() until it returns AVERROR(EAGAIN) before
>>>> sending another packet. Yet in your new API, the caller may send a new
>>>> packet immediately and in this scenario the filter will be called again.
>>>>
>>>> Let's presume the first packet contained enough data for more than one
>>>> output packet, so that the bsf will first use the data is has already
>>>> cached internally (an example of this is Marton's proposed
>>>> pcm_rechunk_bsf) without collecting the new input packet. Let's also
>>>> presume that the actual filtering will fail. Then the caller still has
>>>> his packet, yet a copy/reference also exists in in_pkt.
>>>
>>> in_pkt could be unreferenced if the ctx->filter->filter() call failed
>>> within av_bsf_send_packet().
>>>
>> Yet the caller does not know whether it was this packet which caused the
>> error and therefore does not know whether he should try to resend it
>> (potentially after flushing).
>>
>>>>
>>>> What is the caller now supposed to do with the packet he has? Submit it
>>>> again for filtering?
>>>
>>> In case of failure, you should abort the process or flush and attempt to
>>> resume filtering in a fresh state. Trying to send more data after you
>>> got an invalid data error is not going to get good results.
>>>
>> If one sends data different from the data that led to the invalid data
>> error one might very well get good results. That is what is being done
>> for ages in ffmpeg.c as well as mux.c and probably a few other places.
>> That's also why it is so important to know whether it was the data that
>> one just sent that caused the error or not.
>>
>> Your advice is btw against the current documentation (the part which you
>> want to remove, but not because of the error handling)
> 
> True that the CLI does not abort on error by default. But I'm not sure
> what part of my doxy change is against the above advice. I only
> documented the fact av_bsf_send_packet() returning EAGAIN must not be
> considered an error.
> 
There seems to be a misunderstanding here: Your linked patch (which I
agree with) documents that AVERROR(EAGAIN) is not a real error. Yet this
very patch wants to remove this part of the doc:

After sending each packet, the filter must be completely drained by
calling av_bsf_receive_packet() repeatedly until it returns
AVERROR(EAGAIN) or AVERROR_EOF.

And this says that one should drain the filter even on error, which
indeed is against yur advice to abort completely or flush the bsf.

(I am of course aware that your intentions to remove these lines had
nothing to do with error handling.)

> In any case, I'll withdraw this patch for now. Like you mentioned, an
> extra reference created in what ultimately would be passthrough cases,
> like autoinserted bsfs when they are not needed, or the null_bsf
> placeholder in lavc, can be expensive enough that the potential gains in
> this approach would not be justified.
> 
>> which actually
>> says that one should call av_bsf_receive_packet() until one gets
>> AVERROR(EAGAIN) or AVERROR_EOF, making no exception for other errors.
>> That's what Marton and I agreed upon in [1].
>> Btw: ffmpeg.c does not drain the filter chain completely if there are
>> errors; as a result, one of the filters might not accept new input in
>> the future in which case the packet sent to it will leak.
>>
>>>> In the scenario outlined above, this would mean
>>>> that a packet is effectively sent twice to the bsf. It would also mean
>>>> that for an ordinary one-in-one-out bsf an invalid packet might lead to
>>>> an infinite loop if the caller tries to resend it again and again.
>>>>
>>>> If I were to design a new API from scratch, my answer would be: The way
>>>> the caller can see if the packet has been accepted is by looking at
>>>> whether the packet has been consumed (i.e. whether the returned packet
>>>> is blank).
>>>> This would of course be coupled with "on success the packet
>>>> has been consumed" and "if it has not been consumed, the packet is
>>>> untouched".
>>>
>>> That's currently the case, isn't it? Consumed on success, untouched on
>>> failure, as stated in the doxy. Or am i misunderstanding you?
>>>
>> The current behaviour is indeed consistent with this, because
>> av_bsf_send_packet() does nothing more after it has moved the packet to
>> the internal list. The differences only become apparent if one changes
>> this to already filter the packets in av_bsf_send_packet():
>> In my proposal the packet might be consumed on error if the packet could
>> be moved to the internal list (which in this proposal could be done
>> without creating a new reference), but if filtering fails lateron. This
>> is a real deviation from current documented behaviour.
>>
>>>> In this case, there would be no need to create new references: We take
>>>> ownership of the reference as soon as we know that
>>>> av_packet_make_refcounted() succeeds.
>>>>
>>>> - Andreas
>>
>> - Andreas
>>
>> [1]: https://ffmpeg.org/pipermail/ffmpeg-devel/2020-April/260190.html