[FFmpeg-devel] [PATCH 1/2] avcodec/s302m: enable non-PCM decoding

Thu Jan 25 15:17:39 EET 2024

Gyan Doshi:
> 
> 
> On 2024-01-25 10:29 am, Andreas Rheinhardt wrote:
>> Gyan Doshi:
>>> Set up framework for non-PCM decoding in-place and
>>> add support for Dolby-E decoding.
>>>
>>> Useful for direct transcoding of non-PCM audio in live inputs.
>>> ---
>>>   configure          |   1 +
>>>   doc/decoders.texi  |  40 +++
>>>   libavcodec/s302m.c | 609 +++++++++++++++++++++++++++++++++++++--------
>>>   3 files changed, 543 insertions(+), 107 deletions(-)
>>>
>>> diff --git a/configure b/configure
>>> index c8ae0a061d..8db3fa3f4b 100755
>>> --- a/configure
>>> +++ b/configure
>>> @@ -2979,6 +2979,7 @@ rv20_decoder_select="h263_decoder"
>>>   rv20_encoder_select="h263_encoder"
>>>   rv30_decoder_select="golomb h264pred h264qpel mpegvideodec rv34dsp"
>>>   rv40_decoder_select="golomb h264pred h264qpel mpegvideodec rv34dsp"
>>> +s302m_decoder_select="dolby_e_decoder"
>>>   screenpresso_decoder_deps="zlib"
>>>   shorten_decoder_select="bswapdsp"
>>>   sipr_decoder_select="lsp"
>>> diff --git a/doc/decoders.texi b/doc/decoders.texi
>>> index 293c82c2ba..9f85c876bf 100644
>>> --- a/doc/decoders.texi
>>> +++ b/doc/decoders.texi
>>> @@ -347,6 +347,46 @@ configuration. You need to explicitly configure
>>> the build with
>>>   An FFmpeg native decoder for Opus exists, so users can decode Opus
>>>   without this library.
>>>   + at section s302m
>>> +
>>> +SMPTE ST 302 decoder.
>>> +
>>> +SMPTE ST 302 is a method for storing AES3 data format within an MPEG
>>> Transport
>>> +Stream. AES3 streams can contain LPCM streams of 2, 4, 6 or 8
>>> channels with a
>>> +bit depth of 16, 20 or 24-bits at a sample rate of 48 kHz.
>>> +They can also contain non-PCM codec streams such as AC-3 or Dolby-E.
>>> +
>> This sounds like we should add bitstream filters to extract the proper
>> underlying streams instead.
>> (I see only two problems with this approach: The BSF API needs to set
>> the CodecID of the output during init, but at this point no packet has
>> reached the BSF to determine it. And changing codec IDs mid-stream is
>> also not supported.)
> 
> In theory, this decoder shouldn't exist, as it is just a carrier,
> whether of LPCM or non-PCM.
> FFmpeg architecture also imposes a fundamental limitation in that one
> s302m stream may
> carry multiple payload streams and we support only one decoding context
> per input stream

Then why does the demuxer not separate the data into multiple streams?

> neither can a bsf spawn streams (not sure). So proper, full support
> seems not possible.

Hypothetically a BSF can output packets with different stream_ids. But
no BSF does this intentionally (all BSFs actually ignore stream_id; they
just copy the id from the input.

> 
> [...]
> 
>>> +
>>> +    ret = init_get_bits8(&gb, buf, buf_size);
>>> +    if (ret < 0)
>>> +        return ret;
>>> +
>>> +    aes_frm_size = (s->bits + 4) * 2 / 8;
>>> +    if (buf_size < aes_frm_size * 2)  // not enough to contain
>>> data_type & length_code
>>> +        return AVERROR_INVALIDDATA;
>>> +
>>> +    state = get_bits64(&gb, aes_frm_size * 8);
>>> +
>>> +    while (!IS_NONPCMSYNC(s->bits,state) && (get_bits_left(&gb) >= 8))
>>> +        state = (state << 8) | get_bits(&gb, 8);
>> Reading byte-aligned data with a GetBit context is very suboptimal.
> 
> What is the performance difference vs. uint8 pointers?

Typically worse by a constant factor.

> Note that if stream is LPCM or non-decodable non-PCM, this isn't called
> again. If it is Dolby-E, the data traversed can typically be measured in
> the dozens of bytes. And further on, I do read and skip some
> non-byte-aligned lengths.
> 
> [...]
> 
>>> +
>>> +            if (s->non_pcm_dec)
>>> +                for (int i = 0; i < 4; i++)
>>> +                    *p++ = b[i];
>>> +            else {
>>> +                *f16++ = (b[0] << 8) |
>>> +                         (b[1]     ) ;
>> AV_RB16(b)
> 
> Ok.
> 
> [...]
>>> +
>>> +        for (int ch = 0; ch < s->frame->ch_layout.nb_channels; ch++)
>>> +            memcpy(frame->extended_data[ch],
>>> s->frame->extended_data[ch],
>>> +                  
>>> av_get_bytes_per_sample(s->non_pcm_ctx->sample_fmt) *
>>> s->frame->nb_samples);
>> Would you please explain to me why this extra frame s->frame exists at
>> all? (Is it just the assert due to the missing FrameDecodeData? If so,
>> then this should be changed instead.)
> 
> Yes, that assert was triggered. I haven't looked into the ramifications
> of altering decode_receive_frame_internal and it's out of scope for this
> patch.
> If you feel strongly about it, I invite you to change that code and I'll
> update this patch accordingly.
> 
> [...]
> 
>>> +static av_cold int s302m_close(AVCodecContext *avctx)
>>> +{
>>> +    S302Context *s = avctx->priv_data;
>>> +
>>> +    avcodec_free_context(&s->non_pcm_ctx);
>>> +    av_packet_free(&s->packet);
>>> +    av_frame_free(&s->frame);
>>> +    av_dict_free(&s->non_pcm_opts);
>> non_pcm_opts is an av_opt-enabled field and is therefore freed
>> generically.
> 
> Will remove.
> 
> Regards,
> Gyan
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel at ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> 
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request at ffmpeg.org with subject "unsubscribe".