[FFmpeg-devel] [PATCH v4 3/6] Pass secondary ogg/opus chained streams metadata.

Romain Beauxis romain.beauxis at gmail.com
Fri Feb 14 17:48:54 EET 2025


Le jeu. 13 févr. 2025 à 17:37, Lynne <dev at lynne.ee> a écrit :
>
>
>
> On 14/02/2025 00:27, Romain Beauxis wrote:
> > Le jeu. 13 févr. 2025 à 15:53, Lynne <dev at lynne.ee> a écrit :
> >>
> >> On 10/02/2025 20:25, Romain Beauxis wrote:
> >>> These changes parse ogg/opus comment in secondary chained ogg/opus
> >>> streams and attach them as extradata to the corresponding packet.
> >>>
> >>> They can then be decoded in the opus decoder and attached to the next
> >>> decoded frame.
> >>>
> >>> libavformat/oggparseopus.c: Parse comments from
> >>>    secondary chained ogg/opus streams and pass them as ogg stream new_metadata.
> >>>
> >>> libavcodec/opus/dec.c: Unpack comments from secondary chained ogg/opus
> >>> streams and attach them to the next decoded frame.
> >>> ---
> >>>    libavcodec/opus/dec.c      | 25 ++++++++++++++++++++++---
> >>>    libavformat/oggparseopus.c | 16 +++++++++++++++-
> >>>    2 files changed, 37 insertions(+), 4 deletions(-)
> >>>
> >>> diff --git a/libavcodec/opus/dec.c b/libavcodec/opus/dec.c
> >>> index 88a650c81c..cddcefcb5f 100644
> >>> --- a/libavcodec/opus/dec.c
> >>> +++ b/libavcodec/opus/dec.c
> >>> @@ -125,6 +125,8 @@ typedef struct OpusContext {
> >>>        AVFloatDSPContext *fdsp;
> >>>        float   gain;
> >>>
> >>> +    AVDictionary *pending_metadata;
> >>> +
> >>>        OpusParseContext p;
> >>>    } OpusContext;
> >>>
> >>> @@ -485,12 +487,24 @@ static int opus_decode_packet(AVCodecContext *avctx, AVFrame *frame,
> >>>        int decoded_samples = INT_MAX;
> >>>        int delayed_samples = 0;
> >>>        int i, ret;
> >>> +    size_t size;
> >>> +    const uint8_t *side_metadata;
> >>>
> >>> -    if (buf_size > 8 && (
> >>> -       !memcmp(buf, "OpusHead", 8) ||
> >>> -       !memcmp(buf, "OpusTags", 8)))
> >>> +    if (buf_size > 8 && !memcmp(buf, "OpusHead", 8))
> >>>            return buf_size;
> >>>
> >>> +    if (buf_size > 8 && !memcmp(buf, "OpusTags", 8)) {
> >>> +        /* New metadata */
> >>> +        side_metadata = av_packet_get_side_data(avpkt, AV_PKT_DATA_METADATA_UPDATE, &size);
> >>> +        if (side_metadata) {
> >>> +            av_dict_free(&c->pending_metadata);
> >>> +            ret = av_packet_unpack_dictionary(side_metadata, size, &c->pending_metadata);
> >>> +            if (ret < 0)
> >>> +                return ret;
> >>> +        }
> >>> +        return buf_size;
> >>> +    }
> >>
> >> This is definitely not the right way to go about it. You're changing the
> >> packet layout too, which can potentially break existing users.
> >
> > Thanks for looking into this!
> >
> >> What you should do instead is to either create a linearized dictionary
> >> (e.g. <keylength><key><value_len><value> in a buffer), and pass that as
> >> extradata, or add a dictionary field to AVPacket, which would then be
> >> copied over to the frame after decoding, similar to how PTS, duration,
> >> etc. are carried over from packets.
> >
> > I'm confused: isn't it exactly what the ogg demuxer is already doing:
> >
> >      if (os->new_metadata) {
> >          ret = av_packet_add_side_data(pkt, AV_PKT_DATA_METADATA_UPDATE,
> >                                        os->new_metadata, os->new_metadata_size);
> >          if (ret < 0)
> >              return ret;
> >
> >          os->new_metadata      = NULL;
> >          os->new_metadata_size = 0;
> >      }
>
> OpusHead et. al. shouldn't appear in the data of AVPacket ever.

These strings are part of the expected ogg packeting.

The ogg spec mandates that the first packet is a header packet then
the following packets are comment packets.

OpusHead and OpusTags are the headers indicating such packets. They
are part of the original format encapsulation, originally designed for
ogg streams.

Part of the changes that I intend to do is to surface those packets in
the demuxer.

In ogg/flac, typically, they are exported by the demuxer and visible
when operating at the packet level.

With opus, because the header is parsed again, they are swallowed by
the demuxer and never surfaced.

It is my understanding that, eventually, we want to be able to:
- open a demuxer on any ogg stream
- open a muxer for a new ogg stream
- pass all packets from demuxer to muxer

and obtain a valid ogg stream output.

There's still some work to do to achieve that with chained ogg streams
because of the current PTS/DTS reset but surfacing those packets
should be the first step.

> > (Note that this code is already in the ogg decoder and not part of
> > those changes)
> >
> > This is also in response (and personal agreement) with Marvin's
> > feedback here: https://ffmpeg.org/pipermail/ffmpeg-devel/2025-January/338993.html
> >
> >> Having to add custom handling to each codec to handle metadata updates
> >> is not good design.
> >
> > Could you explain more in detail what you mean?
> >
> > I really am confused, the mechanism here is generic, using a flat
> > dictionary stored as AV_PKT_DATA_METADATA_UPDATE extra data and copied
> > over to the frame after decoding.
> >
> > This seems like what you describe? What am I missing?
>
> You have code in each decoder to handle this. You shouldn't - the
> generic decode.c framework should forward the updates. Decoders
> shouldn't care about what a demuxer did, they simply decode.

Thanks, that makes a lot of sense.

I have reworked the code to add the update mechanism in libavcodec/decode.c

You can see the updated patch set here:
https://code.ffmpeg.org/FFmpeg/FFmpeg/pulls/6#issuecomment-87

Before I submit a new version here, is that a method you agree on?

-- Romain


More information about the ffmpeg-devel mailing list