[FFmpeg-devel] [PATCH v4 3/6] Pass secondary ogg/opus chained streams metadata.

Lynne dev at lynne.ee
Fri Feb 14 19:55:21 EET 2025


On 14/02/2025 17:50, Romain Beauxis wrote:
> Le ven. 14 févr. 2025 à 10:18, Lynne <dev at lynne.ee> a écrit :
>>
>>
>>
>> On 14/02/2025 16:48, Romain Beauxis wrote:
>>> Le jeu. 13 févr. 2025 à 17:37, Lynne <dev at lynne.ee> a écrit :
>>>>
>>>>
>>>>
>>>> On 14/02/2025 00:27, Romain Beauxis wrote:
>>>>> Le jeu. 13 févr. 2025 à 15:53, Lynne <dev at lynne.ee> a écrit :
>>>>>>
>>>>>> On 10/02/2025 20:25, Romain Beauxis wrote:
>>>>>>> These changes parse ogg/opus comment in secondary chained ogg/opus
>>>>>>> streams and attach them as extradata to the corresponding packet.
>>>>>>>
>>>>>>> They can then be decoded in the opus decoder and attached to the next
>>>>>>> decoded frame.
>>>>>>>
>>>>>>> libavformat/oggparseopus.c: Parse comments from
>>>>>>>      secondary chained ogg/opus streams and pass them as ogg stream new_metadata.
>>>>>>>
>>>>>>> libavcodec/opus/dec.c: Unpack comments from secondary chained ogg/opus
>>>>>>> streams and attach them to the next decoded frame.
>>>>>>> ---
>>>>>>>      libavcodec/opus/dec.c      | 25 ++++++++++++++++++++++---
>>>>>>>      libavformat/oggparseopus.c | 16 +++++++++++++++-
>>>>>>>      2 files changed, 37 insertions(+), 4 deletions(-)
>>>>>>>
>>>>>>> diff --git a/libavcodec/opus/dec.c b/libavcodec/opus/dec.c
>>>>>>> index 88a650c81c..cddcefcb5f 100644
>>>>>>> --- a/libavcodec/opus/dec.c
>>>>>>> +++ b/libavcodec/opus/dec.c
>>>>>>> @@ -125,6 +125,8 @@ typedef struct OpusContext {
>>>>>>>          AVFloatDSPContext *fdsp;
>>>>>>>          float   gain;
>>>>>>>
>>>>>>> +    AVDictionary *pending_metadata;
>>>>>>> +
>>>>>>>          OpusParseContext p;
>>>>>>>      } OpusContext;
>>>>>>>
>>>>>>> @@ -485,12 +487,24 @@ static int opus_decode_packet(AVCodecContext *avctx, AVFrame *frame,
>>>>>>>          int decoded_samples = INT_MAX;
>>>>>>>          int delayed_samples = 0;
>>>>>>>          int i, ret;
>>>>>>> +    size_t size;
>>>>>>> +    const uint8_t *side_metadata;
>>>>>>>
>>>>>>> -    if (buf_size > 8 && (
>>>>>>> -       !memcmp(buf, "OpusHead", 8) ||
>>>>>>> -       !memcmp(buf, "OpusTags", 8)))
>>>>>>> +    if (buf_size > 8 && !memcmp(buf, "OpusHead", 8))
>>>>>>>              return buf_size;
>>>>>>>
>>>>>>> +    if (buf_size > 8 && !memcmp(buf, "OpusTags", 8)) {
>>>>>>> +        /* New metadata */
>>>>>>> +        side_metadata = av_packet_get_side_data(avpkt, AV_PKT_DATA_METADATA_UPDATE, &size);
>>>>>>> +        if (side_metadata) {
>>>>>>> +            av_dict_free(&c->pending_metadata);
>>>>>>> +            ret = av_packet_unpack_dictionary(side_metadata, size, &c->pending_metadata);
>>>>>>> +            if (ret < 0)
>>>>>>> +                return ret;
>>>>>>> +        }
>>>>>>> +        return buf_size;
>>>>>>> +    }
>>>>>>
>>>>>> This is definitely not the right way to go about it. You're changing the
>>>>>> packet layout too, which can potentially break existing users.
>>>>>
>>>>> Thanks for looking into this!
>>>>>
>>>>>> What you should do instead is to either create a linearized dictionary
>>>>>> (e.g. <keylength><key><value_len><value> in a buffer), and pass that as
>>>>>> extradata, or add a dictionary field to AVPacket, which would then be
>>>>>> copied over to the frame after decoding, similar to how PTS, duration,
>>>>>> etc. are carried over from packets.
>>>>>
>>>>> I'm confused: isn't it exactly what the ogg demuxer is already doing:
>>>>>
>>>>>        if (os->new_metadata) {
>>>>>            ret = av_packet_add_side_data(pkt, AV_PKT_DATA_METADATA_UPDATE,
>>>>>                                          os->new_metadata, os->new_metadata_size);
>>>>>            if (ret < 0)
>>>>>                return ret;
>>>>>
>>>>>            os->new_metadata      = NULL;
>>>>>            os->new_metadata_size = 0;
>>>>>        }
>>>>
>>>> OpusHead et. al. shouldn't appear in the data of AVPacket ever.
>>>
>>> These strings are part of the expected ogg packeting.
>>>
>>> The ogg spec mandates that the first packet is a header packet then
>>> the following packets are comment packets.
>>>
>>> OpusHead and OpusTags are the headers indicating such packets. They
>>> are part of the original format encapsulation, originally designed for
>>> ogg streams.
>>
>> Ogg tags and data has exactly no business being in an Opus bitstream.
>> Packets are supposed to contain pure codec data with no container
>> features or data in them.
>>
>>> Part of the changes that I intend to do is to surface those packets in
>>> the demuxer.
>>>
>>> In ogg/flac, typically, they are exported by the demuxer and visible
>>> when operating at the packet level.
>>>
>>> With opus, because the header is parsed again, they are swallowed by
>>> the demuxer and never surfaced.
>>>
>>> It is my understanding that, eventually, we want to be able to:
>>> - open a demuxer on any ogg stream
>>> - open a muxer for a new ogg stream
>>> - pass all packets from demuxer to muxer
>>>
>>> and obtain a valid ogg stream output.
>>>
>>> There's still some work to do to achieve that with chained ogg streams
>>> because of the current PTS/DTS reset but surfacing those packets
>>> should be the first step.
>>
>> We *want* them to be swallowed by the demuxer.
>> You're changing the packet structure by letting them through, breaking
>> users. Even if you weren't, and this was a new codec, you're letting
>> container data through in a codec packet.
>>
>> You do not need to let them through, since you can use the presence of
>> side data to detect this.

> Just to make it clear: I am talking here about subsequent ogg header
> packets in chained ogg streams, not the initial ogg packets.
> 
> Those packets are required by the spec for chained bitstreams:
> https://xiph.org/ogg/doc/framing.html

The muxer/demuxer are responsible for writing and parsing container units.
Decoders have nothing to do with them.
Encoders only generate extradata and have no concept of containers.

> How are you suggesting that a user could copy a chained ogg bitstream
> at the packet level if the header packets of subsequent chained
> bitstreams are not present?

Via the extradata update mechanism.
AV_PKT_DATA_NEW_EXTRADATA

> Opus is the only demuxer that will not surface those packets because
> of a single, codec-specific line of code:
> https://code.ffmpeg.org/FFmpeg/FFmpeg/src/branch/master/libavformat/oggdec.c#L242-L244
> 
> <---->
>      /* Chained files have extradata as a new packet */
>      if (codec == &ff_opus_codec)
>          os->header = -1;
> <---->
> 
> For all other codecs, the header is not parsed again on subsequent
> chained ogg bitstreams and the packets are passed down to the demuxer.
> 
> Here's an example output with vorbis. The logs are what's being
> returned by the demuxer and decoder:

Then the rest are wrong too. Container details should not have been 
leaked out into codec packets.

 > I don't understand what you are suggesting here.

Swallow the extra/metadata. Do not let OpusHead or their equivalents in 
other codecs inside packets.
Emit AV_PKT_DATA_NEW_EXTRADATA from the demuxer instead with the new 
data, which would contain the OpusHead unit (extradata).
In the codec, if a packet contains AV_PKT_DATA_NEW_EXTRADATA, reinit the 
decoder with it (we should do this in decode.c, but for now, doing it in 
opusdec.c would be good enough).

In the muxer, if a packet contains AV_PKT_DATA_NEW_EXTRADATA, write it 
to the bitstream.

This also removes the need to cache metadata updates. If a packet 
contains metadata, simply pass that into the AVFrame->metadata once the 
frame leaves the decoder in decode.c. The packet is not dereferenced 
yet, so you don't even need to cache it while the packet is being decoded.

The Ogg chained "spec" is batshit crazy. I was the one who merged 
support for it a few years ago. Even the authors admit it was a huge 
mistake. There isn't a single thing it did properly - from timestamp 
resets to recommending that stream IDs should be shuffled to requiring 
that users CRC each potential page to resync.
We shouldn't let mistakes of the past break abstraction and turn 
self-contained codecs into video game codecs which require extensive 
knowledge of container data.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: OpenPGP_0xA2FEA5F03F034464.asc
Type: application/pgp-keys
Size: 624 bytes
Desc: OpenPGP public key
URL: <https://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20250214/09c14342/attachment.key>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: OpenPGP_signature.asc
Type: application/pgp-signature
Size: 236 bytes
Desc: OpenPGP digital signature
URL: <https://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20250214/09c14342/attachment.sig>


More information about the ffmpeg-devel mailing list