[FFmpeg-devel] [PATCH] libavformat/aac: Parse ID3 tags between ADTS frames.

Fri Feb 2 10:06:47 EET 2018

On Thu, Feb 1, 2018 at 10:02 PM, wm4 <nfxjfg at googlemail.com> wrote:
> On Thu,  1 Feb 2018 18:37:45 -0800
> rshaffer at tunein.com wrote:
>
>> From: Richard Shaffer <rshaffer at tunein.com>
>>
>> While rare, ID3 tags may be inserted between ADTS frames. This change enables
>> parsing them and setting the appropriate metadata updated event flag.
>> ---
>> I have encountered a streaming provider that I must support which does this.
>> There are indications that it isn't totally off the wall, and that it does
>> happen in the wild:
>> * https://www.w3.org/TR/mse-byte-stream-format-mpeg-audio/
>>   (See specifically sections 3 and 4.)
>> * https://github.com/video-dev/hls.js/issues/508
>>
>> That being said, the providers that do this are in the minority. It seems most
>> streaming providers will do one of the following:
>>  1. If using .aac segments inside of HLS, use the #EXTINF for text metadata and
>>     use an ID3 tag at the head of the segment for things like timing metadata.
>>  2. If streaming raw AAC over HTTP, use Icy metadata.
>>
>> Aside from comments on the code, I'd be interested in any opinion about whether
>> this is something that should or should not be supported in libavformat.
>>
>> Thanks,
>>
>> -Richard
>>
>>  libavformat/aacdec.c | 45 +++++++++++++++++++++++++++++++++++++++++++--
>>  1 file changed, 43 insertions(+), 2 deletions(-)
>>
>> diff --git a/libavformat/aacdec.c b/libavformat/aacdec.c
>> index 36d558ff54..5ec706bdc7 100644
>> --- a/libavformat/aacdec.c
>> +++ b/libavformat/aacdec.c
>> @@ -22,8 +22,10 @@
>>
>>  #include "libavutil/intreadwrite.h"
>>  #include "avformat.h"
>> +#include "avio_internal.h"
>>  #include "internal.h"
>>  #include "id3v1.h"
>> +#include "id3v2.h"
>>  #include "apetag.h"
>>
>>  #define ADTS_HEADER_SIZE 7
>> @@ -116,13 +118,52 @@ static int adts_aac_read_header(AVFormatContext *s)
>>      return 0;
>>  }
>>
>> +static int handle_id3(AVFormatContext *s, AVPacket *pkt)
>> +{
>> +    AVDictionary *metadata = NULL;
>> +    AVIOContext ioctx;
>> +    ID3v2ExtraMeta *id3v2_extra_meta = NULL;
>> +    int ret;
>> +
>> +    ret = av_append_packet(s->pb, pkt, ff_id3v2_tag_len(pkt->data) - pkt->size);
>> +    if (ret < 0) {
>> +        av_packet_unref(pkt);
>> +        return ret;
>> +    }
>> +
>> +    ffio_init_context(&ioctx, pkt->data, pkt->size, 0, NULL, NULL, NULL, NULL);
>> +    ff_id3v2_read_dict(&ioctx, &metadata, ID3v2_DEFAULT_MAGIC, &id3v2_extra_meta);
>> +    if ((ret = ff_id3v2_parse_priv_dict(&metadata, &id3v2_extra_meta)) < 0)
>> +        goto error;
>> +
>> +    if (metadata) {
>> +        if ((ret = av_dict_copy(&s->metadata, metadata, 0)) < 0)
>> +            goto error;
>
> AFAIK that would merge the existing metadata with the new one (i.e. not
> delete entries that are in the new metadata). But not sure if this is
> intended, or how exactly it should work. Intuitively I'd say it should
> completely replace previous ID3v2s.

That's right:
* If new metadata has a key that already exists in metadata, the old
value will be replaced.
* If new metadata has a key that doesn't exist in metadata, it will be added.
* Any other keys/values already in old metadata will be preserved.

This seems to be the behavior of Icy as well as other demuxers that
support updates. I think it's slightly better to update this way
rather than completely replacing the entire dictionary, but I don't
feel too strongly about it.

>
>> +        s->event_flags |= AVFMT_EVENT_FLAG_METADATA_UPDATED;
>> +    }
>> +
>> +error:
>> +    av_packet_unref(pkt);
>> +    ff_id3v2_free_extra_meta(&id3v2_extra_meta);
>> +    av_dict_free(&metadata);
>> +
>> +    return ret;
>> +}
>> +
>>  static int adts_aac_read_packet(AVFormatContext *s, AVPacket *pkt)
>>  {
>>      int ret, fsize;
>>
>> -    ret = av_get_packet(s->pb, pkt, ADTS_HEADER_SIZE);
>> +    ret = av_get_packet(s->pb, pkt, FFMAX(ID3v2_HEADER_SIZE, ADTS_HEADER_SIZE));
>> +
>> +    if (ret >= ID3v2_HEADER_SIZE && ff_id3v2_match(pkt->data, ID3v2_DEFAULT_MAGIC)) {
>> +        if ((ret = handle_id3(s, pkt)) >= 0)
>> +            ret = av_get_packet(s->pb, pkt, ADTS_HEADER_SIZE);
>> +    }
>> +
>>      if (ret < 0)
>>          return ret;
>> +
>>      if (ret < ADTS_HEADER_SIZE) {
>>          av_packet_unref(pkt);
>>          return AVERROR(EIO);
>> @@ -139,7 +180,7 @@ static int adts_aac_read_packet(AVFormatContext *s, AVPacket *pkt)
>>          return AVERROR_INVALIDDATA;
>>      }
>>
>> -    ret = av_append_packet(s->pb, pkt, fsize - ADTS_HEADER_SIZE);
>> +    ret = av_append_packet(s->pb, pkt, fsize - pkt->size);
>>      if (ret < 0)
>>          av_packet_unref(pkt);
>>
>
> I think that's the right approach. The demuxer should filter out such
> tags, and exporting them to the API user is a good idea too.
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel at ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel