[FFmpeg-devel] [PATCH v2 8/8] aacdec: add a decoder for AAC USAC (xHE-AAC)

Lynne dev at lynne.ee
Tue May 21 22:52:29 EEST 2024


On 21/05/2024 21:40, Marton Balint wrote:
> 
> 
> On Tue, 21 May 2024, Lynne via ffmpeg-devel wrote:
> 
>> On 21/05/2024 09:16, Marton Balint wrote:
>>>
>>>
>>>  On Sun, 19 May 2024, Lynne via ffmpeg-devel wrote:
>>>
>>>>  On 19/05/2024 21:39, Marton Balint wrote:
>>>>>
>>>>>
>>>>>   On Sun, 19 May 2024, Lynne via ffmpeg-devel wrote:
>>>>>
>>>>>>   This commit adds a decoder for the frequency-domain part of USAC.
>>>>>>
>>>>>   [...]
>>>>>
>>>>>>
>>>>>>   +/* Finish later */
>>>>>>   +static const enum AVChannel usac_ch_pos_to_av[64] = {
>>>>>>   +    [0] = AV_CHAN_FRONT_LEFT,
>>>>>>   +    [1] = AV_CHAN_FRONT_RIGHT,
>>>>>>   +    [2] = AV_CHAN_FRONT_CENTER,
>>>>>>   +    [3] = AV_CHAN_LOW_FREQUENCY,
>>>>>>   +    [4] = AV_CHAN_BACK_LEFT, // unsure
>>>>>>   +    [5] = AV_CHAN_BACK_RIGHT, // unsure
>>>>>>   +    [6] = AV_CHAN_FRONT_LEFT_OF_CENTER,
>>>>>>   +    [7] = AV_CHAN_FRONT_RIGHT_OF_CENTER,
>>>>>>   +    [8] = 0, /* rear surround left is missing */
>>>>>>   +    [9] = 0, /* rear surround right is missing */
>>>>>>   +    [10] = AV_CHAN_BACK_CENTER,
>>>>>>   +    [11] = AV_CHAN_SURROUND_DIRECT_LEFT,
>>>>>>   +    [12] = AV_CHAN_SURROUND_DIRECT_RIGHT,
>>>>>>   +    [13] = AV_CHAN_SIDE_LEFT, // fairly sure
>>>>>>   +    [14] = AV_CHAN_SIDE_RIGHT, // fairly sure
>>>>>>   +    [15] = AV_CHAN_WIDE_LEFT, // somewhat confident
>>>>>>   +    [16] = AV_CHAN_WIDE_RIGHT, // somewhat confident
>>>>>>   +    [17] = AV_CHAN_TOP_FRONT_LEFT,
>>>>>>   +    [18] = AV_CHAN_TOP_FRONT_RIGHT,
>>>>>>   +    [19] = AV_CHAN_TOP_FRONT_CENTER,
>>>>>>   +    [20] = AV_CHAN_TOP_BACK_LEFT,
>>>>>>   +    [21] = AV_CHAN_TOP_BACK_RIGHT,
>>>>>>   +    [22] = AV_CHAN_TOP_BACK_CENTER,
>>>>>>   +    [23] = AV_CHAN_TOP_SIDE_LEFT,
>>>>>>   +    [24] = AV_CHAN_TOP_SIDE_RIGHT,
>>>>>>   +    [25] = AV_CHAN_TOP_CENTER,
>>>>>>   +    [26] = AV_CHAN_LOW_FREQUENCY, // actually LFE2
>>>>>>   +    [27] = AV_CHAN_BOTTOM_FRONT_LEFT,
>>>>>>   +    [28] = AV_CHAN_BOTTOM_FRONT_RIGHT,
>>>>>>   +    [29] = AV_CHAN_BOTTOM_FRONT_CENTER,
>>>>>>   +    [30] = 0, /* top left surround is missing */
>>>>>>   +    [31] = 0, /* top right surround is missing */
>>>>>>   +};
>>>>>
>>>>>   Some comment would be nice about the source of this table (which
>>>>>  document,
>>>>>   which table).
>>>>>
>>>>>   It looks very similar to the ISO channel positons used in 
>>>>> mov_chan. I
>>>>>   think we follow this mapping in most cases:
>>>>>
>>>>>   Left  Surround is SIDE_LEFT
>>>>>   Right Surround is SIDE_RIGHT
>>>>>   Rear Surround Left  is BACK_LEFT
>>>>>   Rear Surround Right is BACK_RIGHT
>>>>>
>>>>>   So in your table [4] and [5] should be SIDE, [8] and [9] should be
>>>>>  BACK.
>>>>>   [26] can be AV_CHAN_LOW_FREQUENCY_2, we do have that.
>>>>>
>>>>>   Yes, Left/Right Surround and Left/Right Side Surround will be the 
>>>>> same,
>>>>>   but those are not present in commonly used layouts at the same time.
>>>>>
>>>>>   Regards,
>>>>>   Marton
>>>>>   _______________________________________________
>>>>>   ffmpeg-devel mailing list
>>>>>   ffmpeg-devel at ffmpeg.org
>>>>>   https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>>>>>
>>>>>   To unsubscribe, visit link above, or email
>>>>>   ffmpeg-devel-request at ffmpeg.org with subject "unsubscribe".
>>>>
>>>>  Source of the table is ISO/IEC 23003-3, Table 74 — bsOutputChannelPos:
>>>>
>>>>  0 L left front FL front left
>>>>  1 R right front FR front right
>>>>  2 C center front FC front centre
>>>>  3 LFE low frequency enhancement LFE1 low frequency effects-1
>>>>  4 Ls left surround LS left surround
>>>>  5 Rs right surround RS right surround
>>>>  6 Lc left front center FLc front left centre
>>>>  7 Rc right front center FRc front right centre
>>>>  8 Lsr rear surround left BL back left
>>>>  9 Rsr rear surround right BR back right
>>>>  10 Cs rear center BC back centre
>>>>  11 Lsd left surround direct LSd left surround direct
>>>>  12 Rsd right surround direct RSd right surround direct
>>>>  13 Lss left side surround SL side left
>>>>  14 Rss right side surround SR side right
>>>>  15 Lw left wide front FLw front left wide
>>>>  16 Rw right wide front FRw front right wide
>>>>  17 Lv left front vertical height TpFL top front left
>>>>  18 Rv right front vertical height TpFR top front right
>>>>  19 Cv center front vertical height TpFC top front centre
>>>>  20 Lvr left surround vertical height rear TpBL top back left
>>>>  21 Rvr right surround vertical height rear TpBR top back right
>>>>  22 Cvr center vertical height rear TpBC top back centre
>>>>  23 Lvss left vertical height side surround TpSiL top side left
>>>>  24 Rvss right vertical height side surround TpSiR top side right
>>>>  25 Ts top center surround TpC top centre
>>>>  26 LFE2 low frequency enhancement 2 LFE2 low frequency effects-2
>>>>  27 Lb left front vertical bottom BtFL bottom front left
>>>>  28 Rb right front vertical bottom BtFR bottom front right
>>>>  29 Cb center front vertical bottom BtFC bottom front centre
>>>>  30 Lvs left vertical height surround TpLS top left surround
>>>>  31 Rvs right vertical height surround TpRS top right surround
>>>>
>>>>  Third field is "Loudspeaker position", last field is "Loudspeaker
>>>>  position according to IEC 100/1706/CDV/IEC 62574 (TC100)", each 
>>>> prefixed
>>>>  with an abbreviation.
>>>>
>>>>  I've added the source to the table comment in the code.
>>>>
>>>>  I've also fixed the SIDE/BACK/LFE2 issue in my github repo I linked
>>>>  earlier.
>>>
>>>  Thanks. Later in the code when you actually use this I can see that you
>>>  are creating a native layout:
>>>
>>>>  +    channel_config_idx = get_bits(gb, 5); /* 
>>>> channelConfigurationIndex
>>>>  */
>>>>  +    if (!channel_config_idx) {
>>>>  +        /* UsacChannelConfig() */
>>>>  +        uint8_t channel_pos[64];
>>>>  +        uint8_t nb_channels = get_escaped_value(gb, 5, 8, 16); /*
>>>>  numOutChannels */
>>>>  +        if (nb_channels >= 64)
>>>>  +            return AVERROR(EINVAL);
>>>>  +
>>>>  +        av_channel_layout_uninit(&ac->oc[1].ch_layout);
>>>>  +        for (int i = 0; i < nb_channels; i++)
>>>>  +            channel_pos[i] = get_bits(gb, 5); /* 
>>>> bsOutputChannelPos */
>>>>  +
>>>>  +        ac->oc[1].ch_layout.order = AV_CHANNEL_ORDER_NATIVE;
>>>>  +        ac->oc[1].ch_layout.nb_channels = nb_channels;
>>>>  +        ac->oc[1].ch_layout.u.mask = 0;
>>>>  +
>>>>  +        for (int i = 0; i < nb_channels; i++)
>>>>  +            ac->oc[1].ch_layout.u.mask |= 1 <<
>>>>  usac_ch_pos_to_av[channel_pos[i]];
>>>>  +
>>>>  +        av_channel_layout_copy(&avctx->ch_layout, 
>>>> &ac->oc[1].ch_layout);
>>>>  +    } else {
>>>
>>>  Probably you should create a custom layout here, because the 
>>> channels are
>>>  not necessary in native order. We already have a relatively simple 
>>> way to
>>>  do that and to fall back to native layouts if possible, here is an 
>>> example
>>>  copied from mov_chan:
>>>
>>>  ret = av_channel_layout_custom_init(ch_layout, channels);
>>>  if (ret < 0)
>>>      return ret;
>>>  for (i = 0; i < channels; i++) {
>>>       enum AVChannel id = layout_map[i].id;
>>>       ch_layout->u.map[i].id = (id != AV_CHAN_NONE ? id : 
>>> AV_CHAN_UNKNOWN);
>>> }
>>>  return av_channel_layout_retype(ch_layout, 0,
>>>  AV_CHANNEL_LAYOUT_RETYPE_FLAG_CANONICAL);
>>>
>>>  So you should adapt this accodingly to aac.
>>>
>>>  Regards,
>>>  Marton
>>>  _______________________________________________
>>>  ffmpeg-devel mailing list
>>>  ffmpeg-devel at ffmpeg.org
>>>  https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>>>
>>>  To unsubscribe, visit link above, or email
>>>  ffmpeg-devel-request at ffmpeg.org with subject "unsubscribe".
>>
>> According to the spec:
>>
>>>  In case of multiple channel elements the index i of
>>>  bsOutputChannelPos[i] indicates the position in which the channel
>>>  appears in the bitstream.
>>
>> So the channels will always be in native order, as far as I understand.
>>
> 
> AV_CHANNEL_ORDER_NATIVE expects channels in the order of the channel IDs 
> in the AVChannel enum. Surely that is not the case here, unless the 
> decoder reorders the channels. Or what if an output position appears 
> multiple times?
> 
> Regards,
> Marton
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel at ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> 
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request at ffmpeg.org with subject "unsubscribe".


It should be the case here, we shouldn't need reordering as NATIVE just 
lets you specify what order the elements appear in the bitstream.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: OpenPGP_0xA2FEA5F03F034464.asc
Type: application/pgp-keys
Size: 624 bytes
Desc: OpenPGP public key
URL: <https://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20240521/9543133a/attachment.key>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: OpenPGP_signature.asc
Type: application/pgp-signature
Size: 236 bytes
Desc: OpenPGP digital signature
URL: <https://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20240521/9543133a/attachment.sig>


More information about the ffmpeg-devel mailing list