[FFmpeg-devel] [PATCH 2/2] avcodec/h264_metadata: Add feature to repeat x264 SEI

Mark Thompson sw at jkqxz.net
Tue Sep 24 18:55:15 EEST 2019


On 11/09/2019 23:18, Andreas Rheinhardt wrote:
> x264 adds a user data unregistered SEI containing its version number and
> encoding parameters to the first access unit. This SEI is mostly
> informative, but it can also be used by a decoder (like FFmpeg's H.264
> decoder) to decide whether to use workarounds for certain bugs in old
> versions of x264. This in particular affects 4:4:4 content with the 8x8dct
> option set. For such files it is imperative to know the x264 version and
> there were scenarios where this information would be lost or not
> gathered: The former can happen when the file is split, the latter can
> happen when one doesn't decode from the very beginning.
> 
> Therefore this commit adds a feature to insert the x264 SEI in every
> keyframe, but only to access units that don't already contain an x264
> SEI. Changing x264 SEIs (which can be produced by concatenating x264
> encodes) are supported.
> 
> Fixes ticket #8126.
> 
> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt at gmail.com>
> ---
> I vaguely remember that in mp4 the SEI can also be in another
> out-of-band structure. But creating mp4 files with an x264 binary with
> lsmash suppport only had the SEI in the first access unit, so I didn't
> do anything special for this.

I thought the more common case of this problem was when the user had stripped all of the SEI for some reason?  I agree that copying the first SEI fixes the setup for the ticket and related splitting cases, but it confuses the case where the SEI has been stripped because the option looks relevant but it ends up doing nothing.

Perhaps it would be clearer how to use it if it allowed inserting a new x264 SEI with a specific build number chosen by the user?  So maybe something like "x264_sei_build" as an enum with values like "existing" (maybe 0, doing what this patch does), "bad_444_8x8dct" (as 150).

>  doc/bitstream_filters.texi     |  6 +++
>  libavcodec/h264_metadata_bsf.c | 80 ++++++++++++++++++++++++++++++++++
>  2 files changed, 86 insertions(+)
> 
> diff --git a/doc/bitstream_filters.texi b/doc/bitstream_filters.texi
> index 50a1679fc7..5f1eda6369 100644
> --- a/doc/bitstream_filters.texi
> +++ b/doc/bitstream_filters.texi
> @@ -273,6 +273,12 @@ possibly separated by hyphens, and the string can be anything.
>  For example, @samp{086f3693-b7b3-4f2c-9653-21492feee5b8+hello} will
>  insert the string ``hello'' associated with the given UUID.
>  
> + at item x264_sei
> +Repeat the x264 SEI message containing the x264 version and encoding
> +parameters used in every keyframe.  This allows to split the file while
> +retaining this information and also fixes decoding problems with
> +spec-incompliant content created by old versions of x264.

If this is going to stay in its current form then I think this needs to make clear that it will do nothing if the stream does not already contain one of the SEI messages.

> +
>  @item delete_filler
>  Deletes both filler NAL units and filler SEI messages.
>  
> diff --git a/libavcodec/h264_metadata_bsf.c b/libavcodec/h264_metadata_bsf.c
> index 5de74be9d6..94b6934ba3 100644
> --- a/libavcodec/h264_metadata_bsf.c
> +++ b/libavcodec/h264_metadata_bsf.c
> @@ -77,6 +77,9 @@ typedef struct H264MetadataContext {
>  
>      const char *sei_user_data;
>  
> +    H264RawSEIPayload x264_sei;
> +    int x264_sei_status;
> +
>      int delete_filler;
>  
>      int display_orientation;
> @@ -275,6 +278,26 @@ static int h264_metadata_update_sps(AVBSFContext *bsf,
>      return 0;
>  }
>  
> +static int h264_metadata_copy_x264_sei(H264RawSEIPayload *dst,
> +                                       const H264RawSEIPayload *src)
> +{
> +    H264RawSEIUserDataUnregistered *udu = &dst->payload.user_data_unregistered;
> +    AVBufferRef *data_ref;
> +
> +    av_buffer_unref(&udu->data_ref);
> +
> +    data_ref = av_buffer_ref(src->payload.user_data_unregistered.data_ref);
> +    if (!data_ref) {
> +        memset(dst, 0, sizeof(*dst));
> +        return AVERROR(ENOMEM);
> +    }
> +
> +    memcpy(dst, src, sizeof(*dst));
> +    udu->data_ref = data_ref;
> +
> +    return 0;
> +}
> +
>  static int h264_metadata_filter(AVBSFContext *bsf, AVPacket *pkt)
>  {
>      H264MetadataContext *ctx = bsf->priv_data;
> @@ -416,6 +439,58 @@ static int h264_metadata_filter(AVBSFContext *bsf, AVPacket *pkt)
>          }
>      }
>  
> +    if (ctx->x264_sei_status) {
> +        int has_x264_sei = 0;
> +
> +        for (i = 0; i < au->nb_units; i++) {
> +            CodedBitstreamUnit *unit = &au->units[i];
> +            H264RawSEI *sei;
> +
> +            if (unit->type != H264_NAL_SEI)
> +                continue;
> +
> +            sei = unit->content;
> +
> +            for (j = 0; j < sei->payload_count; j++) {
> +                H264RawSEIPayload *payload = &sei->payload[j];
> +                H264RawSEIUserDataUnregistered *udu;
> +                int e, build;
> +
> +                if (payload->payload_type != H264_SEI_TYPE_USER_DATA_UNREGISTERED)
> +                    continue;
> +
> +                udu = &payload->payload.user_data_unregistered;
> +                // The following is safe because udu->data always
> +                // contains zeroed padding at the end.
> +                e = sscanf(udu->data, "x264 - core %d", &build);
> +                if (e == 1 && build > 0) {
> +                    err = h264_metadata_copy_x264_sei(&ctx->x264_sei,
> +                                                      payload);
> +                    if (err < 0) {
> +                        ctx->x264_sei_status = 1;
> +                        goto fail;
> +                    }
> +                    ctx->x264_sei_status = -1;
> +                    has_x264_sei = 1;
> +                }
> +            }
> +        }
> +
> +        if (!has_x264_sei && ctx->x264_sei_status < 0
> +                          && pkt->flags & AV_PKT_FLAG_KEY) {

I'm not sure I really want to trust a key frame marker to tell you where to put this, given the use on raw streams.  Not sure what the right answer is, though - the existing sei_user_data option goes for everything with an SPS, but that's wrong here for cases with only out-of-band extradata.  Is there any better combination of signals we could use?

> +            H264RawSEIPayload payload = { 0 };
> +
> +            err = h264_metadata_copy_x264_sei(&payload,
> +                                              &ctx->x264_sei);
> +            if (err < 0)
> +                goto fail;
> +
> +            err = ff_cbs_h264_add_sei_message(ctx->cbc, au, &payload);
> +            if (err < 0)
> +                goto fail;
> +        }
> +    }
> +
>      if (ctx->delete_filler) {
>          for (i = au->nb_units - 1; i >= 0; i--) {
>              if (au->units[i].type == H264_NAL_FILLER_DATA) {
> @@ -616,6 +691,8 @@ static void h264_metadata_close(AVBSFContext *bsf)
>  {
>      H264MetadataContext *ctx = bsf->priv_data;
>  
> +    av_buffer_unref(&ctx->x264_sei.payload.user_data_unregistered.data_ref);
> +
>      ff_cbs_fragment_free(ctx->cbc, &ctx->access_unit);
>      ff_cbs_close(&ctx->cbc);
>  }
> @@ -684,6 +761,9 @@ static const AVOption h264_metadata_options[] = {
>      { "sei_user_data", "Insert SEI user data (UUID+string)",
>          OFFSET(sei_user_data), AV_OPT_TYPE_STRING, { .str = NULL }, .flags = FLAGS },
>  
> +    { "x264_sei", "Repeat x264 SEI in every keyframe",
> +        OFFSET(x264_sei_status), AV_OPT_TYPE_BOOL, { .i64 = 0 }, 0, 1, FLAGS },
> +
>      { "delete_filler", "Delete all filler (both NAL and SEI)",
>          OFFSET(delete_filler), AV_OPT_TYPE_INT, { .i64 = 0 }, 0, 1, FLAGS},
>  
> 

Thanks,

- Mark


More information about the ffmpeg-devel mailing list