[FFmpeg-devel] [PATCH/RFC] subtitles: introduce ASS codec id and use it.

Nicolas George nicolas.george at normalesup.org
Fri Jan 4 11:51:45 CET 2013


Le quartidi 14 nivôse, an CCXXI, Clement Boesch a écrit :
> Currently, we have a AV_CODEC_ID_SSA, which matches the way the ASS/SSA
> markup is muxed in a standalone .ass/.ssa file. This means the AVPacket
> data starts with a "Dialogue:" string, followed by a timing information
> (start and end of the event as string) and a trailing CRLF after each
> line. One packet can contain several lines. We'll refer to this layout
> as "SSA" or "SSA lines".
> 
> In matroska, this markup is not stored as such: it has no "Dialogue:"
> prefix, it contains a ReadOrder field, the timing information is not in
> the payload, and it doesn't contain the trailing CRLF. See [1] for more
> info. We'll refer to this layout as "ASS".
> 
> Since we have only one common codec for both formats, the matroska
> demuxer is constructing an AVPacket following the "SSA lines" format.
> This causes several problems, so it was decided to change this into
> clean ASS packets.
> 
> Some insight about what is changed or unchanged in this commit:
> 
>   CODECS
>   ------
> 
>   - the decoding process still writes "SSA lines" markup inside the ass
>     fields of the subtitles rectangles (sub->rects[n]), which is still
>     the current common way of representing decoded subtitles markup. It
>     is meant to change later.
> 
>   - new ASS codec id
> 
>   - lavc/assdec: the "ass" decoder is renamed into "ssa" (instead of
>     "ass") for consistency with the codec id and allows to add a real
>     ass decoder. This ass decoder receives clean ASS lines (so it starts
>     with a ReadOrder, is followed by the Layer, etc). We make sure this
>     is decoded properly in a new ass-line rectangle of the decoded
>     subtitles (the ssa decoder OTOH is doing a simple straightforward
>     copy). Using the packet timing instead of data string makes sure the
>     ass-line now contains the appropriate timing.
> 
>   - lavc/assenc: just like the ass decoder, the "ssa" encoder is renamed
>     into "ssa" (instead of "ass") for consistency with the codec id, and
>     allows to add a real "ass" encoder.
> 
>     One important thing about this encoder is that it only supports one
>     ass rectangle: we could have put several dialogue events in the
>     AVPacket (separated by a \0 for instance) but this would have cause
>     trouble for the muxer which needs not only the start time, but also
>     the duration: typically, you have merged events with the same start
>     time (stored in the AVPacket->pts) but a different duration. At the
>     moment, only the matroska do the merge with the SSA-line codec.

That looks like exactly what I had in mind, thanks.

>     We will need to make sure all the decoder in the future can't add
>     more than one rectangle (and only one Dialogue line in it
>     obviously).

We will, at some point, need a real solution for codecs that produce several
packets for a frame, or decode several frames from one packet. The problem
is not only present here: PGS subtitles have it, APE audio codec have it,
etc.

> 
>   FORMATS
>   -------
> 
>   - lavf/assenc: the .ass/.ssa muxer can take both SSA and ASS packets.
>     In the case of ASS packets as input, it adds the timing based on the
>     AVPacket pts and duration, and mux it with "Dialogue:", trailing
>     CRLF, etc.
> 
>   - lavf/assdec: unchanged; it currently still only outputs SSA-lines
>     packets.

Ok.

>   - lavf/mkv: adding the codec id ASS makes the demuxer output them by
>     default without any "SSA-lines" reconstruction hack. The muxer is

Looks dangerous, from a compatibility point of view. What I had in mind was
this:

* Print a warning whenever the old codec is used.

* In the transition period, use AVFormatContext.subtitle_codec_id to tell
  the demuxer that new ASS is accepted.

>     updated to take by default ASS packets (and still supports the old
>     SSA packets).
> 
> [1]: http://www.matroska.org/technical/specs/subtitles/ssa.html
> ---
> Patch is still in RFC mode since there are a few FATE failures because of CLRF
> not present in the ASS header... sometimes. I need to investigate a bit more,
> and also run a few more tests since some stuff is left untested. But right now,
> I need some sleep. Comments welcome :)

Did you try to transcode a file with simultaneous events? My guess is it
will fail with the infamous non monotonic timestamps error. We need to find
a way of making the check lax (AVFMT_TS_NONSTRICT) for ASS streams.
Hopefully without adding "if (codec_id == ASS)" in the middle of lavf's
framework files.

I believe can propose something for that.

> ---
>  libavcodec/Makefile       |  2 ++
>  libavcodec/allcodecs.c    |  1 +
>  libavcodec/assdec.c       | 58 +++++++++++++++++++++++++++++++++++++++------
>  libavcodec/assenc.c       | 60 ++++++++++++++++++++++++++++++++++++++++++++---
>  libavcodec/avcodec.h      |  1 +
>  libavcodec/codec_desc.c   |  8 ++++++-
>  libavformat/assenc.c      | 35 ++++++++++++++++++++++++---
>  libavformat/matroska.c    |  4 ++++
>  libavformat/matroskadec.c |  3 ++-
>  libavformat/matroskaenc.c |  2 +-
>  10 files changed, 158 insertions(+), 16 deletions(-)
> 
> diff --git a/libavcodec/Makefile b/libavcodec/Makefile
> index 27c9a11..2a1e47f 100644
> --- a/libavcodec/Makefile
> +++ b/libavcodec/Makefile
> @@ -109,6 +109,8 @@ OBJS-$(CONFIG_AMV_ENCODER)             += mjpegenc.o mjpeg.o           \
>  OBJS-$(CONFIG_ANM_DECODER)             += anm.o
>  OBJS-$(CONFIG_ANSI_DECODER)            += ansi.o cga_data.o
>  OBJS-$(CONFIG_APE_DECODER)             += apedec.o
> +OBJS-$(CONFIG_SSA_DECODER)             += assdec.o ass.o ass_split.o
> +OBJS-$(CONFIG_SSA_ENCODER)             += assenc.o ass.o
>  OBJS-$(CONFIG_ASS_DECODER)             += assdec.o ass.o ass_split.o
>  OBJS-$(CONFIG_ASS_ENCODER)             += assenc.o ass.o
>  OBJS-$(CONFIG_ASV1_DECODER)            += asvdec.o asv.o mpeg12data.o
> diff --git a/libavcodec/allcodecs.c b/libavcodec/allcodecs.c
> index 987b877..29d3e79 100644
> --- a/libavcodec/allcodecs.c
> +++ b/libavcodec/allcodecs.c
> @@ -442,6 +442,7 @@ void avcodec_register_all(void)
>      REGISTER_DECODER(VIMA,              vima);
>  
>      /* subtitles */
> +    REGISTER_ENCDEC (SSA,               ssa);
>      REGISTER_ENCDEC (ASS,               ass);
>      REGISTER_ENCDEC (DVBSUB,            dvbsub);
>      REGISTER_ENCDEC (DVDSUB,            dvdsub);
> diff --git a/libavcodec/assdec.c b/libavcodec/assdec.c
> index d790656..d7ce423 100644
> --- a/libavcodec/assdec.c
> +++ b/libavcodec/assdec.c
> @@ -41,7 +41,15 @@ static av_cold int ass_decode_init(AVCodecContext *avctx)
>      return 0;
>  }
>  
> -static int ass_decode_frame(AVCodecContext *avctx, void *data, int *got_sub_ptr,
> +static int ass_decode_close(AVCodecContext *avctx)
> +{
> +    ff_ass_split_free(avctx->priv_data);
> +    avctx->priv_data = NULL;
> +    return 0;
> +}
> +
> +#if CONFIG_SSA_DECODER
> +static int ssa_decode_frame(AVCodecContext *avctx, void *data, int *got_sub_ptr,
>                              AVPacket *avpkt)
>  {
>      const char *ptr = avpkt->data;
> @@ -64,19 +72,55 @@ static int ass_decode_frame(AVCodecContext *avctx, void *data, int *got_sub_ptr,
>      return avpkt->size;
>  }
>  
> -static int ass_decode_close(AVCodecContext *avctx)
> +AVCodec ff_ssa_decoder = {
> +    .name         = "ssa",
> +    .long_name    = NULL_IF_CONFIG_SMALL("SSA (SubStation Alpha) subtitle"),
> +    .type         = AVMEDIA_TYPE_SUBTITLE,
> +    .id           = AV_CODEC_ID_SSA,
> +    .init         = ass_decode_init,
> +    .decode       = ssa_decode_frame,
> +    .close        = ass_decode_close,
> +};
> +#endif
> +
> +#if CONFIG_ASS_DECODER
> +static int ass_decode_frame(AVCodecContext *avctx, void *data, int *got_sub_ptr,
> +                            AVPacket *avpkt)
>  {
> -    ff_ass_split_free(avctx->priv_data);
> -    avctx->priv_data = NULL;
> -    return 0;
> +    AVSubtitle *sub = data;
> +    const char *ptr = avpkt->data;

> +    const int ts_start    = av_rescale_q(avpkt->pts,      avctx->time_base, (AVRational){1,100});
> +    const int ts_duration = av_rescale_q(avpkt->duration, avctx->time_base, (AVRational){1,100});

static const AVRational ass_time_base = { 1, 100 };
and use it everywhere?

Also: check for overflows?

> +
> +    if (avpkt->size <= 0)
> +        return avpkt->size;
> +

> +    /* ReadOrder */
> +    while (*ptr && *ptr != ',')
> +        ptr++;
> +    if (*ptr == ',')
> +        ptr++;

strchr? And fail if it is not present?

> +
> +    /* Layer or Marked */
> +    //layer = ptr;
> +    while (*ptr && *ptr != ',')
> +        ptr++;

Factor that out?

> +
> +    if (*ptr == ',')

Error report?

> +        // FIXME: honor layer
> +        ff_ass_add_rect(sub, ptr, ts_start, ts_duration, 0);
> +
> +    *got_sub_ptr = avpkt->size > 0;
> +    return avpkt->size;
>  }
>  
>  AVCodec ff_ass_decoder = {
>      .name         = "ass",
> -    .long_name    = NULL_IF_CONFIG_SMALL("SSA (SubStation Alpha) subtitle"),
> +    .long_name    = NULL_IF_CONFIG_SMALL("ASS (Advanced SubStation Alpha) subtitle"),
>      .type         = AVMEDIA_TYPE_SUBTITLE,
> -    .id           = AV_CODEC_ID_SSA,
> +    .id           = AV_CODEC_ID_ASS,
>      .init         = ass_decode_init,
>      .decode       = ass_decode_frame,
>      .close        = ass_decode_close,
>  };
> +#endif
> diff --git a/libavcodec/assenc.c b/libavcodec/assenc.c
> index 50b89c0..781d39a 100644
> --- a/libavcodec/assenc.c
> +++ b/libavcodec/assenc.c
> @@ -22,10 +22,16 @@
>  #include <string.h>
>  
>  #include "avcodec.h"
> +#include "ass_split.h"
> +#include "ass.h"
>  #include "libavutil/avstring.h"
>  #include "libavutil/internal.h"
>  #include "libavutil/mem.h"
>  
> +typedef struct {
> +    int id; ///< current event id, ReadOrder field
> +} ASSEncodeContext;
> +
>  static av_cold int ass_encode_init(AVCodecContext *avctx)
>  {
>      avctx->extradata = av_malloc(avctx->subtitle_header_size + 1);
> @@ -41,15 +47,48 @@ static int ass_encode_frame(AVCodecContext *avctx,
>                              unsigned char *buf, int bufsize,
>                              const AVSubtitle *sub)
>  {
> +    ASSEncodeContext *s = avctx->priv_data;
>      int i, len, total_len = 0;
>  
>      for (i=0; i<sub->num_rects; i++) {
> +        char ass_line[2048];
> +        const char *ass = sub->rects[i]->ass;
> +
>          if (sub->rects[i]->type != SUBTITLE_ASS) {
>              av_log(avctx, AV_LOG_ERROR, "Only SUBTITLE_ASS type supported.\n");
>              return -1;
>          }
>  
> -        len = av_strlcpy(buf+total_len, sub->rects[i]->ass, bufsize-total_len);
> +        if (strncmp(ass, "Dialogue: ", 10)) {
> +            av_log(avctx, AV_LOG_ERROR, "AVSubtitle rectangle ass \"%s\""
> +                   " does not look like a SSA markup\n", ass);
> +            return AVERROR_INVALIDDATA;
> +        }
> +
> +        if (avctx->codec->id == AV_CODEC_ID_ASS) {
> +            long int layer;
> +            char *p;
> +

> +            if (i > 0) {
> +                av_log(avctx, AV_LOG_WARNING, "ASS encoder supports only one "
> +                       "ASS rectangle field. The following event will not be "
> +                       "encoded: \"%s\"\n", ass);
> +                continue;
> +            }

Please fail: discarding users' data in the middle of an encoding with just a
warning is evil.

> +
> +            ass += 10; // skip "Dialogue: "
> +            /* parse Layer field. If it's a Marked field, the content
> +             * will be "Marked=N" instead of the layer num, so we will
> +             * have layer=0, which is fine. */
> +            layer = strtol(ass, &p, 10);
> +            if (*p) p += strcspn(p, ",") + 1; // skip layer or marked
> +            if (*p) p += strcspn(p, ",") + 1; // skip start timestamp
> +            if (*p) p += strcspn(p, ",") + 1; // skip end timestamp

> +            snprintf(ass_line, sizeof(ass_line), "%d,%ld,%s", ++s->id, layer, p);

Missing overflow check.

> +            ass_line[strcspn(ass_line, "\r\n")] = 0;
> +            ass = ass_line;
> +        }
> +        len = av_strlcpy(buf+total_len, ass, bufsize-total_len);
>  
>          if (len > bufsize-total_len-1) {
>              av_log(avctx, AV_LOG_ERROR, "Buffer too small for ASS event.\n");
> @@ -62,11 +101,26 @@ static int ass_encode_frame(AVCodecContext *avctx,
>      return total_len;
>  }
>  
> -AVCodec ff_ass_encoder = {
> -    .name         = "ass",
> +#if CONFIG_SSA_ENCODER
> +AVCodec ff_ssa_encoder = {
> +    .name         = "ssa",
>      .long_name    = NULL_IF_CONFIG_SMALL("SSA (SubStation Alpha) subtitle"),
>      .type         = AVMEDIA_TYPE_SUBTITLE,
>      .id           = AV_CODEC_ID_SSA,
>      .init         = ass_encode_init,
>      .encode_sub   = ass_encode_frame,
> +    .priv_data_size = sizeof(ASSEncodeContext),
> +};
> +#endif
> +
> +#if CONFIG_ASS_ENCODER
> +AVCodec ff_ass_encoder = {
> +    .name         = "ass",
> +    .long_name    = NULL_IF_CONFIG_SMALL("ASS (Advanced SubStation Alpha) subtitle"),
> +    .type         = AVMEDIA_TYPE_SUBTITLE,
> +    .id           = AV_CODEC_ID_ASS,
> +    .init         = ass_encode_init,
> +    .encode_sub   = ass_encode_frame,
> +    .priv_data_size = sizeof(ASSEncodeContext),
>  };
> +#endif
> diff --git a/libavcodec/avcodec.h b/libavcodec/avcodec.h
> index 30f153c..1276859 100644
> --- a/libavcodec/avcodec.h
> +++ b/libavcodec/avcodec.h
> @@ -470,6 +470,7 @@ enum AVCodecID {
>      AV_CODEC_ID_MPL2       = MKBETAG('M','P','L','2'),
>      AV_CODEC_ID_VPLAYER    = MKBETAG('V','P','l','r'),
>      AV_CODEC_ID_PJS        = MKBETAG('P','h','J','S'),
> +    AV_CODEC_ID_ASS        = MKBETAG('A','S','S',' '),  ///< ASS such as muxed in Matroska (no timings info for instance)

"such as" does not sound right here.

>  
>      /* other specific kind of codecs (generally used for attachments) */
>      AV_CODEC_ID_FIRST_UNKNOWN = 0x18000,           ///< A dummy ID pointing at the start of various fake codecs.
> diff --git a/libavcodec/codec_desc.c b/libavcodec/codec_desc.c
> index d721780..84306c0 100644
> --- a/libavcodec/codec_desc.c
> +++ b/libavcodec/codec_desc.c
> @@ -2363,10 +2363,16 @@ static const AVCodecDescriptor codec_descriptors[] = {
>          .long_name = NULL_IF_CONFIG_SMALL("XSUB"),
>      },
>      {
> +        .id        = AV_CODEC_ID_ASS,
> +        .type      = AVMEDIA_TYPE_SUBTITLE,
> +        .name      = "ass",
> +        .long_name = NULL_IF_CONFIG_SMALL("ASS (Advanced SSA) subtitle"),
> +    },
> +    {
>          .id        = AV_CODEC_ID_SSA,
>          .type      = AVMEDIA_TYPE_SUBTITLE,
>          .name      = "ssa",
> -        .long_name = NULL_IF_CONFIG_SMALL("SSA (SubStation Alpha) / ASS (Advanced SSA) subtitle"),
> +        .long_name = NULL_IF_CONFIG_SMALL("SSA (SubStation Alpha) subtitle"),
>      },
>      {
>          .id        = AV_CODEC_ID_MOV_TEXT,
> diff --git a/libavformat/assenc.c b/libavformat/assenc.c
> index bda507d..b42b9da 100644
> --- a/libavformat/assenc.c
> +++ b/libavformat/assenc.c
> @@ -20,9 +20,11 @@
>   */
>  
>  #include "avformat.h"
> +#include "internal.h"
>  
>  typedef struct ASSContext{
>      unsigned int extra_index;
> +    int write_ts; // 0: ssa (timing in payload), 1: ass (matroska like)
>  }ASSContext;
>  
>  static int write_header(AVFormatContext *s)
> @@ -31,10 +33,13 @@ static int write_header(AVFormatContext *s)
>      AVCodecContext *avctx= s->streams[0]->codec;
>      uint8_t *last= NULL;
>  
> -    if(s->nb_streams != 1 || avctx->codec_id != AV_CODEC_ID_SSA){
> +    if (s->nb_streams != 1 || (avctx->codec_id != AV_CODEC_ID_SSA &&
> +                               avctx->codec_id != AV_CODEC_ID_ASS)) {
>          av_log(s, AV_LOG_ERROR, "Exactly one ASS/SSA stream is needed.\n");
>          return -1;
>      }
> +    ass->write_ts = avctx->codec_id == AV_CODEC_ID_ASS;
> +    avpriv_set_pts_info(s->streams[0], 64, 1, 100);
>  
>      while(ass->extra_index < avctx->extradata_size){
>          uint8_t *p  = avctx->extradata + ass->extra_index;
> @@ -57,7 +62,31 @@ static int write_header(AVFormatContext *s)
>  
>  static int write_packet(AVFormatContext *s, AVPacket *pkt)
>  {
> -    avio_write(s->pb, pkt->data, pkt->size);
> +    ASSContext *ass = s->priv_data;
> +
> +    if (ass->write_ts) {
> +        long int layer;
> +        char *p;
> +        int64_t start = pkt->pts;
> +        int64_t end   = start + pkt->duration;
> +        int hh1, mm1, ss1, ms1;
> +        int hh2, mm2, ss2, ms2;
> +
> +        p = pkt->data + strcspn(pkt->data, ",") + 1; // skip ReadOrder
> +        layer = strtol(p, &p, 10);
> +        if (*p == ',')
> +            p++;
> +        hh1 = (int)(start / 360000);    mm1 = (int)(start / 6000) % 60;
> +        hh2 = (int)(end   / 360000);    mm2 = (int)(end   / 6000) % 60;
> +        ss1 = (int)(start / 100) % 60;  ms1 = (int)(start % 100);
> +        ss2 = (int)(end   / 100) % 60;  ms2 = (int)(end   % 100);
> +        if (hh1 > 9) hh1 = 9, mm1 = 59, ss1 = 59, ms1 = 99;
> +        if (hh2 > 9) hh2 = 9, mm2 = 59, ss2 = 59, ms2 = 99;
> +        avio_printf(s->pb, "Dialogue: %ld,%d:%02d:%02d.%02d,%d:%02d:%02d.%02d,%s\r\n",
> +                    layer, hh1, mm1, ss1, ms1, hh2, mm2, ss2, ms2, p);
> +    } else {
> +        avio_write(s->pb, pkt->data, pkt->size);
> +    }
>  
>      avio_flush(s->pb);
>  
> @@ -81,7 +110,7 @@ AVOutputFormat ff_ass_muxer = {
>      .mime_type      = "text/x-ssa",
>      .extensions     = "ass,ssa",
>      .priv_data_size = sizeof(ASSContext),
> -    .subtitle_codec = AV_CODEC_ID_SSA,
> +    .subtitle_codec = AV_CODEC_ID_ASS,
>      .write_header   = write_header,
>      .write_packet   = write_packet,
>      .write_trailer  = write_trailer,
> diff --git a/libavformat/matroska.c b/libavformat/matroska.c
> index 64d0a45..37c56b9 100644
> --- a/libavformat/matroska.c
> +++ b/libavformat/matroska.c
> @@ -57,6 +57,10 @@ const CodecTags ff_mkv_codec_tags[]={
>      {"S_TEXT/UTF8"      , AV_CODEC_ID_TEXT},
>      {"S_TEXT/UTF8"      , AV_CODEC_ID_SRT},
>      {"S_TEXT/ASCII"     , AV_CODEC_ID_TEXT},
> +    {"S_TEXT/ASS"       , AV_CODEC_ID_ASS},
> +    {"S_TEXT/SSA"       , AV_CODEC_ID_ASS},
> +    {"S_ASS"            , AV_CODEC_ID_ASS},
> +    {"S_SSA"            , AV_CODEC_ID_ASS},
>      {"S_TEXT/ASS"       , AV_CODEC_ID_SSA},
>      {"S_TEXT/SSA"       , AV_CODEC_ID_SSA},
>      {"S_ASS"            , AV_CODEC_ID_SSA},
> diff --git a/libavformat/matroskadec.c b/libavformat/matroskadec.c
> index feb7b84..de028e3 100644
> --- a/libavformat/matroskadec.c
> +++ b/libavformat/matroskadec.c
> @@ -1792,7 +1792,8 @@ static int matroska_read_header(AVFormatContext *s)
>              st->need_parsing = AVSTREAM_PARSE_HEADERS;
>          } else if (track->type == MATROSKA_TRACK_TYPE_SUBTITLE) {
>              st->codec->codec_type = AVMEDIA_TYPE_SUBTITLE;
> -            if (st->codec->codec_id == AV_CODEC_ID_SSA)
> +            if (st->codec->codec_id == AV_CODEC_ID_SSA ||
> +                st->codec->codec_id == AV_CODEC_ID_ASS)
>                  matroska->contains_ssa = 1;
>          }
>      }
> diff --git a/libavformat/matroskaenc.c b/libavformat/matroskaenc.c
> index 1840f90..e90cf04 100644
> --- a/libavformat/matroskaenc.c
> +++ b/libavformat/matroskaenc.c
> @@ -1369,7 +1369,7 @@ AVOutputFormat ff_matroska_muxer = {
>      .write_trailer     = mkv_write_trailer,
>      .flags             = AVFMT_GLOBALHEADER | AVFMT_VARIABLE_FPS |
>                           AVFMT_TS_NONSTRICT,
> -    .subtitle_codec    = AV_CODEC_ID_SSA,
> +    .subtitle_codec    = AV_CODEC_ID_ASS,
>      .query_codec       = mkv_query_codec,
>  };
>  #endif
> -- 
> 1.8.0.3
> 
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel at ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 198 bytes
Desc: Digital signature
URL: <http://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20130104/e0edfa03/attachment.asc>


More information about the ffmpeg-devel mailing list