[FFmpeg-devel] [PATCH 0/2] ARIB STD-B24 caption decoding using libaribb24

Jan Ekström jeebjp at gmail.com
Tue Feb 5 19:22:56 EET 2019


On Tue, Feb 5, 2019 at 1:56 AM Carl Eugen Hoyos <ceffmpeg at gmail.com> wrote:
>
> 2019-02-02 15:21 GMT+01:00, Jan Ekström <jeebjp at gmail.com>:
> > On Sat, Feb 2, 2019 at 3:55 PM Carl Eugen Hoyos <ceffmpeg at gmail.com> wrote:
> >>
> >> 2019-02-02 14:31 GMT+01:00, Jan Ekström <jeebjp at gmail.com>:
> >>
> >> > I will proceed to making a FATE test
> >>
> >> You cannot add fate tests for external libraries.
> >>
> >
> > Ouch. I was asked by Clement to make one which is why I wanted to do
> > it. Possibly he didn't understand that a library was being utilized.
> >
> >> Is there no chance of adding the aribb24 code to FFmpeg?
> >> The library looks small although it also contains an mpegts
> >> parser iiuc.
> >>
> >
> > As we generally want code to be LGPLv2.1+ when taking it in, we cannot
> > internalize libaribb24 as it is right now (LGPLv3 as of git master).
>
> We prefer LGPL2 but I don't think LGPL3 is unreasonable, we
> have a configure switch already...
>

If we are "taking in" a library, in my opinion it only makes sense to
do so in a way that lets all of our users utilize it. Personally,
though, LGPLv3 is fine for me.

> > All of the files do contain LGPLv2.1 license headers, so I did ask the
> > author to confirm and/or normalize them if possible, but there has
> > been very little activity around the library recently:
> > https://github.com/nkoriyama/aribb24/issues/9
>
> That seems like a very strong indication we should not link
> against the library.
>

It was picked by VLC, and thus packaged in at least Debian and Ubuntu.

This made it a prime candidate for something that was already
considered "acceptable".

But sure, it's not perfect and as is noted in that thread, VideoLAN
members are thinking of forking it due to lack of response from the
original author.

> >
> > There are currently two main things which aribb24 provides:
> > - Text conversion from the STD-B24 text encoding to UTF-8
> > - Decoding of STD-B24 caption regions into region structures while
> > converting the text into UTF-8 by utilizing the previous functionality
> >
> > The first feature one is available in a not-accepted-upstream iconv
> > module implemented under LGPLv2.1:
> > https://git.linuxtv.org/v4l-utils.git/tree/contrib/gconv/arib-std-b24.c
> > (it also includes encoding support if we want to later add support for
> > writing properly coded metadata into MPEG-TS in ARIB mode)
> >
> > The second feature I have found in an anonymous fork of FFmpeg (it
> > utilizes this custom iconv module for the text conversion):
> > https://github.com/0p1pp1/FFmpeg/blob/isdb-4.0/libavcodec/isdbsubdec.c
> >
> > Now, the problem is that this decoder has been written by a person who
> > is not in touch with upstream, and also seems to utilize an awful lot
> > of customized things in general with regards to ASS and other parts -
> > which is why I initially decided to pick the alternative of using an
> > already utilized/packaged open source library, for which the wrapper
> > would not be of too large size. Additionally, the output of this
> > library could then be compared against what this anonymous decoder
> > generates.
>
> Reasons and reasons for not adding the external dependency imo...
>

My initial idea was to get a reference going with libaribb24, which has
a defined interface, so that I didn't have to look too much into the
"how the sausage is made" part of it. After that I would have started
looking into these two things to see if it made sense to bring them in
in some way, without the functionality being missing altogether from
FFmpeg in the meantime.

To be honest, what I most disliked about your reply is that you do not
hint at all at which direction you would like this to move in. I am
left in the air, guessing what is acceptable to you.

But OK, let's give this some thought in case libaribb24 is not
considered acceptable to link against. We want changes to go in as
bite-sized as possible so that it is simpler for both the author and
the reviewer. Thus, even if we skip the decoder part of this patch
set, the following parts could already be reviewed and merged:
1. AVCodecIDs/descriptors and profiles in lavc (I used to have these
in a separate patch, but I was requested to merge them into the
decoder patch for review).
2. MPEG-TS Demuxing in lavf (It is not like this one is going to
change compared to when the decoder is finished).

This gets us one step forward, and not all of the code has to be
reviewed at once when the following, bigger dump of code is posted.

Then, for the following steps:
1. Text decoding.

The ARIB STD-B24 gconv module is indeed LGPLv2.1+, so we can start
taking it in. Thus the question becomes more... where should it go?
libavutil? Or should we spin up a new library called libbroadcasttext
if we plan on taking in the DVB text format as well (in the future)?
Or is there a common way in the different iconv systems we support to
register new decoding and encoding functions? FATE tests can then be
made against this, and any differences from libaribb24 verified and
either kept or fixed on our side. The review of this component should
be done separately from the subtitle decoding, as while one depends on
the other, they are indeed separate things.

2. Subtitle decoding.

The anonymous decoder needs to be stripped down to its bare bones and
possibly partially rewritten. I would probably not include more
features in the initial version than this first version of the
libaribb24 wrapper has. In a way, I would probably just make being
able to extract the textual data the prime requirement for the first
stage of the decoder. More FATE tests.

3. Subtitle decoding, pt2+.

Bringing in the rest of the features where feasible. Styling,
positioning, vertical text. More FATE tests.

4. Subtitle decoding, ptN.

In-band images and other things that require more fun things for ASS conversion.

To be very honest, I would not want to essentially throw what I've
made so far into the garbage bin, but this would be my basic idea for
getting an internal, LGPLv2.1+ decoder for this format. I cannot
promise that I will be able to do all of this any time soon.

Jan

