[FFmpeg-devel] [PATCH]lavf/mpegts: Convert service name and service provider to utf-8

Jan Ekström jeebjp at gmail.com
Sun Feb 10 01:10:47 EET 2019


On Sat, Feb 9, 2019 at 7:39 PM Carl Eugen Hoyos <ceffmpeg at gmail.com> wrote:
>
> 2019-02-09 17:42 GMT+01:00, Marton Balint <cus at passwd.hu>:
> > On Sat, 9 Feb 2019, Carl Eugen Hoyos wrote:
> >
> >> From 9033f0a18727a7a576c4cc06b9985d6d922d46ad Mon Sep 17 00:00:00 2001
> >> From: Carl Eugen Hoyos <ceffmpeg at gmail.com>
> >> Date: Sat, 9 Feb 2019 00:49:51 +0100
> >> Subject: [PATCH] lavf/mpegts: Convert service_name and service_provider to
> >>  utf-8.
> >>
> >> Fixes ticket #6320.
> >> ---
> >>  libavformat/mpegts.c |   48
> >> ++++++++++++++++++++++++++++++++++++++++++++++++
> >>  1 file changed, 48 insertions(+)
> >>
> >> diff --git a/libavformat/mpegts.c b/libavformat/mpegts.c
> >> index b04fd7b..1e27500 100644
> >> --- a/libavformat/mpegts.c
> >> +++ b/libavformat/mpegts.c
> >> @@ -37,6 +37,9 @@
> >>  #include "avio_internal.h"
> >>  #include "mpeg.h"
> >>  #include "isom.h"
> >> +#if CONFIG_ICONV
> >> +#include <iconv.h>
> >> +#endif
> >>
> >>  /* maximum size in which we look for synchronization if
> >>   * synchronization is lost */
> >> @@ -674,6 +677,51 @@ static char *getstr8(const uint8_t **pp, const
> >> uint8_t *p_end)
> >>          return NULL;
> >>      if (len > p_end - p)
> >>          return NULL;
> >> +#if CONFIG_ICONV
> >> +    if (len && *p < 0x20) {
> >> +        char iso8859[] = "ISO-8859-00";
> >> +        const char *encodings[] = {
> >> +            "ISO6937", "ISO-8859-5", "ISO-8859-6", "ISO-8859-7",
> >> "ISO-8859-8",
> >> +            "ISO-8859-9", "ISO-8859-10", "ISO-8859-11", "",
> >> "ISO-8859-13",
> >> +            "ISO-8859-14", "ISO-8859-15", "", "", "", "",
> >> +            "", "ISO-10646", "KSC_5601", "GB2312", "ISO-10646", "UTF-8",
> >> "",
> >> +            "", "", "", "", "", "", "", "", ""
> >> +        };
> >> +        iconv_t cd;
> >> +        char *in, *out;
> >> +        size_t inlen = len - 1, outlen = inlen * 6 + 1;
> >> +        if (len >= 3 && p[0] == 0x10 && !p[1] && p[2] && p[2] <= 0xf &&
> >> p[2] != 0xc) {
> >> +            if (p[2] < 10) {
> >> +                iso8859[9] += p[2];
> >> +                iso8859[10] = 0;
> >> +            } else {
> >> +                iso8859[9]++;
> >> +                iso8859[10] += p[2] - 10;
> >> +            }
> >
> > I think this would be much more readable:
> >
> > char iso8859[16];
> > snprintf(iso8859, sizeof(iso8859), "ISO-8859-%d", p[2]);
>
> Definitely, new patch attached.
>
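
For reference, the snprintf() approach suggested above can be sketched as a
small standalone helper. This is only an illustration: the function name
dvb_iso8859_name() and its return convention are hypothetical and do not
appear in the patch; the range check mirrors the condition in the quoted
hunk (prefix 0x10 0x00 N, with N in 1..15 and N != 12).

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper, not part of the patch: build the charset name
 * selected by the three-byte prefix 0x10 0x00 N at the start of a string.
 * Returns 0 on success, -1 if the prefix is absent, N is rejected,
 * or the output buffer is too small.
 */
static int dvb_iso8859_name(const uint8_t *p, size_t len,
                            char *buf, size_t buf_size)
{
    int n;

    if (len < 3 || p[0] != 0x10 || p[1] != 0x00)
        return -1;
    /* Same range check as the quoted hunk: 1..15, excluding 12. */
    if (p[2] == 0 || p[2] > 0x0f || p[2] == 0x0c)
        return -1;

    /* snprintf() handles one- and two-digit parts without manual
     * character arithmetic on a template string. */
    n = snprintf(buf, buf_size, "ISO-8859-%d", p[2]);
    if (n < 0 || (size_t)n >= buf_size)
        return -1;
    return 0;
}
```

With a 16-byte buffer, as in the suggestion, the longest possible result
("ISO-8859-15" plus the terminator) fits with room to spare.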

Idea-wise I like this. We generally try to promise that our metadata
is UTF-8, but with broadcast content we haven't really kept that
promise :) . This fixes quite a bit of that, which is nice.

I checked against the samples I have on hand, and this does not seem
to break my planned integration of ARIB STD-B24 text decoding to
UTF-8. It only changes where I'll have to hook that in, so as not to
do a double conversion.

In other words, good work.

Jan

