[FFmpeg-devel] [PATCH] avcodec/webvttdec: Unescape HTML entities
Clément Bœsch
u at pkh.me
Thu Oct 8 21:46:23 CEST 2015
On Thu, Oct 08, 2015 at 05:20:52PM +0100, Ricardo Constantino wrote:
> Also fixes adjacent tags not being parsed correctly.
>
> Signed-off-by: Ricardo Constantino <wiiaboo at gmail.com>
> ---
> libavcodec/webvttdec.c | 13 +++++++++++--
> 1 file changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/libavcodec/webvttdec.c b/libavcodec/webvttdec.c
> index 1284a17..dec4105 100644
> --- a/libavcodec/webvttdec.c
> +++ b/libavcodec/webvttdec.c
> @@ -37,11 +37,14 @@ static const struct {
> {"<b>", "{\\b1}"}, {"</b>", "{\\b0}"},
> {"<u>", "{\\u1}"}, {"</u>", "{\\u0}"},
> {"{", "\\{"}, {"}", "\\}"}, // escape to avoid ASS markup conflicts
> + {"&gt;", ">"}, {"&lt;", "<"},
> + {"&lrm;", ""}, {"&rlm;", ""}, // FIXME: properly honor bidi marks
> + {"&amp;", "&"}, {"&nbsp;", " "},
> };
>
> static int webvtt_event_to_ass(AVBPrint *buf, const char *p)
> {
> - int i, skip = 0;
> + int i, skip, again = 0;
>
> while (*p) {
>
> @@ -51,13 +54,19 @@ static int webvtt_event_to_ass(AVBPrint *buf, const char *p)
> if (!strncmp(p, from, len)) {
> av_bprintf(buf, "%s", webvtt_tag_replace[i].to);
> p += len;
> + again = 1;
> break;
> }
> }
> if (!*p)
> break;
> + if (again) {
> + again = 0;
> + skip = 0;
> + continue;
> + }
>
> - if (*p == '<')
> + if (*p == '<' || *p == '&')
> skip = 1;
> else if (*p == '>')
I think you need to make the ';' stop skipping. Otherwise my guess is that
something like "Hello Ben&Jerry" is going to eat Jerry.
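
To illustrate the concern, here is a minimal standalone sketch of the skip
logic, not the actual FFmpeg code. The function name strip_markup, the
semicolon_ends_skip flag and the "&foo;" entity are made up for the example,
and the tag replacement table is left out entirely.

```c
#include <stddef.h>

/*
 * Simplified stand-in for the skip logic in the quoted hunk (hypothetical,
 * not FFmpeg code). '<' or '&' starts skipping and '>' ends it. When
 * semicolon_ends_skip is non-zero, a ';' seen while skipping ends it too.
 */
static void strip_markup(const char *p, char *out, size_t out_size,
                         int semicolon_ends_skip)
{
    size_t n = 0;
    int skip = 0;

    if (!out_size)
        return;

    for (; *p; p++) {
        if (*p == '<' || *p == '&') {
            skip = 1;                       /* start of a tag or entity */
        } else if (*p == '>' ||
                   (semicolon_ends_skip && skip && *p == ';')) {
            skip = 0;                       /* end of a tag or entity */
        } else if (!skip && n + 1 < out_size) {
            out[n++] = *p;                  /* plain text is copied as-is */
        }
    }
    out[n] = '\0';
}
```

With "Ben&foo;Jerry" as input, the sketch yields "Ben" when only '>' ends
skipping and "BenJerry" when ';' ends it as well. A bare '&' with no ';'
after it still drops the rest of the line in both modes, so that case would
need separate handling.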
[...]
--
Clément B.