[FFmpeg-devel] [PATCH v3 1/2] lavc/ccaption_dec: do not ignore repeated character commands

Thierry Foucu tfoucu at gmail.com
Mon Apr 25 20:03:31 CEST 2016


On Tue, Apr 19, 2016 at 10:04 AM, Aman Gupta <aman at tmm1.net> wrote:

> This is a tricky one.. I tried your sample in VLC, and it has the same
> issue as the latest version of ffmpeg.
>
> The previous behavior of ffmpeg can be restored with the following patch:
>
> diff --git a/libavcodec/ccaption_dec.c b/libavcodec/ccaption_dec.c
> index 3b15149..9eff843 100644
> --- a/libavcodec/ccaption_dec.c
> +++ b/libavcodec/ccaption_dec.c
> @@ -712,7 +712,6 @@ static void process_cc608(CCaptionSubContext *ctx, int64_t pts, uint8_t hi, uint
>      } else if (hi >= 0x20) {
>          /* Standard characters (always in pairs) */
>          handle_char(ctx, hi, lo, pts);
> -        ctx->prev_cmd[0] = ctx->prev_cmd[1] = 0;
>      } else {
>          /* Ignoring all other non data code */
>          ff_dlog(ctx, "Unknown command 0x%hhx 0x%hhx\n", hi, lo);
>

Thanks for looking into it.

Do you think it would make sense to have a flag to turn this behavior on/off?
If so, I'm willing to write the patch.
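
To make the idea concrete, here is a rough, standalone sketch of what such a
flag could look like. It is not code from ccaption_dec.c: the struct, the
function names and the dedup_chars flag are all hypothetical, and only the
prev_cmd comparison mirrors what the patch does. With the flag off, prev_cmd
is cleared after every character pair (the current behavior); with the flag
on, a character pair identical to the previous pair is dropped.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the decoder state; only prev_cmd mirrors
 * a field that appears in the patch. */
typedef struct {
    uint8_t prev_cmd[2];
    int     dedup_chars;   /* hypothetical flag, not an existing option */
} Cc608Sketch;

/* Append one character to a NUL-terminated buffer; NUL bytes are skipped. */
static void append_char(char *out, size_t outsz, char c)
{
    size_t len = strlen(out);
    if (c && len + 1 < outsz) {
        out[len]     = c;
        out[len + 1] = '\0';
    }
}

/* Feed one byte pair; printable pairs are appended to out. */
static void sketch_process_pair(Cc608Sketch *s, uint8_t hi, uint8_t lo,
                                char *out, size_t outsz)
{
    assert(s && out && outsz > 0);

    if (hi == s->prev_cmd[0] && lo == s->prev_cmd[1])
        return; /* identical to the previous pair: treat as redundant */

    s->prev_cmd[0] = hi;
    s->prev_cmd[1] = lo;

    if (hi >= 0x20) {
        /* standard characters */
        append_char(out, outsz, (char)hi);
        append_char(out, outsz, (char)lo);
        /* flag off: forget the pair so a repeated character pair is kept */
        if (!s->dedup_chars)
            s->prev_cmd[0] = s->prev_cmd[1] = 0;
    }
    /* control codes emit nothing in this sketch */
}

/* Decode n byte pairs into out, starting from a fresh state. */
static void sketch_decode(const uint8_t (*pairs)[2], size_t n, int dedup_chars,
                          char *out, size_t outsz)
{
    Cc608Sketch s = { {0, 0}, dedup_chars };

    assert(out && outsz > 0);
    out[0] = '\0';
    for (size_t i = 0; i < n; i++)
        sketch_process_pair(&s, pairs[i][0], pairs[i][1], out, outsz);
}
```

Feeding the pairs "HE" "HE" "HE" "LL" "LL" "LL" through sketch_decode() with
the flag on gives "HELL", and with the flag off gives "HEHEHELLLLLL". The
downside is the one Aman describes: with the flag on, a stream that really
means ".." followed by ".." would lose the second pair, so the flag would
have to default to off.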


>
> However, I've encountered a number of samples where the same character is
> legitimately repeated and is not supposed to be skipped.
>
> For instance, the samples on
> http://hackipedia.org/ATSC/EIA-608%20samples/EIA-608%20character%20set%20test/
> (https://www.youtube.com/watch?v=8TZLxPdC3hk) repeat the character "."
> multiple times to show correct spacing.
>
> Similarly, there are an endless number of words in the English language
> with repeated characters, such as "ss" in endless and "rr" in correct.
>
> It is unclear to me how the decoder is supposed to distinguish between
> characters that are meant to be displayed twice, vs streams that repeat
> every ASCII character unconditionally.
>
> Further, in your sample it appears that every command is repeated not
> twice (as is common in many streams for special character sets and other
> commands, see
> http://hackipedia.org/ATSC/EIA-608%20samples/EIA-608%20character%20set%20test/README.TXT),
> but three times.
>
> Aman
>
> On Mon, Apr 18, 2016 at 1:01 PM, Aman Gupta <aman at tmm1.net> wrote:
>
>> Please send me the sample and I will try to fix the issue.
>>
>> Aman
>>
>> On Mon, Apr 18, 2016 at 1:22 PM Thierry Foucu <tfoucu at gmail.com> wrote:
>>
>>> Hi all
>>>
>>> On Sun, Feb 14, 2016 at 6:11 PM, Aman Gupta <ffmpeg at tmm1.net> wrote:
>>>
>>>> From: Aman Gupta <aman at tmm1.net>
>>>>
>>>> control codes in a cc stream can be repeated, and must be ignored.
>>>> however, repeated characters must not be ignored. the code attempted to
>>>> wipe prev_cmd in handle_char to allow repeated characters to be
>>>> processed, but prev_cmd would previously get reset _after_ handle_char()
>>>>
>>>> i also moved the prev_cmd reset out from handle_char() so it can be
>>>> re-used for special character sets, which _must_ be ignored when
>>>> repeated.
>>>> ---
>>>>  libavcodec/ccaption_dec.c | 19 ++++++++++---------
>>>>  1 file changed, 10 insertions(+), 9 deletions(-)
>>>>
>>>> diff --git a/libavcodec/ccaption_dec.c b/libavcodec/ccaption_dec.c
>>>> index 790f071..5fb2ec6 100644
>>>> --- a/libavcodec/ccaption_dec.c
>>>> +++ b/libavcodec/ccaption_dec.c
>>>> @@ -484,9 +484,6 @@ static void handle_char(CCaptionSubContext *ctx, char hi, char lo, int64_t pts)
>>>>      if (ctx->mode != CCMODE_POPON)
>>>>          ctx->screen_touched = 1;
>>>>
>>>> -    /* reset prev command since character can repeat */
>>>> -    ctx->prev_cmd[0] = 0;
>>>> -    ctx->prev_cmd[1] = 0;
>>>>      if (lo)
>>>>         ff_dlog(ctx, "(%c,%c)\n", hi, lo);
>>>>      else
>>>> @@ -497,8 +494,15 @@ static void process_cc608(CCaptionSubContext *ctx, int64_t pts, uint8_t hi, uint
>>>>  {
>>>>      if (hi == ctx->prev_cmd[0] && lo == ctx->prev_cmd[1]) {
>>>>          /* ignore redundant command */
>>>> -    } else if ( (hi == 0x10 && (lo >= 0x40 && lo <= 0x5f)) ||
>>>> -              ( (hi >= 0x11 && hi <= 0x17) && (lo >= 0x40 && lo <= 0x7f) ) ) {
>>>> +        return;
>>>> +    }
>>>> +
>>>> +    /* set prev command */
>>>> +    ctx->prev_cmd[0] = hi;
>>>> +    ctx->prev_cmd[1] = lo;
>>>> +
>>>> +    if ( (hi == 0x10 && (lo >= 0x40 && lo <= 0x5f)) ||
>>>> +       ( (hi >= 0x11 && hi <= 0x17) && (lo >= 0x40 && lo <= 0x7f) ) ) {
>>>>          handle_pac(ctx, hi, lo);
>>>>      } else if ( ( hi == 0x11 && lo >= 0x20 && lo <= 0x2f ) ||
>>>>                  ( hi == 0x17 && lo >= 0x2e && lo <= 0x2f) ) {
>>>> @@ -559,14 +563,11 @@ static void process_cc608(CCaptionSubContext *ctx, int64_t pts, uint8_t hi, uint
>>>>      } else if (hi >= 0x20) {
>>>>          /* Standard characters (always in pairs) */
>>>>          handle_char(ctx, hi, lo, pts);
>>>> +        ctx->prev_cmd[0] = ctx->prev_cmd[1] = 0;
>>>>      } else {
>>>>          /* Ignoring all other non data code */
>>>>          ff_dlog(ctx, "Unknown command 0x%hhx 0x%hhx\n", hi, lo);
>>>>      }
>>>> -
>>>> -    /* set prev command */
>>>> -    ctx->prev_cmd[0] = hi;
>>>> -    ctx->prev_cmd[1] = lo;
>>>>  }
>>>>
>>>>  static int decode(AVCodecContext *avctx, void *data, int *got_sub, AVPacket *avpkt)
>>>> --
>>>> 2.5.3
>>>>
>>>>
>>> This commit seems to break some US broadcast CC decoding. (I can provide
>>> a 15MB sample file if needed)
>>>
>>> Before this commit:
>>> ffmpeg -f lavfi -i "movie=IhedxzUUxNo.ts[out0+subcc]" -map s "ts.srt"
>>> cat ts.srt
>>> 1
>>> 00:00:01,035 --> 00:00:02,035
>>> <font face="Monospace">FAR-RANGING IMPACT PLACES IN THE</font>
>>>
>>> 2
>>> 00:00:02,036 --> 00:00:04,466
>>> <font face="Monospace">FAR-RANGING IMPACT PLACES IN THE
>>> CHURCH AND AT THE LEVEL OF</font>
>>>
>>> 3
>>> 00:00:04,471 --> 00:00:06,811
>>> <font face="Monospace">CHURCH AND AT THE LEVEL OF
>>> POLICY ALL ACROSS THE GLOBE.</font>
>>>
>>> 4
>>> 00:00:06,807 --> 00:00:08,537
>>> <font face="Monospace">POLICY ALL ACROSS THE GLOBE.
>>> CERTAINLY IN THE AREA OF</font>
>>>
>>> 5
>>> 00:00:08,542 --> 00:00:10,382
>>> <font face="Monospace">CERTAINLY IN THE AREA OF
>>> PASTORAL CARE, I HOPE THAT IT</font>
>>>
>>> 6
>>> 00:00:10,377 --> 00:00:14,677
>>> <font face="Monospace">PASTORAL CARE, I HOPE THAT IT
>>> WILL LEAD TO LESS DOGMATICTI</font>
>>>
>>> 7
>>> 00:00:14,682 --> 00:00:16,352
>>> <font face="Monospace">WILL LEAD TO LESS DOGMATICTI
>>> INTERACTION WITH PEOPLE ACROSS A</font>
>>>
>>> 8
>>> 00:00:16,350 --> 00:00:16,920
>>> <font face="Monospace">INTERACTION WITH PEOPLE ACROSS A
>>> THE BOARD.</font>
>>>
>>> 9
>>> 00:00:16,917 --> 00:00:18,287
>>> <font face="Monospace">THE BOARD.
>>> I HOPE THAT, YOU KNOW, THERE'S</font>
>>>
>>> 10
>>> 00:00:18,285 --> 00:00:20,785
>>> <font face="Monospace">I HOPE THAT, YOU KNOW, THERE'S
>>> MORE OF A SENSE THAT THE CHURCH</font>
>>>
>>>
>>>
>>> After that commit,
>>> cat ts.srt
>>> 1
>>> 00:00:01,035 --> 00:00:02,035
>>> <font face="Monospace">FFFARARAR-R-R-RANANANGIGIGINGNGN</font>
>>>
>>> 2
>>> 00:00:02,036 --> 00:00:04,466
>>> <font face="Monospace">FFFARARAR-R-R-RANANANGIGIGINGNGN
>>> CCCHUHUHURCRCRCHHH A A ANNND D D</font>
>>>
>>> 3
>>> 00:00:04,471 --> 00:00:06,811
>>> <font face="Monospace">CCCHUHUHURCRCRCHHH A A ANNND D D
>>> POPOPOLILILICYCYCY A A ALLLL L L</font>
>>>
>>> 4
>>> 00:00:06,807 --> 00:00:08,537
>>> <font face="Monospace">POPOPOLILILICYCYCY A A ALLLL L L
>>> CCCERERERTATATAINININLYLYLY I I </font>
>>>
>>> 5
>>> 00:00:08,542 --> 00:00:10,382
>>> <font face="Monospace">CCCERERERTATATAINININLYLYLY I I
>>> PAPAPASSSTOTOTORARARALLL C C CAA</font>
>>>
>>> 6
>>> 00:00:10,377 --> 00:00:14,677
>>> <font face="Monospace">PAPAPASSSTOTOTORARARALLL C C CAA
>>> WIWIWILLLLLL   LELELEADADAD   TO</font>
>>>
>>> 7
>>> 00:00:14,682 --> 00:00:16,352
>>> <font face="Monospace">WIWIWILLLLLL   LELELEADADAD   TO
>>> INININTETETERARARACTCTCTIOIOION </font>
>>>
>>> 8
>>> 00:00:16,350 --> 00:00:16,920
>>> <font face="Monospace">INININTETETERARARACTCTCTIOIOION
>>> THTHTHE E E BOBOBOAAARDRDRD...</font>
>>>
>>> 9
>>> 00:00:16,917 --> 00:00:18,287
>>> <font face="Monospace">THTHTHE E E BOBOBOAAARDRDRD...
>>> I I I HHHOPOPOPEEE T T THHHATATA</font>
>>>
>>> 10
>>> 00:00:18,285 --> 00:00:20,785
>>> <font face="Monospace">I I I HHHOPOPOPEEE T T THHHATATA
>>> MOMOMORRRE E E OFOFOF A A A   SE</font>
>>>
>>>
>>>
