[FFmpeg-devel] [PATCH] avformat/mov: (v4) fix get_eia608_packet

Pavel Koshevoy pkoshevoy at gmail.com
Fri Feb 14 14:11:15 EET 2025


On Thu, Feb 13, 2025, 22:04 Andreas Rheinhardt <
andreas.rheinhardt at outlook.com> wrote:

> Pavel Koshevoy:
> > The problem is reproducible with "Test for Quicktime 608 CC file.mov"
> > from https://samples.ffmpeg.org/MPEG2/subcc/
> >
> > ffmpeg -i "Test for Quicktime 608 CC file.mov" -map 0 -c copy -y remuxed.mov
> >
> > Prior to the fix QuickTime Player playback of remuxed.mov would
> > render garbage text for "English CC" subtitles.
>
> Is remuxing necessary for there being garbage?
>
> > ---
> >  libavformat/mov.c | 70 +++++++++++++++++++++++++++++++++++++++--------
> >  1 file changed, 59 insertions(+), 11 deletions(-)
> >
> > diff --git a/libavformat/mov.c b/libavformat/mov.c
> > index 85aef33b19..5a91ef5b8c 100644
> > --- a/libavformat/mov.c
> > +++ b/libavformat/mov.c
> > @@ -10788,25 +10788,73 @@ static int mov_change_extradata(AVStream *st, AVPacket *pkt)
> >      return 0;
> >  }
> >
> > -static int get_eia608_packet(AVIOContext *pb, AVPacket *pkt, int size)
> > +static int get_eia608_packet(AVIOContext *pb, AVPacket *pkt, int src_size)
> >  {
> > -    int new_size, ret;
> > +    /* We can't make assumptions about the structure of the payload,
> > +       because it may include multiple cdat and cdt2 samples. */
> > +    const uint32_t cdat = AV_RB32("cdat");
> > +    const uint32_t cdt2 = AV_RB32("cdt2");
>
> I don't think that using (non-variable) variables for these improves
> clarity (e.g. it means that the definition of the actual values used for
> the comparisons below is now further away from its use). Why not simply
> use MKBETAG('c','d','a','t') below?
>
> > +    int ret, out_size = 0;
> >
> > -    if (size <= 8)
> > +    /* a valid payload must have size, 4cc, and at least 1 byte pair: */
> > +    if (src_size < 10)
> >          return AVERROR_INVALIDDATA;
> > -    new_size = ((size - 8) / 2) * 3;
> > -    ret = av_new_packet(pkt, new_size);
> > +
> > +    /* avoid an int overflow: */
> > +    if ((src_size - 8) / 2 >= INT_MAX / 3)
> > +        return AVERROR_INVALIDDATA;
> > +
> > +    ret = av_new_packet(pkt, ((src_size - 8) / 2) * 3);
> >      if (ret < 0)
> >          return ret;
> >
> > -    avio_skip(pb, 8);
> > -    for (int j = 0; j < new_size; j += 3) {
> > -        pkt->data[j] = 0xFC;
> > -        pkt->data[j+1] = avio_r8(pb);
> > -        pkt->data[j+2] = avio_r8(pb);
> > +    /* parse and re-format the c608 payload in one pass. */
> > +    while (src_size >= 10) {
> > +        const uint32_t atom_size = avio_rb32(pb);
> > +        const uint32_t atom_type = avio_rb32(pb);
> > +        const uint32_t data_size = atom_size - 8;
>
> This may wrap around (if atom_size is < 8). If int is 32 bits, then the
> data_size > src_size check will catch this, but in case of 64 bit ints
> it may not. Relying on (unsigned, defined) integer wraparound should be
> avoided unless it is advantageous to use it; in this case, this is just
> not true: Just compare atom_size to 10 below.
>
> > +        const uint8_t cc_field =
> > +            atom_type == cdat ? 1 :
> > +            atom_type == cdt2 ? 2 :
> > +            0;
> > +
> > +        /* account for bytes consumed for atom size and type. */
> > +        src_size -= 8;
> > +
> > +        /* make sure the data size stays within the buffer boundaries. */
> > +        if (data_size < 2 || data_size > src_size) {
> > +            ret = AVERROR_INVALIDDATA;
> > +            break;
> > +        }
> > +
> > +        /* make sure the data size is consistent with N byte pairs. */
> > +        if (data_size % 2 != 0) {
>
> We typically try to avoid redundant "!= 0".
>
> > +            ret = AVERROR_INVALIDDATA;
> > +            break;
> > +        }
> > +
> > +        if (!cc_field) {
> > +            /* neither cdat or cdt2 ... skip it */
> > +            avio_skip(pb, data_size);
> > +            src_size -= data_size;
> > +            continue;
> > +        }
> > +
> > +        for (int32_t i = 0; i < data_size; i += 2) {
>
> int32_t? Why signed? (And why use a separate loop counter at all? Simply
> decrement data_size by 2 in each iteration.)
>
> > +            pkt->data[out_size] = (0x1F << 3) | (1 << 2) | (cc_field - 1);
> > +            pkt->data[out_size + 1] = avio_r8(pb);
> > +            pkt->data[out_size + 2] = avio_r8(pb);
> > +            out_size += 3;
> > +            src_size -= 2;
> > +        }
> >      }
> >
> > -    return 0;
> > +    if (src_size > 0)
> > +        /* skip any remaining unread portion of the input payload */
> > +        avio_skip(pb, src_size);
> > +
> > +    av_shrink_packet(pkt, out_size);
> > +    return ret;
> >  }
> >
> >  static int mov_finalize_packet(AVFormatContext *s, AVStream *st, AVIndexEntry *sample,
>
> Generally, I believe that reading the input into pkt->data[size / 2]
> would be advantageous: It would make it simple to check for EOF and I/O
> errors (notice that the avio_r* reads above are unchecked) and would
> read the data in one go, avoiding all the avio_skip().
>
> - Andreas
>

Then perhaps you would find v2 of the patch more agreeable to your taste.
Could you review that instead?

This function has been corrupting closed captions since 2020.  There was a
different fix posted in 2023 (mentioned by Devin in the 1st version of this
patch), perhaps that should be merged instead, as it also solves the
problem.
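
For reference, the review comments above (validate atom_size before
subtracting the header, drop the separate loop counter, parse from data
that has already been read in one go) could be combined roughly as
follows. This is only a standalone sketch under stated assumptions: it
works on a plain memory buffer instead of an AVIOContext/AVPacket, and
eia608_reformat(), rb32() and FOURCC() are hypothetical names made up
for illustration, not FFmpeg API. It is not the code from any version of
the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical local stand-in for a big-endian 4cc tag macro. */
#define FOURCC(a, b, c, d) \
    (((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
     ((uint32_t)(c) <<  8) |  (uint32_t)(d))

/* Read a 32-bit big-endian value from a byte buffer. */
static uint32_t rb32(const uint8_t *p)
{
    return FOURCC(p[0], p[1], p[2], p[3]);
}

/*
 * Hypothetical helper: convert a c608 payload in src[0..src_size) into
 * 3-byte triplets (header byte + byte pair) written to dst.
 * Returns 0 and stores the output length in *written on success,
 * or -1 if an atom is malformed or dst is too small.
 */
static int eia608_reformat(const uint8_t *src, size_t src_size,
                           uint8_t *dst, size_t dst_size, size_t *written)
{
    size_t out = 0;

    assert(src && dst && written);

    /* an atom needs size, 4cc, and at least one byte pair */
    while (src_size >= 10) {
        const uint32_t atom_size = rb32(src);
        const uint32_t atom_type = rb32(src + 4);
        const uint8_t *data = src + 8;
        size_t data_size;
        uint8_t field;

        /* check atom_size itself, before subtracting the 8-byte header,
           so that no unsigned wraparound can occur */
        if (atom_size < 10 || atom_size > src_size || (atom_size & 1))
            return -1;

        data_size = atom_size - 8;
        src      += atom_size;
        src_size -= atom_size;

        if (atom_type == FOURCC('c', 'd', 'a', 't'))
            field = 0;
        else if (atom_type == FOURCC('c', 'd', 't', '2'))
            field = 1;
        else
            continue; /* neither cdat nor cdt2: skip it */

        /* no separate counter: consume data_size two bytes at a time */
        for (; data_size; data_size -= 2, data += 2) {
            if (dst_size - out < 3)
                return -1;
            /* (0x1F << 3) | (1 << 2) | field, i.e. 0xFC or 0xFD */
            dst[out++] = 0xFC | field;
            dst[out++] = data[0];
            dst[out++] = data[1];
        }
    }

    /* trailing bytes too short to form an atom are ignored */
    *written = out;
    return 0;
}
```

A payload holding one cdat atom with a single byte pair followed by one
cdt2 atom with two byte pairs would come out as three triplets, the
first starting with 0xFC and the other two with 0xFD. In mov.c the
source bytes would have to come from a checked read into the packet
buffer, which this sketch does not attempt to show.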

Pavel.

