[FFmpeg-devel] [PATCH v2] avcodec/jpeg2000dec: Add support for placeholder passes, CAP, and CPF markers

Fri Jun 21 03:22:53 EEST 2024

> On Jun 19, 2024, at 17:27, Tomas Hardin <git at haerdin.se> wrote:
> 
> ons 2024-06-19 klockan 05:51 +0000 skrev WATANABE Osamu:
>> First of all, I appreciate your kind review.
>> I'm writing to discuss the changes and would like to hear your
>> feedback on these.
>> 
>> 
>>> On Jun 18, 2024, at 23:20, Tomas Hardin <git at haerdin.se> wrote:
>>> 
>>> 
>>> It seems this patch combines a lot of things that might be better
>>> to
>>> split into separate patches for easier review
>> 
>> Agree. I will split this patch into several patches.
>> For example, the set of patches includes changes:
>> - only for HTJ2K (JPEG 2000 Part 15)
>> - only for J2K (JPEG 2000 Part 1)
>> - for both J2K and HTJ2K.
>> 
>> Do you think it makes sense?
> 
> Maybe. Going by the commit message, separate support for placeholder
> passes from CAP from CFP handling perhaps?

Support for placeholder passes is related to both J2K and HTJ2K.
The handling of CAP and CPF is related to only HTJ2K.
I will separate the commit into the above three types.

> 
>>>> @@ -382,6 +391,9 @@ static int get_siz(Jpeg2000DecoderContext *s)
>>>>          } else if (ncomponents == 1 && s->precision == 8) {
>>>>              s->avctx->pix_fmt = AV_PIX_FMT_GRAY8;
>>>>              i = 0;
>>>> +        } else if (ncomponents == 1 && s->precision == 12) {
>>>> +            s->avctx->pix_fmt = AV_PIX_FMT_GRAY16LE;
>>>> +            i = 0;
>>> 
>>> Could we handle 9 <= precision <= 16 while we're at it?
>>> 
>> 
>> Yes. The native J2K decoder for both lossless and lossy J2K/HTJ2K
>> codestreams can handle bpp from 9 to 16. This change is required to
>> produce the decoded images for the ISO test codestreams defined in
>> Part 4 (Conformance testing).
> 
> Are there any test files for the other precisions?
> 

In ISO test files, the maximum precision is 12 bits.
I think it would be good to add such a file to FATE (in another patch.)

> 
>>>> @@ -802,6 +881,15 @@ static int read_crg(Jpeg2000DecoderContext
>>>> *s,
>>>> int n)
>>>>      bytestream2_skip(&s->g, n - 2);
>>>>      return 0;
>>>>  }
>>>> +
>>>> +static int read_cpf(Jpeg2000DecoderContext *s, int n)
>>>> +{
>>>> +    if (bytestream2_get_bytes_left(&s->g) < (n - 2))
>>>> +        return AVERROR_INVALIDDATA;
>>>> +    bytestream2_skip(&s->g, n - 2);
>>>> +    return 0;
>>>> +}
>>> 
>>> Don't we already have code for skipping markers we don't care
>>> about?
>>> 
>> 
>> The `read_cpf()` function was added for consistency with the
>> `read_crg()` function.
>> We already have `bytestream2_skip(GetByteContext *g, unsigned int
>> size)` that skips `size`
>> bytes from the compressed data. 
>> Do you think it is better to replace those functions (= `read_cpf()`
>> and `read_crg()`)
>> with `bytestream2_skip()`?
> 
> read_crg() performs a sanity check on ncomponents so it's not quite the
> same. On the other hand this always checks the length of the marker
> unlike the main parsing loop which only does so if
> strict_std_compliance >= FF_COMPLIANCE_STRICT. I guess keeping
> read_cpf() in the patch is fine and is useful for the future

I agree. I will keep `read_cpf()` as is. 

> 
>>>>  /* Tile-part lengths: see ISO 15444-1:2002, section A.7.1
>>>>   * Used to know the number of tile parts and lengths.
>>>>   * There may be multiple TLMs in the header.
>>>> @@ -965,6 +1053,10 @@ static int init_tile(Jpeg2000DecoderContext
>>>> *s,
>>>> int tileno)
>>>>              comp->roi_shift = s->roi_shift[compno];
>>>>          if (!codsty->init)
>>>>              return AVERROR_INVALIDDATA;
>>>> +        if (s->isHT && (!s->Ccap15_b05) && (!codsty->transform))
>>>> +            av_log(s->avctx, AV_LOG_WARNING, "Transformation = 0
>>>> (lossy DWT) is found in HTREV HT set\n");
>>>> +        if (s->isHT && s->Ccap15_b14_15 != (codsty->cblk_style
>>>>>> 6)
>>>> && s->Ccap15_b14_15 != HTJ2K_HTONLY)
>>>> +            av_log(s->avctx, AV_LOG_WARNING, "SPcod/SPcoc value
>>>> does
>>>> not match bit 14-15 values of Ccap15\n");
>>> 
>>> Do you have samples demonstrating the need to accept such broken
>>> files?
>>> If not then we should probably error out
>> 
>> Does `error out` mean that
>> - Should we exit decoding here?
>> - or should we replace AV_LOG_WARNING with AV_LOG_ERROR?
> 
> Returning with AVERROR_INVALIDDATA

OK. I will change those AV_LOG_WARNING into AV_LOG_ERROR.

> 
>>>> @@ -2187,22 +2472,42 @@ static int
>>>> jpeg2000_read_main_headers(Jpeg2000DecoderContext *s)
>>>>              if (!s->tile)
>>>>                  s->numXtiles = s->numYtiles = 0;
>>>>              break;
>>>> +        case JPEG2000_CAP:
>>>> +            if (!s->ncomponents) {
>>>> +                av_log(s->avctx, AV_LOG_WARNING, "CAP marker
>>>> segment
>>>> shall come after SIZ\n");
>>> 
>>> SHALL -> we should be able to safely reject. Similarly with the
>>> other
>>> errors. Unless we know of an encoder that produces broken files
>>> then
>>> there's no reason to be lenient. And if such a broken encoder
>>> exists we
>>> could try to get it fixed
>> 
>> Does `safely reject` mean that we should replace AV_LOG_WARNING with
>> AV_LOG_ERROR? or we should stop decoding here?
> 
> Returning AVERROR_INVALIDDATA since the input is invalid per the spec.
> Else we invite having to deal with incredibly broken encoders.

I agree. Will do.

> 
>>>> @@ -112,6 +112,13 @@ typedef struct Jpeg2000DecoderContext {
>>>>      Jpeg2000Tile    *tile;
>>>>      Jpeg2000DSPContext dsp;
>>>>  
>>>> +    uint8_t         isHT; // HTJ2K?
>>> 
>>> Isn't it possible to mix Part 1 and HT in the same file? I know
>>> HTONLY
>>> exists also
>> 
>> It is possible to mix Part 1 and HT in the same tile-component.
>> This mode is defined as MIXED in the spec of Part 15.
>> There are three types of HT codestreams:
>> - HTONLY
>> - HTDECLARED
>> - MIXED
>> 
>> The spec says - 
>> "The HTONLY set is the set of HTJ2K codestreams where all code-blocks
>> are HT code-blocks. The HTDECLARED set is the set of HTJ2K
>> codestreams
>> where all code-blocks within a given tile-component are either
>> a) HT code-blocks, or b) code-blocks as specified in 
>> Rec. ITU-T T.800 | ISO/IEC 15444-1. The MIXED set is the set of all
>> HTJ2K codestreams that are not in the HTDECLARED set."
> 
> So HTDECLARED allows encoding for example luma as HT and chroma as Part
> 1?

Yes. MIXED is a more relaxed condition that allows the mixing of HT
and non-HT code blocks in the same component.

> 
>>>> @@ -406,6 +420,7 @@ static void recover_mag_sgn(StateVars
>>>> *mag_sgn,
>>>> uint8_t pos, uint16_t q, int32_t
>>>>              E[n] = 32 - ff_clz(v[pos][i] | 1);
>>>>              mu_n[n] = (v[pos][i] >> 1) + 1;
>>>>              mu_n[n] <<= pLSB;
>>>> +            mu_n[n] |= (1 << (pLSB - 1)); // Add 0.5
>>>> (reconstruction
>>>> parameter = 1/2)
>>> 
>>> Aren't there conformance files to catch these kinds of errors? I
>>> remember looked at J2K a while back, and I think we should add such
>>> files to FATE
>>> 
>> 
>> Do you mean "errors" are the difference in pixel values between
>> uncompressed and lossy compressed images?
> 
> Yes
> 
>> There are no specific conformance files to catch the difference
>> in reconstruction parameter values.
> 
> Bummer
> 
>> The value of the reconstruction parameter r is not limited to 1/2.
>> We can use other values.
>> However, the spec of Part 4 says the use of reconstruction parameter
>> r = 1/2 will typically increase the ease of passing.
> 
> I see. Does this only applies to lossy J2K? I can't imagine "lossless"
> and "different reconstruction" are compatible
> 

No. In J2K Part 1, truncation of coding passes in the losslessly encoded
codestream may happen. In this case, this also applies to lossless J2K.
The current implementation is taking care of this case.

Best,
Osamu

> /Tomas
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel at ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> 
> To unsubscribe, visit link above, or email
> ffmpeg-devel-request at ffmpeg.org with subject "unsubscribe".