[FFmpeg-devel] [RFC]lavc/ffv1dec: Scale msb-packed output to full 16bit

Thu Nov 17 22:53:27 EET 2016

Hi Carl,

On Thu, Nov 17, 2016 at 3:10 PM, Carl Eugen Hoyos <ceffmpeg at gmail.com>
wrote:

> 2016-11-17 19:34 GMT+01:00 Ronald S. Bultje <rsbultje at gmail.com>:
> > Carl, the reason the patch is wrong is that luma range does not scale up
> > from 16<<n to (236<<n)-1, but from 16<<n to (236-1)<<n, where n is bpc-8.
> > This is documented in numerous places, see e.g.
> > https://msdn.microsoft.com/en-us/library/windows/desktop/
> bb970578(v=vs.85).aspx
> > :
> >
> > "For example, if the white point of an 8-bit format is 235, the
> > corresponding 10-bit format has a white point at 940 (235 × 4)."
>
> This is indeed very difficult to understand: How is this related?
>
> AV_PIX_FMT_GRAY was changed years ago after several users
> protested that it did conform to above specification, since it doesn't
> do now, the paragraph looks unrelated.
> (AV_PIX_FMT_GRAY is full-scale as is AV_PIX_FMT_GRAY16)
>
> More important:
> My patch is not related to 10-bit output format, so above calculations
> are also not related afaict.

The relevant field in the decoder is called bits_per_raw_sample. If this is
10, the format is 10-bit. The output format is therefore also 10-bit,
otherwise it's not lossless. How you represent that in an uint16_t is up to
you, obviously, but it's 10bit, right? So the patch relates to 10bit
formats.

The question then seems, what do you do with the ("filler") bits if we
shift them to MSB in gray16, i.e. packed_at_lsb=0? My assumption (from the
decision to represent the coding as gray16 while setting
bits_per_raw_sample) is that you want it to represent a value in MSB of
uint16_t that would be closest to what it would have been if the value was
actually coded as 16bit, right?

(I'm assuming the above is logical and we all agree on it, please let me
know if you disagree, otherwise let's go on to the apparently controversial
stuff.)

> Your patch is therefore theoretically wrong. The correct behaviour in
> > limited-range upscaling is indeed to leave the lower bits empty. For
> > full-range, the lower bits might be filled if the intention is for the
> > pixel value to be a representation of what the 16bit result would look
> > like, but whether this belongs in a decoder or not is up for discussion.
>
> Decoders tend to be used by video players and if white looks gray on a
> screen, it doesn't make much sense to say "the player should have
> worked-around this".

I'm fine with that, but it depends on the format. If the input was
full-range, then yes. If the input was not full-range, then no.

Assume I have a H264 file with a PPS range=limited and a chroma_idc=0
(4:0:0, i.e. grayscale). You agree from the H264 spec that this is a legal
file, right? I could also generate source data by taking any (10bit) Bt709
YUV file (which almost universally use limited-range coding) and stripping
the UV planes out and leaving the Y data untouched.

How do I represent this data losslessly in ffv1 after your patch? I don't
think I can, because a value of 0x3ac (235x4=940, i.e. white) would be
upshifted to (0x3ac << 6) || (0x3ac >> 4) = 0xeb3a, which converting back
(as e.g. swscale does) to 10bits is 940.9, so after rounding it would be
941. That is not 940, thus not lossless and therefore a bug (IMO).

I agree that if the data was fullrange, your patch may be correct (it
depends on some other stuff, but I don't want to get into the
he-said-she-said stuff, I'm describing a bug here), but your patch breaks
something that worked previously.

Ronald