[FFmpeg-user] AVFrame, AV_NUM_DATA_POINTERS

Mon Sep 28 22:49:00 EEST 2020

On 28/09/2020, Mark Filipak (ffmpeg) <markfilipak at bog.us> wrote:
> On 09/27/2020 03:31 PM, James Darnley wrote:
>> On 27/09/2020, Mark Filipak (ffmpeg) <markfilipak at bog.us> wrote:
>>> 2, Are the width & height indexes in bytes or samples? If bytes, how are
>>> 8-bit v. 10-bit v. 12-bit pixel formats handled at the index generation
>>> end?
>>
>> Width and height are given in pixels.  How that relates to bytes in
>> memory depends on the pixel format.  Different planes can have
>> different sizes like in the extremely common yuv420p.
>
> Ah-ha #1. I think you've answered another question. The planes that ffmpeg
> refers to are the Y, Cb, and Cr samples, is that correct?

If the pixel format is a YCbCr format, such as yuv420p, then yes.  If
it matters to you: I am not sure of the exact order of the planes.  It
is probably documented in the pixel format header (libavutil/pixfmt.h).

RGB is also available and maybe some other more niche ones.  Oh, alpha
channels too.  Again see the pixel format.

> So, I'm going to make some statements that can be confirmed or refuted --
> making statements rather than asking questions is just part of my training,
> not arrogance. Statements are usually clearer.
> I'm trying to nail down the structures for integration into my glossary.
>
> For YCbCr420, 8-bit, 720x576 (for example), the planes are separate and the
> structures are:
> Y: 2-dimensional, 720x576 byte array.
> Cb: 2-dimensional, 180x144 byte array.
> Cr: 2-dimensional, 180x144 byte array.

What do you mean by 2-dimensional?  IMO you should think of each plane
as a single block of memory.  The first pixel will be the first
byte.  In your example the first plane in a yuv420p picture will be at
least 720*576 bytes long.  The two chroma planes will have 360x288
samples each, and every plane has its own linesize.  I'm not sure how
you got 180x144.  The subsampling is only a factor of 2 in each
direction for 4:2:0.

The linesize can make a plane larger than width*height bytes.  The
linesize is the number of bytes between the start of one row and the
start of the next, and it may include padding beyond the visible width.
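
To make the arithmetic concrete, here is a minimal sketch in plain C.  It
does not use any FFmpeg API: the helper names (chroma_dim, plane_min_bytes,
sample8) are made up for illustration, and it assumes 8-bit samples and a
positive linesize.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: size of one chroma dimension.  Rounds up so that
 * an odd luma size still gets a chroma sample for its last row/column.
 * log2_subsample is 1 for both directions in 4:2:0. */
static int chroma_dim(int luma_dim, int log2_subsample)
{
    return (luma_dim + (1 << log2_subsample) - 1) >> log2_subsample;
}

/* Hypothetical helper: bytes spanned by a plane when every row occupies
 * a full linesize.  Assumes linesize is positive. */
static size_t plane_min_bytes(int linesize, int height)
{
    assert(linesize > 0 && height > 0);
    return (size_t)linesize * (size_t)height;
}

/* Hypothetical helper: read the 8-bit sample at column x, row y.
 * Rows are linesize bytes apart, not width bytes apart. */
static uint8_t sample8(const uint8_t *plane, int linesize, int x, int y)
{
    assert(linesize > 0 && x >= 0 && y >= 0);
    return plane[(size_t)y * (size_t)linesize + (size_t)x];
}
```

For the 720x576 example, chroma_dim(720, 1) and chroma_dim(576, 1) give the
360x288 chroma planes.  If the luma linesize happened to be 768 (an assumed
value, picked only to show padding), the luma plane would span 768*576
bytes rather than 720*576.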

The same color space and subsampling could be expressed in a few
different ways.  Again it is the pixel format which says how the data
is laid out in memory.  You will probably have yuv420p.
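
As an illustration of "the same subsampling, different layout", the sketch
below compares a fully planar layout (as in yuv420p, where Cb has its own
plane) with a semi-planar one (as in nv12, where Cb and Cr bytes alternate
in a single plane, Cb first).  It is plain C with made-up helper names, it
assumes 8-bit samples and a positive linesize, and it is not FFmpeg code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Planar layout: the Cb plane holds one byte per chroma sample. */
static uint8_t cb_planar(const uint8_t *cb_plane, int linesize, int cx, int cy)
{
    assert(linesize > 0 && cx >= 0 && cy >= 0);
    return cb_plane[(size_t)cy * (size_t)linesize + (size_t)cx];
}

/* Semi-planar layout: each chroma position takes two bytes, Cb then Cr,
 * so the column offset is doubled. */
static uint8_t cb_semiplanar(const uint8_t *cbcr_plane, int linesize, int cx, int cy)
{
    assert(linesize > 0 && cx >= 0 && cy >= 0);
    return cbcr_plane[(size_t)cy * (size_t)linesize + 2 * (size_t)cx];
}

/* Cr is the byte immediately after Cb in the interleaved plane. */
static uint8_t cr_semiplanar(const uint8_t *cbcr_plane, int linesize, int cx, int cy)
{
    assert(linesize > 0 && cx >= 0 && cy >= 0);
    return cbcr_plane[(size_t)cy * (size_t)linesize + 2 * (size_t)cx + 1];
}
```

The chroma coordinates (cx, cy) are the same in both cases; only the byte
offset computed from them differs, which is why code has to consult the
pixel format before touching the data.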

> Specifically, the decoder's output is not in macroblock format, correct? The
> reason I ask for confirmation is that H.262 implies that even raw pictures
> are in macroblock format, improbable as that may seem.

An AVFrame might not come from a source that has macroblocks.  I have
no idea what H.262 says.

>>  Byte order
>> (endianness) of larger samples depends on the pixel format (but it is
>> usually native).  The number of bytes used for a sample is given in
>> the pixel format.  The bits are in the low N bits.
>
> Ah-ha #2. I think you've answered yet another question: The arrays are
> bytes, not bits, correct? So, going from 8-bit samples to 10-bit samples
> doubles the sizes of the arrays, correct?

You cannot easily address bits in C and ffmpeg doesn't bother with bit
fields.  yuv420p10 will use 16-bit words with the samples in the low
10 bits and the high 6 bits set to zero.  Compared with 8-bit samples
this does have the effect of doubling the size of the memory buffers.
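
A rough sketch of what that means for code reading the data, again in
plain C with made-up helper names (sample16, row_bytes) rather than any
FFmpeg API.  It assumes native-endian 16-bit words and a linesize that is
positive and counted in bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read the 16-bit word holding the sample at
 * column x, row y.  The column offset is x * 2 because each sample
 * occupies two bytes; memcpy avoids alignment and aliasing problems. */
static uint16_t sample16(const uint8_t *plane, int linesize, int x, int y)
{
    uint16_t v;
    assert(linesize > 0 && x >= 0 && y >= 0);
    memcpy(&v, plane + (size_t)y * (size_t)linesize + (size_t)x * 2, sizeof v);
    return v;
}

/* Hypothetical helper: bytes of visible samples in one row, before any
 * linesize padding. */
static size_t row_bytes(int width, int bytes_per_sample)
{
    assert(width > 0 && bytes_per_sample > 0);
    return (size_t)width * (size_t)bytes_per_sample;
}
```

A 10-bit sample read this way is at most 1023, so shifting it right by 10
leaves zero.  For a 720-sample row, row_bytes(720, 2) is twice
row_bytes(720, 1), which is the doubling mentioned above.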

P.S.  When I say pixel format I mean the specific ffmpeg feature
(enum AVPixelFormat).