[FFmpeg-devel] new packed pixel formats (machine vision)

Wed Oct 9 01:50:54 EEST 2024

On 08/10/2024 21:17, Diederick C. Niehorster wrote:
> Dear Lynne,
> 
> On Tue, Oct 8, 2024 at 1:11 PM Lynne via ffmpeg-devel
> <ffmpeg-devel at ffmpeg.org> wrote:
> 
> Thank you for your quick and helpful answer! However, I have several questions.
> 
>> We have support for AV_PIX_FMT_BAYER_RGGB16, since it's a common Bayer
>> layout that cinema cameras output, so it's definitely within the scope
>> and not some application-specific pixfmt.
>> RGGB10 is just a bitpacked version.
> 
> Good!
> 
>> Unfortunately, we do not directly support 10bit packed pixel formats,
>> since we can't fit them into our definition, as we only support
>> byte-aligned formats.
> 
> Non-byte-aligned formats can be represented with
> AV_PIX_FMT_FLAG_BITSTREAM, right? I see AV_PIX_FMT_XV30BE as (the only)
> example. I am quite possibly misunderstanding.
> My first example, AV_PIX_FMT_BAYER_RGGB10, is byte-aligned by the way,
> but the problem is that the R and B components would have a depth of
> 2.5 bits (10/4) in the scheme that ffmpeg uses, so they can't be correctly
> defined. Though I wonder if a rounded value (one up to 3, the other down
> to 2) is the solution here, since these are only informative (correct?)
> and 3+5+2=10, so it would be correct for this 10-bit format.

Nope, AV_PIX_FMT_FLAG_BITSTREAM is for a very special case where all
components are aligned and repeat on a 32-bit basis.
If using it were an option, we wouldn't have bitpacked_enc or v210enc/dec.

>> We treat those as codecs, for example AV_CODEC_ID_V210 and
>> AV_CODEC_ID_BITPACKED.
>> The format would have to be implemented as a codec, with a decoder that
>> accepts AV_CODEC_ID_RGGB10 and outputs AV_PIX_FMT_BAYER_RGGB16, setting
>> avctx->bits_per_raw_sample to 10 to indicate how many bits are used.
> 
> Hmm, but how would that work? If I understand correctly, I would
> package the raw image data in AVPackets and use the decoder I'd write
> to turn them into AVFrames that I can then use as I wish.
> That is a lot more complicated than adding these as pixel formats and
> having swscale support them as an input format, since then I could
> directly package the video data in an AVFrame, benefit from automatic
> conversion insertion during format negotiation, and feed these new
> pixel formats into anything without needing a special case with the
> extra decoder in between.

That is how it must be, unless you want to refactor swscale and our
entire codebase to allow such formats, which would be a lot more work.
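
For illustration only, the core of such a decoder would be a loop that
expands each packed 10-bit sample into a 16-bit container. The sketch below
is not FFmpeg code: the helper name unpack10_lsb_to_16 is hypothetical, and
it assumes the camera packs four 10-bit samples into five bytes, LSB first.
Other packings (for example MSB first) would need different shifts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of FFmpeg): expand packed 10-bit samples
 * into 16-bit values, leaving the data in the low 10 bits.
 *
 * Assumed layout: samples are stored back to back, least significant bit
 * first, so four samples occupy exactly five bytes.
 * src must hold at least ceil(nb_samples * 10 / 8) bytes.
 */
static void unpack10_lsb_to_16(const uint8_t *src, uint16_t *dst,
                               size_t nb_samples)
{
    assert(src && dst);

    for (size_t i = 0; i < nb_samples; i++) {
        size_t   bitpos = i * 10;      /* position of the sample's first bit */
        size_t   byte   = bitpos >> 3; /* byte that contains that bit */
        unsigned shift  = bitpos & 7;  /* 0, 2, 4 or 6 */

        /* A 10-bit sample starting at bit 0..6 always spans two bytes. */
        uint32_t v = src[byte] | ((uint32_t)src[byte + 1] << 8);

        dst[i] = (uint16_t)((v >> shift) & 0x3FF);
    }
}
```

A decoder built along these lines would run the loop over each row of the
packet payload, write the result into the output frame's data plane, and
report 10 significant bits through the codec context as described above.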