[FFmpeg-trac] #9637(ffmpeg:new): Color matrix behaviour in colr box

FFmpeg trac at avcodec.org
Thu Feb 10 17:05:11 EET 2022


#9637: Color matrix behaviour in colr box
--------------------------------------------------------+-------------------------------------
             Reporter:  Ulysse Dansin                   |                     Type:  defect
               Status:  new                             |                 Priority:  normal
            Component:  ffmpeg                          |                  Version:  git-master
             Keywords:  x265 hdr color_matrix colr_box  |               Blocked By:
             Blocking:                                  |  Reproduced by developer:  0
Analyzed by developer:  0                               |
--------------------------------------------------------+-------------------------------------
 Summary:

 When re-encoding an HDR10 MP4 (YUV) whose color matrix is BT2020c in both
 the colr box and the NAL_SPS to another MP4 with x265, requesting the
 BT2020nc color matrix, the color matrix written to the output colr box is
 not what we expect: the NAL_SPS is updated to BT2020nc, but the colr box
 still says BT2020c.

 Here are details of the input:

 {{{
 >> mediainfo input.mp4
 General
 Complete name                            : input.mp4
 Format                                   : MPEG-4
 Format profile                           : Base Media
 Codec ID                                 : isom (isom/iso2/mp41)
 File size                                : 40.8 MiB
 Duration                                 : 5 s 131 ms
 Overall bit rate                         : 66.8 Mb/s
 Writing application                      : Lavf59.4.101

 Video
 ID                                       : 1
 Format                                   : HEVC
 Format/Info                              : High Efficiency Video Coding
 Format profile                           : Main 10 at L5@High
 HDR format                               : SMPTE ST 2086, HDR10 compatible
 Codec ID                                 : hev1
 Codec ID/Info                            : High Efficiency Video Coding
 Duration                                 : 5 s 131 ms
 Source duration                          : 6 s 798 ms
 Bit rate                                 : 50.4 Mb/s
 Width                                    : 3 840 pixels
 Height                                   : 2 076 pixels
 Display aspect ratio                     : 1.85:1
 Frame rate mode                          : Constant
 Frame rate                               : 23.976 (23976/1000) FPS
 Color space                              : YUV
 Chroma subsampling                       : 4:2:0
 Bit depth                                : 10 bits
 Scan type                                : Progressive
 Bits/(Pixel*Frame)                       : 0.264
 Stream size                              : 30.9 MiB (76%)
 Source stream size                       : 40.8 MiB (100%)
 Writing library                          : x265 3.1:[Linux][GCC 8.3.0][64
 bit] 10bit
 Encoding settings                        : cpuid=1111039 / frame-threads=2
 / wpp / no-pmode / no-pme / no-psnr / no-ssim / log-level=2 / input-csp=1
 / input-res=3840x2076 / interlace=0 / total-frames=0 / level-idc=0 / high-
 tier=1 / uhd-bd=0 / ref=2 / no-allow-non-conformance / repeat-headers /
 annexb / no-aud / no-hrd / info / hash=0 / no-temporal-layers / open-gop /
 min-keyint=23 / keyint=250 / gop-lookahead=0 / bframes=4 / b-adapt=0 /
 b-pyramid / bframe-bias=0 / rc-lookahead=15 / lookahead-slices=8 /
 scenecut=40 / radl=0 / no-splice / no-intra-refresh / ctu=64 / min-cu-
 size=8 / no-rect / no-amp / max-tu-size=32 / tu-inter-depth=1 / tu-intra-
 depth=1 / limit-tu=0 / rdoq-level=0 / dynamic-rd=0.00 / no-ssim-rd /
 signhide / no-tskip / nr-intra=0 / nr-inter=0 / no-constrained-intra /
 strong-intra-smoothing / max-merge=2 / limit-refs=3 / no-limit-modes /
 me=1 / subme=2 / merange=57 / temporal-mvp / weightp / no-weightb / no-
 analyze-src-pics / deblock=0:0 / sao / no-sao-non-deblock / rd=2 / no-
 early-skip / rskip / fast-intra / no-tskip-fast / no-cu-lossless /
 no-b-intra / no-splitrd-skip / rdpenalty=0 / psy-rd=2.00 / psy-rdoq=0.00 /
 no-rd-refine / no-lossless / cbqpoffs=0 / crqpoffs=0 / rc=crf / crf=15.0 /
 qcomp=0.60 / qpstep=4 / stats-write=0 / stats-read=0 / vbv-maxrate=50000 /
 vbv-bufsize=250000 / vbv-init=0.9 / crf-max=0.0 / crf-min=0.0 /
 ipratio=1.40 / pbratio=1.30 / aq-mode=2 / aq-strength=1.00 / cutree /
 zone-count=0 / no-strict-cbr / qg-size=32 / no-rc-grain / qpmax=69 /
 qpmin=0 / no-const-vbv / sar=1 / overscan=0 / videoformat=5 / range=0 /
 colorprim=9 / transfer=16 / colormatrix=10 / chromaloc=0 / display-
 window=0 / master-
 display=G(13250,34500)B(7500,3000)R(34000,16000)WP(15635,16450)L(10000000,50)cll=987,137
 / min-luma=0 / max-luma=1023 / log2-max-poc-lsb=8 / vui-timing-info / vui-
 hrd-info / slices=1 / no-opt-qp-pps / no-opt-ref-list-length-pps / no-
 multi-pass-opt-rps / scenecut-bias=0.05 / no-opt-cu-delta-qp / no-aq-
 motion / hdr / no-hdr-opt / no-dhdr10-opt / no-idr-recovery-sei /
 analysis-reuse-level=5 / scale-factor=0 / refine-intra=0 / refine-inter=0
 / refine-mv=0 / refine-ctu-distortion=0 / no-limit-sao / ctu-info=0 / no-
 lowpass-dct / refine-analysis-type=-1627389952 / copy-pic=1 / max-ausize-
 factor=1.0 / no-dynamic-refine / no-single-sei / no-hevc-aq / no-svt / no-
 field / qp-adaptation-range=1.00
 Color range                              : Limited
 Color primaries                          : BT.2020
 Transfer characteristics                 : PQ
 Matrix coefficients                      : BT.2020 constant
 Mastering display color primaries        : Display P3
 Mastering display luminance              : min: 0.0050 cd/m2, max: 1000
 cd/m2
 Maximum Content Light Level              : 987 cd/m2
 Maximum Frame-Average Light Level        : 137 cd/m2
 mdhd_Duration                            : 5130
 Codec configuration box                  : hvcC
 }}}

 If we inspect the colr box with hexedit:

 {{{
         63  6F 6C 72 6E  63 6C 78 00  09 00 10 00  0A
 .....fiel......colrnclx.
 }}}

 The matrix coefficients field of the input colr box is 0x0A = 10, i.e.
 BT2020c.
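 As a cross-check of the hex dump above, here is a minimal Python sketch
 that decodes the same bytes. It assumes the usual nclx layout of three
 big-endian 16-bit code points after the `colr`/`nclx` tags; the function
 name `parse_nclx` and the `MATRIX_NAMES` table are ours, for illustration
 only.

```python
import struct

# Bytes copied from the hex dump above, starting at the 'colr' box type.
dump = bytes.fromhex("63 6F 6C 72 6E 63 6C 78 00 09 00 10 00 0A")

def parse_nclx(data):
    """Decode the fields that follow a 'colr' box type with colour_type 'nclx'."""
    box_type, colour_type = data[0:4], data[4:8]
    if box_type != b"colr" or colour_type != b"nclx":
        raise ValueError("not a colr/nclx box")
    # Three big-endian 16-bit code points: primaries, transfer, matrix.
    primaries, transfer, matrix = struct.unpack(">HHH", data[8:14])
    return {
        "colour_primaries": primaries,
        "transfer_characteristics": transfer,
        "matrix_coefficients": matrix,
    }

# Only the two matrix values discussed in this ticket are named here.
MATRIX_NAMES = {9: "bt2020nc", 10: "bt2020c"}

fields = parse_nclx(dump)
print(fields, MATRIX_NAMES.get(fields["matrix_coefficients"], "other"))
```

 For the bytes shown, this reports primaries 9, transfer 16 and matrix 10,
 which matches colorprim=9 / transfer=16 / colormatrix=10 in the mediainfo
 encoding settings of the input.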

 We encode using x265 and specify in the encoding params
 colormatrix=bt2020nc.

 {{{
 % ffmpeg -i input.mp4 -filter_complex "scale=3840x1604,setsar=1/1" \
     -c:v libx265 -preset fast \
     -x265-params "crf=15:vbv-maxrate=50000:vbv-bufsize=250000:colorprim=bt2020:transfer=smpte2084:colormatrix=bt2020nc:master-display=G(13250,34500)B(7500,3000)R(34000,16000)WP(15635,16450)L(40000000,50):max-cll=1261,512:ref=2" \
     -pix_fmt yuv420p10le -vsync 1 -an -map_chapters -1 \
     -map_metadata:g -1:g -map_metadata:s:v -1:g -map_metadata:s:a -1:g \
     -movflags +faststart -f mp4 out.mp4
 }}}


 The output is the following:

 {{{
 >> mediainfo out.mp4
 General
 Complete name                            : out.mp4
 Format                                   : MPEG-4
 Format profile                           : Base Media
 Codec ID                                 : isom (isom/iso2/mp41)
 File size                                : 12.1 MiB
 Duration                                 : 5 s 131 ms
 Overall bit rate                         : 19.7 Mb/s
 Writing application                      : Lavf59.4.101

 Video
 ID                                       : 1
 Format                                   : HEVC
 Format/Info                              : High Efficiency Video Coding
 Format profile                           : Main 10 at L5@High
 HDR format                               : SMPTE ST 2086, HDR10 compatible
 Codec ID                                 : hev1
 Codec ID/Info                            : High Efficiency Video Coding
 Duration                                 : 5 s 131 ms
 Bit rate                                 : 19.7 Mb/s
 Width                                    : 3 840 pixels
 Height                                   : 1 604 pixels
 Display aspect ratio                     : 2.40:1
 Frame rate mode                          : Constant
 Frame rate                               : 23.976 (23976/1000) FPS
 Color space                              : YUV
 Chroma subsampling                       : 4:2:0
 Bit depth                                : 10 bits
 Scan type                                : Progressive
 Bits/(Pixel*Frame)                       : 0.134
 Stream size                              : 12.1 MiB (100%)
 Writing library                          : x265 3.4+31-6722fce1f:[Mac OS
 X][clang 12.0.0][64 bit] 10bit
 Encoding settings                        : cpuid=1111039 / frame-threads=3
 / wpp / no-pmode / no-pme / no-psnr / no-ssim / log-level=2 / input-csp=1
 / input-res=3840x1604 / interlace=0 / total-frames=0 / level-idc=0 / high-
 tier=1 / uhd-bd=0 / ref=2 / no-allow-non-conformance / repeat-headers /
 annexb / no-aud / no-hrd / info / hash=0 / no-temporal-layers / open-gop /
 min-keyint=23 / keyint=250 / gop-lookahead=0 / bframes=4 / b-adapt=0 /
 b-pyramid / bframe-bias=0 / rc-lookahead=15 / lookahead-slices=8 /
 scenecut=40 / hist-scenecut=0 / radl=0 / no-splice / no-intra-refresh /
 ctu=64 / min-cu-size=8 / no-rect / no-amp / max-tu-size=32 / tu-inter-
 depth=1 / tu-intra-depth=1 / limit-tu=0 / rdoq-level=0 / dynamic-rd=0.00 /
 no-ssim-rd / signhide / no-tskip / nr-intra=0 / nr-inter=0 / no-
 constrained-intra / strong-intra-smoothing / max-merge=2 / limit-refs=3 /
 no-limit-modes / me=1 / subme=2 / merange=57 / temporal-mvp / no-frame-dup
 / no-hme / weightp / no-weightb / no-analyze-src-pics / deblock=0:0 / sao
 / no-sao-non-deblock / rd=2 / selective-sao=4 / no-early-skip / rskip /
 fast-intra / no-tskip-fast / no-cu-lossless / no-b-intra / no-splitrd-skip
 / rdpenalty=0 / psy-rd=2.00 / psy-rdoq=0.00 / no-rd-refine / no-lossless /
 cbqpoffs=0 / crqpoffs=0 / rc=crf / crf=15.0 / qcomp=0.60 / qpstep=4 /
 stats-write=0 / stats-read=0 / vbv-maxrate=50000 / vbv-bufsize=250000 /
 vbv-init=0.9 / min-vbv-fullness=50.0 / max-vbv-fullness=80.0 / crf-max=0.0
 / crf-min=0.0 / ipratio=1.40 / pbratio=1.30 / aq-mode=2 / aq-strength=1.00
 / cutree / zone-count=0 / no-strict-cbr / qg-size=32 / no-rc-grain /
 qpmax=69 / qpmin=0 / no-const-vbv / sar=1 / overscan=0 / videoformat=5 /
 range=0 / colorprim=9 / transfer=16 / colormatrix=9 / chromaloc=0 /
 display-window=0 / master-
 display=G(13250,34500)B(7500,3000)R(34000,16000)WP(15635,16450)L(40000000,50)
 / cll=1261,512 / min-luma=0 / max-luma=1023 / log2-max-poc-lsb=8 / vui-
 timing-info / vui-hrd-info / slices=1 / no-opt-qp-pps / no-opt-ref-list-
 length-pps / no-multi-pass-opt-rps / scenecut-bias=0.05 / hist-
 threshold=0.03 / no-opt-cu-delta-qp / no-aq-motion / hdr10 / no-hdr10-opt
 / no-dhdr10-opt / no-idr-recovery-sei / analysis-reuse-level=0 / analysis-
 save-reuse-level=0 / analysis-load-reuse-level=0 / scale-factor=0 /
 refine-intra=0 / refine-inter=0 / refine-mv=1 / refine-ctu-distortion=0 /
 no-limit-sao / ctu-info=0 / no-lowpass-dct / refine-analysis-type=0 /
 copy-pic=1 / max-ausize-factor=1.0 / no-dynamic-refine / no-single-sei /
 no-hevc-aq / no-svt / no-field / qp-adaptation-range=1.00 / scenecut-
 aware-qp=0conformance-window-offsets / right=0 / bottom=0 / decoder-max-
 rate=0 / no-vbv-live-multi-pass
 Color range                              : Limited
 Color primaries                          : BT.2020
 Transfer characteristics                 : PQ
 Matrix coefficients                      : BT.2020 constant
 matrix_coefficients_Original             : BT.2020 non-constant
 Mastering display color primaries        : Display P3
 Mastering display luminance              : min: 0.0050 cd/m2, max: 4000
 cd/m2
 Maximum Content Light Level              : 1261 cd/m2
 Maximum Frame-Average Light Level        : 512 cd/m2
 Codec configuration box                  : hvcC
 }}}

 If we check the NAL_SPS in the encoded stream, the matrix coefficients
 value is 9 (bt2020nc), as expected (see the screenshot
 [[Image(https://rak.box.com/s/5sg0oq5d6llub5iv2etuzjv4chmikv0l)]]).

 However, the mp4 colr box is still 10 (bt2020c):
 {{{
         63 6F 6C 72  6E 63 6C 78  00 09 00  10 00 0A     .....colrnclx....
 }}}

 We were expecting the colr box to have been modified with the same
 information as the codec.
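 To repeat this check without a hex editor, the colr box can be located
 with a short script. The following is only a sketch: it does a naive scan
 for the `colrnclx` byte signature rather than a real box walk, the helper
 name `find_colr_nclx` is hypothetical, and the trailing flag byte in the
 sample is assumed (it is not visible in the dump above).

```python
import struct

def find_colr_nclx(buf):
    """Return (primaries, transfer, matrix, full_range) for each colr/nclx hit.

    Naive signature scan, not a real ISO BMFF box walk: it may report false
    positives if the byte pattern happens to occur inside media data.
    """
    results = []
    pos = buf.find(b"colrnclx")
    while pos != -1:
        payload = buf[pos + 8:pos + 15]
        if len(payload) == 7:
            primaries, transfer, matrix, flags = struct.unpack(">HHHB", payload)
            # The top bit of the last byte is the full-range flag.
            results.append((primaries, transfer, matrix, bool(flags & 0x80)))
        pos = buf.find(b"colrnclx", pos + 1)
    return results

# Synthetic stand-in for the relevant bytes of out.mp4: 32-bit box size,
# then the bytes from the dump, then an assumed 0x00 flag byte (limited range).
sample = struct.pack(">I", 19) + bytes.fromhex("636F6C726E636C78 0009 0010 000A 00")
print(find_colr_nclx(sample))

# For a real file:
#   with open("out.mp4", "rb") as f:
#       print(find_colr_nclx(f.read()))
```

 A matrix value of 10 in the result, next to a NAL_SPS value of 9, is the
 mismatch described in this ticket.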

 In addition, we verified that a direct codec copy does produce the
 behaviour we expect.
 For example, if we take out.mp4, which has bt2020nc in the NAL_SPS and
 bt2020c in the colr box, and perform a codec copy:
 {{{
 ffmpeg -i out.mp4 -c copy out_1.mp4
 }}}

 We are getting the following mediainfo and colr box:
 {{{
 General
 Complete name                            : out_1.mp4
 Format                                   : MPEG-4
 Format profile                           : Base Media
 Codec ID                                 : isom (isom/iso2/mp41)
 File size                                : 12.1 MiB
 Duration                                 : 5 s 131 ms
 Overall bit rate                         : 19.7 Mb/s
 Writing application                      : Lavf59.4.101

 Video
 ID                                       : 1
 Format                                   : HEVC
 Format/Info                              : High Efficiency Video Coding
 Format profile                           : Main 10 at L5@High
 HDR format                               : SMPTE ST 2086, HDR10 compatible
 Codec ID                                 : hev1
 Codec ID/Info                            : High Efficiency Video Coding
 Duration                                 : 5 s 131 ms
 Bit rate                                 : 19.7 Mb/s
 Width                                    : 3 840 pixels
 Height                                   : 1 604 pixels
 Display aspect ratio                     : 2.40:1
 Frame rate mode                          : Constant
 Frame rate                               : 23.976 (23976/1000) FPS
 Color space                              : YUV
 Chroma subsampling                       : 4:2:0
 Bit depth                                : 10 bits
 Scan type                                : Progressive
 Bits/(Pixel*Frame)                       : 0.134
 Stream size                              : 12.1 MiB (100%)
 Writing library                          : x265 3.4+31-6722fce1f:[Mac OS
 X][clang 12.0.0][64 bit] 10bit
 Encoding settings                        : cpuid=1111039 / frame-threads=3
 / wpp / no-pmode / no-pme / no-psnr / no-ssim / log-level=2 / input-csp=1
 / input-res=3840x1604 / interlace=0 / total-frames=0 / level-idc=0 / high-
 tier=1 / uhd-bd=0 / ref=2 / no-allow-non-conformance / repeat-headers /
 annexb / no-aud / no-hrd / info / hash=0 / no-temporal-layers / open-gop /
 min-keyint=23 / keyint=250 / gop-lookahead=0 / bframes=4 / b-adapt=0 /
 b-pyramid / bframe-bias=0 / rc-lookahead=15 / lookahead-slices=8 /
 scenecut=40 / hist-scenecut=0 / radl=0 / no-splice / no-intra-refresh /
 ctu=64 / min-cu-size=8 / no-rect / no-amp / max-tu-size=32 / tu-inter-
 depth=1 / tu-intra-depth=1 / limit-tu=0 / rdoq-level=0 / dynamic-rd=0.00 /
 no-ssim-rd / signhide / no-tskip / nr-intra=0 / nr-inter=0 / no-
 constrained-intra / strong-intra-smoothing / max-merge=2 / limit-refs=3 /
 no-limit-modes / me=1 / subme=2 / merange=57 / temporal-mvp / no-frame-dup
 / no-hme / weightp / no-weightb / no-analyze-src-pics / deblock=0:0 / sao
 / no-sao-non-deblock / rd=2 / selective-sao=4 / no-early-skip / rskip /
 fast-intra / no-tskip-fast / no-cu-lossless / no-b-intra / no-splitrd-skip
 / rdpenalty=0 / psy-rd=2.00 / psy-rdoq=0.00 / no-rd-refine / no-lossless /
 cbqpoffs=0 / crqpoffs=0 / rc=crf / crf=15.0 / qcomp=0.60 / qpstep=4 /
 stats-write=0 / stats-read=0 / vbv-maxrate=50000 / vbv-bufsize=250000 /
 vbv-init=0.9 / min-vbv-fullness=50.0 / max-vbv-fullness=80.0 / crf-max=0.0
 / crf-min=0.0 / ipratio=1.40 / pbratio=1.30 / aq-mode=2 / aq-strength=1.00
 / cutree / zone-count=0 / no-strict-cbr / qg-size=32 / no-rc-grain /
 qpmax=69 / qpmin=0 / no-const-vbv / sar=1 / overscan=0 / videoformat=5 /
 range=0 / colorprim=9 / transfer=16 / colormatrix=9 / chromaloc=0 /
 display-window=0 / master-
 display=G(13250,34500)B(7500,3000)R(34000,16000)WP(15635,16450)L(40000000,50)
 / cll=1261,512 / min-luma=0 / max-luma=1023 / log2-max-poc-lsb=8 / vui-
 timing-info / vui-hrd-info / slices=1 / no-opt-qp-pps / no-opt-ref-list-
 length-pps / no-multi-pass-opt-rps / scenecut-bias=0.05 / hist-
 threshold=0.03 / no-opt-cu-delta-qp / no-aq-motion / hdr10 / no-hdr10-opt
 / no-dhdr10-opt / no-idr-recovery-sei / analysis-reuse-level=0 / analysis-
 save-reuse-level=0 / analysis-load-reuse-level=0 / scale-factor=0 /
 refine-intra=0 / refine-inter=0 / refine-mv=1 / refine-ctu-distortion=0 /
 no-limit-sao / ctu-info=0 / no-lowpass-dct / refine-analysis-type=0 /
 copy-pic=1 / max-ausize-factor=1.0 / no-dynamic-refine / no-single-sei /
 no-hevc-aq / no-svt / no-field / qp-adaptation-range=1.00 / scenecut-
 aware-qp=0conformance-window-offsets / right=0 / bottom=0 / decoder-max-
 rate=0 / no-vbv-live-multi-pass
 Color range                              : Limited
 Color primaries                          : BT.2020
 Transfer characteristics                 : PQ
 Matrix coefficients                      : BT.2020 non-constant
 Mastering display color primaries        : Display P3
 Mastering display luminance              : min: 0.0050 cd/m2, max: 4000
 cd/m2
 Maximum Content Light Level              : 1261 cd/m2
 Maximum Frame-Average Light Level        : 512 cd/m2
 Codec configuration box                  : hvcC
 }}}
 {{{
 63 6F 6C  72 6E 63 6C  78 00 09 00  10 00 09          colrnclx....
 }}}

 Now the colr box is 9, meaning BT2020nc.

 As you can see, a codec copy corrects the colr box to match the NAL_SPS
 matrix coefficients, whereas an encode does not. Is this the expected
 behavior?

 Thanks so much for your help!

 Ulysse DANSIN & Jordi SOLSONA

 Files:
 https://rak.box.com/s/p14wrweghbg6ar9ttz4pgpexakbv4r9u
-- 
Ticket URL: <https://trac.ffmpeg.org/ticket/9637>
FFmpeg <https://ffmpeg.org>
FFmpeg issue tracker

