[FFmpeg-devel] [PATCH 1/2] avfilter/vf_zscale: add slice threading

Pavel Koshevoy pkoshevoy at gmail.com
Fri May 31 23:26:51 EEST 2019


On Fri, May 31, 2019 at 2:17 PM Paul B Mahol <onemda at gmail.com> wrote:
>
> On 5/31/19, Pavel Koshevoy <pkoshevoy at gmail.com> wrote:
> > On Fri, May 31, 2019 at 2:03 PM Paul B Mahol <onemda at gmail.com> wrote:
> >>
> >> On 5/31/19, Pavel Koshevoy <pkoshevoy at gmail.com> wrote:
> >> > On Fri, May 31, 2019 at 1:44 PM Pavel Koshevoy <pkoshevoy at gmail.com>
> >> > wrote:

<snip>


> >> >> I've had to use zscale to convert 10-bit 4k60p video from HLG HDR
> >> >> to SDR (bt709).  It was ~36x slower than real time.  What I ended
> >> >> up doing to speed it up was to generate a CLUT image (16-bit
> >> >> yuv444, 65x65x65 sampling of the input color space), lay it out
> >> >> as a 2D image (512x537), and run it through zscale to generate the
> >> >> HDR->SDR transform CLUT.  Then I used the CLUT instead of zscale
> >> >> for every frame...  that got me to ~3.5x slower than realtime
> >> >> converting 60fps 10-bit 4k HLG to SDR (and I don't know any
> >> >> assembly, so I didn't attempt to optimize the CLUT trilinear
> >> >> optimization with SIMD, so maybe it could be faster still).  I
> >> >> then ported it to CUDA and was able to convert 4k60p HLG->SDR
> >> >> faster than realtime on a Pascal GPU.
> >> >>
> >> >
> >> > I meant trilinear interpolation
> >> >
> >> >
> >> >> So, I'm not sure that adding slice threading to zscale is the best
> >> >> optimization for it.  I think capturing the effect of zscale in a CLUT
> >> >> would be a more significant optimization.
> >> >>
> >> >> Just my 2 cents, hope this helps.
> >>
> >> Your logic is completely flawed.
> >> You can not rescale images with LUT tables.
> >
> >
> > I was not resizing the image from 4K to 1080p ... the output was still
> > 4K.  I was converting from 10-bit in whatever HDR input colorspace
> > (HLG, or HDR10), to 8-bit SDR output colorspace.  You most definitely
> > can approximate that transformation with a CLUT.
> >
>
> Seen lut3d filter?

lut3d works with RGB images; my input and output are all YUV (P010,
actually).  Also, lut3d requires a file parameter, which is not great
for my use case.  I could generate a CLUT with zscale and dump it to
disk so I could initialize lut3d with it, but I hope you see how
inconvenient that is from an API point of view.
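
To make the CLUT idea concrete, here is a minimal sketch in Python
(chosen for brevity; it is not code from this thread).  The function
names, the row-major packing into a 512-pixel-wide image, and the
`transform` callable standing in for the zscale pass are all
illustrative assumptions.  Inputs are normalized to [0, 1] per channel.

```python
N = 65  # grid points per axis, matching the 65x65x65 sampling above


def build_clut(transform, n=N):
    """Sample transform() on an n*n*n grid over [0, 1]^3 (flat list).

    transform is a placeholder for the expensive conversion (the zscale
    pass in the discussion); it maps three floats to a 3-tuple.
    """
    step = 1.0 / (n - 1)
    return [transform(x * step, y * step, z * step)
            for z in range(n) for y in range(n) for x in range(n)]


def clut_pixel_pos(x, y, z, n=N, width=512):
    """Assumed row-major packing of grid node (x, y, z) into a 2D image."""
    i = (z * n + y) * n + x
    return i // width, i % width  # (row, column)


def apply_clut(clut, c0, c1, c2, n=N):
    """Trilinear interpolation of the CLUT at one normalized input triple."""
    def split(v):
        # Clamp, scale to grid units, and split into cell index + fraction.
        p = min(max(v, 0.0), 1.0) * (n - 1)
        i = min(int(p), n - 2)
        return i, p - i

    (x, fx), (y, fy), (z, fz) = split(c0), split(c1), split(c2)
    out = [0.0, 0.0, 0.0]
    # Blend the 8 corners of the enclosing grid cell.
    for dz, wz in ((0, 1.0 - fz), (1, fz)):
        for dy, wy in ((0, 1.0 - fy), (1, fy)):
            for dx, wx in ((0, 1.0 - fx), (1, fx)):
                node = clut[((z + dz) * n + (y + dy)) * n + (x + dx)]
                w = wx * wy * wz
                for k in range(3):
                    out[k] += w * node[k]
    return tuple(out)
```

With this packing, the 65^3 = 274625 grid nodes fit in a 512x537 image
(274944 pixels).  The table is built once, and each pixel afterwards
costs eight lookups and a weighted sum, independent of how expensive
the sampled transform is.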


> > Since zscale is capable of resizing and colorspace conversion --
> > perhaps this functionality should be split into separate filters so
> > each can be optimized differently.
>
> Your logic is completely flawed yet again.
> zscale is wrapper around another library.

I know, zimg, C++11.

