[FFmpeg-devel] [PATCH] lavfi: add opencl tonemap filter.

Song, Ruiling ruiling.song at intel.com
Wed May 23 08:47:53 EEST 2018



> -----Original Message-----
> From: ffmpeg-devel [mailto:ffmpeg-devel-bounces at ffmpeg.org] On Behalf Of
> Niklas Haas
> Sent: Tuesday, May 22, 2018 8:54 PM
> To: Song, Ruiling <ruiling.song at intel.com>
> Cc: Mark Thompson <sw at jkqxz.net>; FFmpeg development discussions and
> patches <ffmpeg-devel at ffmpeg.org>
> Subject: Re: [FFmpeg-devel] [PATCH] lavfi: add opencl tonemap filter.
> 
> On Tue, 22 May 2018 08:56:37 +0000, "Song, Ruiling" <ruiling.song at intel.com>
> wrote:
> > Yes, your idea sounds reasonable. But it may take significant effort to
> > restructure the code that way (it would launch two kernels, and we might
> > need a wait between them) and to evaluate the performance.
> 
> Actually, a brute force solution to solve the missing peak problem would
> be to filter the first frame twice and discard the first result. (After
> that, you only need to filter each frame once, so the overall
> performance characteristic is unchanged for videos)
> 
> That requires minimal code change, and it still allows it to work for
> single-frame video sources. It also prevents an initial flash of the
> wrong brightness level for transcoded videos.
For single-frame video, do you mean a still image?
I am not sure whether the current OpenCL acceleration is well designed for that.
My feeling is that people mainly use OpenCL for video acceleration,
especially interop with hardware-accelerated codecs. Feel free to correct me on this.

For the very first frame, I think a flash would not be easy to notice.
Because a default peak value is used for the first frame (100 for PQ),
the first frame would just come out a little dimmer.
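
For reference, a minimal sketch of the "filter the first frame twice" idea
(tonemap_run() and the peak_initialized flag are hypothetical names, not the
actual patch structure):

    /* Run the whole detect+tonemap pass twice on the very first frame:
     * the first pass only populates the detected peak and its output is
     * discarded, so the second pass already sees the measured peak. */
    static int filter_frame(AVFilterLink *inlink, AVFrame *in)
    {
        AVFilterContext      *avctx = inlink->dst;
        TonemapOpenCLContext *ctx   = avctx->priv;

        if (!ctx->peak_initialized) {
            AVFrame *warmup = tonemap_run(ctx, in); /* fills detected peak */
            av_frame_free(&warmup);                 /* discard first result */
            ctx->peak_initialized = 1;
        }
        return ff_filter_frame(avctx->outputs[0], tonemap_run(ctx, in));
    }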

> 
> Also, performance-wise, I'm not sure how this works in OpenCL land, but
> in OpenGL/Vulkan, you'd just need to emit a pipeline barrier. That
> allows the kernels to synchronize without having to stall the pipeline
> by doing a CPU wait. (And, in general, you'd need a pipeline barrier
> even if you *are* running glFinish() afterwards - the pipeline barrier
> isn't just for timing, it's also for flushing the appropriate caches. In
> general, write visibility on storage buffers requires a pipeline
> barrier. Are you sure this is not the case for OpenCL as well?)
Thinking about it again: the two OpenCL kernel launches need no host-side wait;
they are just two kernels enqueued from the host on the same in-order queue.
The performance cost I meant is that we would need to read the image twice,
which is obviously not as efficient as reading it once.
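
To illustrate: on an in-order OpenCL command queue, successive enqueues
execute in submission order, with earlier writes visible to later commands,
so the two launches need nothing in between. A sketch with hypothetical
kernel and queue names:

    cl_int err;
    size_t global_size[2] = { width, height };

    /* Pass 1: scan the frame and update the peak-detection buffer. */
    err = clEnqueueNDRangeKernel(queue, peak_detect_kernel, 2, NULL,
                                 global_size, NULL, 0, NULL, NULL);
    /* Pass 2: tonemap using the peak written by pass 1; the in-order
     * queue serializes the two kernels without a host-side wait. */
    err |= clEnqueueNDRangeKernel(queue, tonemap_kernel, 2, NULL,
                                  global_size, NULL, 0, NULL, NULL);
    /* A single clFinish(queue) at the end of the frame is still enough. */
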
> 
> > Although we are developing an offline filter, I think performance is
> > still very important, as is quality.
> > Given that the current implementation does well for video transcoding,
> > I would leave this on my TODO list. Sounds OK?
> 
> ACK. It's not my decision, I'm just offering advice.
> 
> > Are you talking about display-referred HLG? I didn't update the frame
> > side data. I am not sure when I need to update it. I thought all HLG
> > should be scene-referred; seems not?
> > Could you tell me more about display-referred HLG?
> 
> There's no such thing as "display-referred HLG". HLG by definition is
> encoded as scene-referred, but the OOTF to convert from scene-referred
> to display-referred is part of the EOTF (also by definition).
> 
> So the HLG EOTF inputs scene-referred and outputs display-referred. When
> you apply the EOTF (including the OOTF) as part of your processing
> chain, you're turning it into a linear-light, display-referred signal.
> The tone mapping then happens on this signal (in display light), and
> then to turn it back to HLG after you're done tone mapping, you apply
> the inverse OOTF + OETF, thus turning it back into scene-referred light.
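
Schematically, the chain described here is:

    HLG (scene-referred)
      -- EOTF (= inverse OETF + OOTF) -->  display-referred linear light
      -- tone mapping -->                  display-referred linear light, lower peak
      -- inverse OOTF + OETF -->           HLG (scene-referred again)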
> 
> The HLG OOTF (and therefore the EOTF) is parametrized by the display
> peak. Even though the HLG signal is stored in the range 0.0 - 12.0
> (scene-referred), the output range depends on how you tuned the EOTF. If
> you tuned it for the 1000 cd/m² reference display, then an input of
> 12.0 will get turned into an output value of 1000 cd/m².
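
In code, that parametrization is just the BT.2100 system gamma; a sketch
(black-level lift omitted):

    #include <math.h>

    /* BT.2100 HLG system gamma for a display with nominal peak lw in
     * cd/m^2; the formula is specified for peaks of roughly 400-2000. */
    static double hlg_gamma(double lw)
    {
        return 1.2 + 0.42 * log10(lw / 1000.0);
    }

    /* OOTF for the luminance component: ys is normalized scene luminance
     * (HLG signal / 12.0); returns display luminance in cd/m^2.
     * hlg_ootf_lum(1.0, 1000.0) == 1000, i.e. 12.0 maps to 1000 cd/m^2. */
    static double hlg_ootf_lum(double ys, double lw)
    {
        return lw * pow(ys, hlg_gamma(lw));
    }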
> 
> If we then tone-map this to a brightness of 500 cd/m², and pass it back
> through the inverse of that same OOTF, it would get turned into about
> 6.7 rather than the 12.0. While this may ultimately reproduce the
> correct result on-screen (assuming the end user of the video file also
> uses a peak of 1000 cd/m² to decode the file), it's a suboptimal use of
> the encoding range and also not how HLG is designed to operate. (For
> example, it would affect the "SDR backwards compatibility" property of
> HLG, which is the whole reason for the peak-dependent encoding.)
> 
> That's why the correct thing to do would be to re-encode the file using
> an inverse OOTF tuned for 500 cd/m², thus taking our tone mapped value
> in question back to the (scene-referred) value of 12.0, and update the
> tagged peak to also read 500 cd/m². Now a spec-conforming implementation
> of a video player (e.g. mpv or VLC) that plays this file would use the
> same tuned EOTF to decode it back to the value of 500 cd/m², thus
> ensuring it round trips correctly.
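
Using the helpers sketched above, that round trip looks like this (with the
exact BT.2100 gamma, the mistuned value comes out around 6.7 of 12.0):

    /* Inverse OOTF: display luminance (cd/m^2) back to normalized scene
     * luminance for a display tuned to peak lw. */
    static double hlg_inverse_ootf_lum(double ld, double lw)
    {
        return pow(ld / lw, 1.0 / hlg_gamma(lw));
    }

    /* Decode with the 1000 cd/m^2 mastering peak, then tone-map to 500. */
    double ld = hlg_ootf_lum(1.0, 1000.0);   /* scene 12.0 -> 1000 cd/m^2 */
    double tm = 500.0;                       /* tone-mapped display peak  */

    /* Mistuned: inverse OOTF still at 1000 -> scene value ~6.7 of 12.0,
     * wasting the top of the encoding range. */
    double bad  = 12.0 * hlg_inverse_ootf_lum(tm, 1000.0);

    /* Correct: retune the inverse OOTF (and the tagged peak) to 500
     * -> scene value 12.0 again, a clean round trip. */
    double good = 12.0 * hlg_inverse_ootf_lum(tm,  500.0);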
> 
> > I can't find anything about it. What metadata in HEVC indicates
> > display-referred? Do you have any display-referred HLG video sample?
> 
> As mentioned, the HLG EOTF by definition requires transforming to
> display-referred space. The mastering display metadata *is* what
> describes how this (definitively display-referred) space behaves. So
> when decoding HLG, you use the tagged mastering metadata's peak as the
> parametrization for the EOTF. (This is what e.g. mpv and VLC do)
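
In FFmpeg terms, that means reading the tagged peak from the frame's
mastering-display side data before parametrizing the EOTF; a sketch (the
1000 cd/m² fallback is an assumption, matching the BT.2100 reference
display):

    #include <libavutil/frame.h>
    #include <libavutil/mastering_display_metadata.h>

    double peak = 1000.0; /* assumed fallback: BT.2100 reference display */
    AVFrameSideData *sd =
        av_frame_get_side_data(frame, AV_FRAME_DATA_MASTERING_DISPLAY_METADATA);
    if (sd) {
        const AVMasteringDisplayMetadata *mdm =
            (const AVMasteringDisplayMetadata *)sd->data;
        if (mdm->has_luminance)
            peak = av_q2d(mdm->max_luminance);
    }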
> 
> For a better explanation of this (admittedly confusing) topic, see Annex
> 1 of ITU-R Recommendation BT.2100.
Excellent explanation. I think I get your idea. I will refine the code per your suggestion.
But I still have some questions. Will people/tools tend to fill in the mastering
display metadata for HLG video? I currently see no document that recommends
filling it in for HLG. I only have one HLG sample, downloaded from 4kmedia.org,
and it seems to have no mastering metadata. Do you have more HLG videos that
show this metadata is commonly present?
My concern is whether video players will correctly parse the mastering display
metadata to decode HLG, or just skip it because most HLG video has no metadata.
Since what I do now is tone mapping from HDR to SDR, do you think it is
meaningful to add the metadata to the SDR output?
And it looks like using a peak of 100 in inverse_ootf() when tone mapping to
SDR is just OK?

Thanks again for your kind advice and suggestions!

Ruiling
> 
> Here is a relevant excerpt: http://0x0.st/se7O.png