[FFmpeg-devel] [PATCH V2] lavf: add transpose_opencl filter
Song, Ruiling
ruiling.song at intel.com
Tue Dec 4 09:31:22 EET 2018
> -----Original Message-----
> From: ffmpeg-devel [mailto:ffmpeg-devel-bounces at ffmpeg.org] On Behalf Of
> Mark Thompson
> Sent: Monday, December 3, 2018 8:10 AM
> To: ffmpeg-devel at ffmpeg.org
> Subject: Re: [FFmpeg-devel] [PATCH V2] lavf: add transpose_opencl filter
>
> On 28/11/2018 02:27, Ruiling Song wrote:
> > Signed-off-by: Ruiling Song <ruiling.song at intel.com>
> > ---
> > configure | 1 +
> > libavfilter/Makefile | 1 +
> > libavfilter/allfilters.c | 1 +
> > libavfilter/opencl/transpose.cl | 35 +++++
> > libavfilter/opencl_source.h | 1 +
> > libavfilter/transpose.h | 34 +++++
> > libavfilter/vf_transpose.c | 14 +-
> > libavfilter/vf_transpose_opencl.c | 288 ++++++++++++++++++++++++++++++++++++++
> > 8 files changed, 362 insertions(+), 13 deletions(-)
> > create mode 100644 libavfilter/opencl/transpose.cl
> > create mode 100644 libavfilter/transpose.h
> > create mode 100644 libavfilter/vf_transpose_opencl.c
>
> Testing the passthrough option here reveals a slightly unfortunate interaction
> with mapping - if this is the only filter in use, then not doing a redundant copy
> can fall over.
>
> For example, on Rockchip (Mali) decoding with rkmpp then using:
>
> -vf hwmap=derive_device=opencl,transpose_opencl=dir=clock:passthrough=landscape,hwdownload,format=nv12
>
> fails at the download in the passthrough case because it doesn't allow the read
> (the extension does explicitly document this constraint -
> <https://www.khronos.org/registry/OpenCL/extensions/arm/cl_arm_import_memory.txt>).
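To make the failing case concrete, here is a minimal sketch of the kind of check that decides whether the filter skips the transpose and hands the mapped frame on untouched. The enum and function names are hypothetical (they are not taken from the patch), and treating a square frame as matching either orientation is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names, for illustration only (not the identifiers in the patch). */
enum passthrough_mode { PT_NONE, PT_PORTRAIT, PT_LANDSCAPE };

/* Returns true when the input already has the requested orientation, so the
 * frame would be forwarded without any copy or transpose. */
bool skips_transpose(enum passthrough_mode mode, int w, int h)
{
    assert(w > 0 && h > 0);
    return (mode == PT_LANDSCAPE && w >= h) ||
           (mode == PT_PORTRAIT  && w <= h);
}
```

With passthrough=landscape and a 1920x1080 input, this returns true, so the frame that reaches hwdownload is still the one produced by hwmap, which is the situation described above.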
>
> VAAPI has a similar problem with a decode followed by:
>
> -vf hwmap=derive_device=opencl,transpose_opencl,hwmap=derive_device=vaapi:reverse=1
>
> because the reverse mapping tries to replace the inlink hw_frames_ctx in a way
> which doesn't actually work.
>
> All of these cases do of course work if anything else is in the way - any additional
> opencl filter on either side makes it work. I think it's fine to ignore this (after all,
> the hwmap immediately followed by hwdownload case can already fail in the
> same way), but any thoughts you have on making that better are welcome.
I noticed this too during testing. I currently have no idea how to fix it,
but I am interested in looking for a better solution.
Right now I am still struggling to understand the hwmap source code.
I have not figured out how hwmap is used to map from a software format to a
hardware format. That is the piece of code starting at line 200 of vf_hwmap.c:
https://github.com/FFmpeg/FFmpeg/blob/master/libavfilter/vf_hwmap.c#L200
Could you show me an example command that would go into this branch?
Thanks!
Ruiling
>
>
> >> Does the dependency on dir have any effect on speed here? Any call is only ever
> >> going to use one side of each of the dir cases, so it feels like it might be nicer to
> >> hard-code that so they aren't included in the compiled code at all.
> > For such a memory-bound OpenCL kernel, a few more arithmetic operations
> > would not affect the overall performance.
> > I did some more testing and saw no obvious performance difference between
> > different 'dir' values, so I am keeping it as it is now.
>
> That makes sense, thank you for checking.
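For reference, a minimal CPU sketch in C of how a single runtime dir value can select the source coordinate. This is not the OpenCL kernel from the patch; transpose_plane is a hypothetical helper, and it assumes one tightly packed 8-bit plane and the transpose filter's numbering (0 = cclock_flip, 1 = clock, 2 = cclock, 3 = clock_flip).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not FFmpeg API: transpose one 8-bit plane on the CPU.
 * src is src_w x src_h and dst is src_h x src_w, both tightly packed.
 * dir: 0 = cclock_flip, 1 = clock, 2 = cclock, 3 = clock_flip. */
void transpose_plane(uint8_t *dst, const uint8_t *src,
                     int src_w, int src_h, int dir)
{
    const int dst_w = src_h;
    const int dst_h = src_w;

    assert(dir >= 0 && dir <= 3);

    for (int y = 0; y < dst_h; y++) {
        for (int x = 0; x < dst_w; x++) {
            /* The only dir-dependent work per pixel: two selects and
             * at most two subtractions. */
            int xin = (dir & 2) ? (dst_h - 1 - y) : y;
            int yin = (dir & 1) ? (dst_w - 1 - x) : x;
            dst[(size_t)y * dst_w + x] = src[(size_t)yin * src_w + xin];
        }
    }
}
```

Each output pixel costs one load and one store regardless of dir, while the direction only adds a couple of integer operations, which is consistent with seeing no measurable difference in a memory-bound kernel.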
>
>
> So, LGTM and applied.
>
> Thanks,
>
> - Mark