[FFmpeg-devel] [PATCH V2] lavf: add transpose_opencl filter

Song, Ruiling ruiling.song at intel.com
Tue Dec 4 09:31:22 EET 2018



> -----Original Message-----
> From: ffmpeg-devel [mailto:ffmpeg-devel-bounces at ffmpeg.org] On Behalf Of
> Mark Thompson
> Sent: Monday, December 3, 2018 8:10 AM
> To: ffmpeg-devel at ffmpeg.org
> Subject: Re: [FFmpeg-devel] [PATCH V2] lavf: add transpose_opencl filter
> 
> On 28/11/2018 02:27, Ruiling Song wrote:
> > Signed-off-by: Ruiling Song <ruiling.song at intel.com>
> > ---
> >  configure                         |   1 +
> >  libavfilter/Makefile              |   1 +
> >  libavfilter/allfilters.c          |   1 +
> >  libavfilter/opencl/transpose.cl   |  35 +++++
> >  libavfilter/opencl_source.h       |   1 +
> >  libavfilter/transpose.h           |  34 +++++
> >  libavfilter/vf_transpose.c        |  14 +-
> >  libavfilter/vf_transpose_opencl.c | 288 ++++++++++++++++++++++++++++++++++++++
> >  8 files changed, 362 insertions(+), 13 deletions(-)
> >  create mode 100644 libavfilter/opencl/transpose.cl
> >  create mode 100644 libavfilter/transpose.h
> >  create mode 100644 libavfilter/vf_transpose_opencl.c
> 
> Testing the passthrough option here reveals a slightly unfortunate interaction
> with mapping - if this is the only filter in use, then not doing a redundant copy
> can fall over.
> 
> For example, on Rockchip (Mali) decoding with rkmpp then using:
> 
> -vf hwmap=derive_device=opencl,transpose_opencl=dir=clock:passthrough=landscape,hwdownload,format=nv12
> 
> fails at the download in the passthrough case because it doesn't allow the read
> (the extension does explicitly document this constraint -
> <https://www.khronos.org/registry/OpenCL/extensions/arm/cl_arm_import_memory.txt>).
> 
> VAAPI has a similar problem with a decode followed by:
> 
> -vf hwmap=derive_device=opencl,transpose_opencl,hwmap=derive_device=vaapi:reverse=1
> 
> because the reverse mapping tries to replace the inlink hw_frames_ctx in a way
> which doesn't actually work.
> 
> All of these cases do of course work if anything else is in the way - any additional
> opencl filter on either side makes it work.  I think it's fine to ignore this (after all,
> the hwmap immediately followed by hwdownload case can already fail in the
> same way), but any thoughts you have on making that better are welcome.
I also noticed that during my testing. I currently have no idea how to fix it,
but I am interested in looking for a better fix for this issue.
Right now I am still struggling to understand the source code of hwmap.
I have not figured out how hwmap is used to map from a software format to a hardware format.
That is the piece of code starting at line 200 in vf_hwmap.c:
https://github.com/FFmpeg/FFmpeg/blob/master/libavfilter/vf_hwmap.c#L200
Could you show me an example command that would go into this branch?

Thanks!
Ruiling
> 
> 
> >> Does the dependency on dir have any effect on speed here?  Any call is only ever
> >> going to use one side of each of the dir cases, so it feels like it might be nicer to
> >> hard-code that so they aren't included in the compiled code at all.
> > For such memory bound OpenCL kernel, some little more arithmetic operation would not affect the overall performance.
> > I did some more testing, and see no obvious performance difference for different 'dir' parameter. So I just keep it as now.
> 
> That makes sense, thank you for checking.
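
To illustrate the point about 'dir' above, here is a minimal CPU-side sketch in
C, not the actual OpenCL kernel from the patch. The helper name transpose_plane
and the bit layout of dir (bit 0 mirrors the source row, bit 1 mirrors the
source column) are assumptions chosen to match the transpose filter's numbering
(0 = cclock_flip, 1 = clock, 2 = cclock, 3 = clock_flip). It shows that the dir
handling costs only two conditional index computations per pixel, next to one
load and one store.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not FFmpeg API: transposes one 8-bit plane on the CPU.
 * src is src_w x src_h; dst must hold src_h x src_w bytes (dst width = src_h).
 * dir follows the transpose filter's numbering:
 *   0 = cclock_flip, 1 = clock, 2 = cclock, 3 = clock_flip. */
static void transpose_plane(unsigned char *dst, const unsigned char *src,
                            int src_w, int src_h, int dir)
{
    int dst_w = src_h;
    int dst_h = src_w;

    assert(dir >= 0 && dir <= 3);

    for (int y = 0; y < dst_h; y++) {
        for (int x = 0; x < dst_w; x++) {
            /* Assumed bit layout: bit 1 mirrors the source column,
             * bit 0 mirrors the source row. */
            int xin = (dir & 2) ? (dst_h - 1 - y) : y;
            int yin = (dir & 1) ? (dst_w - 1 - x) : x;

            /* One load and one store per pixel dominate the cost. */
            dst[(size_t)y * dst_w + x] = src[(size_t)yin * src_w + xin];
        }
    }
}
```

Hard-coding dir would remove the two conditionals from the inner loop, but the
memory traffic per pixel stays the same, which is consistent with seeing no
obvious difference between the 'dir' values in testing.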
> 
> 
> So, LGTM and applied.
> 
> Thanks,
> 
> - Mark
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel at ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel