[FFmpeg-devel] [INFO]AMD D3D11 to OpenCL interop extension for NV12 and P010 textures - split planes

Mark Thompson sw at jkqxz.net
Wed Nov 28 02:04:40 EET 2018

On 26/11/2018 15:32, Mironov, Mikhail wrote:
> You assume that device ID returned from regular enumeration is the same as device ID returned from clGetDeviceIDsFromD3D11KHR. It is not guaranteed and I didn't try this.

Ok, that's fair I suppose.  Fixing it hasn't changed anything, though.

> Also I would add tracing to ensure that CL_CONTEXT_D3D11_DEVICE_KHR is actually set and clGetDeviceIDsFromD3D11KHR is called.

The first was always true (Intel requires it too), the second is now.
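
For illustration, a minimal sketch of that kind of tracing. It scans a zero-terminated property list of the sort passed to clCreateContext() (or read back with clGetContextInfo() and CL_CONTEXT_PROPERTIES) and reports whether CL_CONTEXT_D3D11_DEVICE_KHR is present. The helper names and the ctx_prop typedef are made up for this example and are not from ffmpeg or AMF. The constant values are restated from the Khronos headers as an assumption so that it builds without them; real code would include CL/cl.h and CL/cl_d3d11.h instead.

```c
#include <stdint.h>
#include <stdio.h>

/* Assumed values, restated from CL/cl.h and CL/cl_d3d11.h so the sketch
 * builds without the OpenCL headers; prefer the real headers if available. */
#ifndef CL_CONTEXT_PLATFORM
#define CL_CONTEXT_PLATFORM          0x1084
#endif
#ifndef CL_CONTEXT_INTEROP_USER_SYNC
#define CL_CONTEXT_INTEROP_USER_SYNC 0x1172
#endif
#ifndef CL_CONTEXT_D3D11_DEVICE_KHR
#define CL_CONTEXT_D3D11_DEVICE_KHR  0x401D
#endif

/* Stand-in for cl_context_properties (an intptr_t in the headers). */
typedef intptr_t ctx_prop;

/* Look up key in a zero-terminated list of (key, value) pairs.
 * Returns 1 and stores the value (if value is non-NULL) when the key is
 * present, 0 otherwise. */
int find_context_property(const ctx_prop *props, ctx_prop key, ctx_prop *value)
{
    size_t i;

    if (!props)
        return 0;
    for (i = 0; props[i] != 0; i += 2) {
        if (props[i] == key) {
            if (value)
                *value = props[i + 1];
            return 1;
        }
    }
    return 0;
}

/* Hypothetical trace helper: print the two properties discussed here. */
void trace_context_properties(const ctx_prop *props)
{
    ctx_prop v;

    if (find_context_property(props, CL_CONTEXT_D3D11_DEVICE_KHR, &v))
        fprintf(stderr, "CL_CONTEXT_D3D11_DEVICE_KHR = %p\n", (void *)v);
    else
        fprintf(stderr, "CL_CONTEXT_D3D11_DEVICE_KHR is not set\n");

    if (find_context_property(props, CL_CONTEXT_INTEROP_USER_SYNC, &v))
        fprintf(stderr, "CL_CONTEXT_INTEROP_USER_SYNC = %ld\n", (long)v);
    else
        fprintf(stderr, "CL_CONTEXT_INTEROP_USER_SYNC is not set\n");
}
```

Calling trace_context_properties() on the list just before clCreateContext() is enough to confirm what the context was actually asked for.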

> In AMF code I always set CL_CONTEXT_INTEROP_USER_SYNC to true.

I'm not completely sure of the precise semantics of D3D11, but I don't think we want that here - the clEnqueueAcquireD3D11ObjectsKHR() call should be the first synchronisation point following the previous component (generally a decoder).

I tried setting it anyway, but the behaviour doesn't change - I still get CL_INVALID_D3D11_RESOURCE_KHR.

> Also I would trace other parameters to clCreateFromD3D11Texture2DKHR: memory flags and texture descriptor.

For the flags, I tried all of CL_MEM_READ_WRITE / CL_MEM_WRITE_ONLY / CL_MEM_READ_ONLY, which are the only allowed values.  The texture descriptor is just the one created for the decoder.
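
A small sketch of how the flags argument could be checked and named in a trace before the clCreateFromD3D11Texture2DKHR() call. The function names and the mem_flags typedef are invented for this example; the bit values are restated from CL/cl.h as an assumption so that it builds without the OpenCL headers. Rejecting 0 is a choice made here for strictness, not something the thread establishes.

```c
#include <stdint.h>

/* Assumed values, restated from CL/cl.h; prefer the real header if available. */
#ifndef CL_MEM_READ_WRITE
#define CL_MEM_READ_WRITE (1 << 0)
#endif
#ifndef CL_MEM_WRITE_ONLY
#define CL_MEM_WRITE_ONLY (1 << 1)
#endif
#ifndef CL_MEM_READ_ONLY
#define CL_MEM_READ_ONLY  (1 << 2)
#endif

/* Stand-in for cl_mem_flags (a 64-bit bitfield in the headers). */
typedef uint64_t mem_flags;

/* Returns 1 only if flags is exactly one of the three access modes
 * allowed for D3D11 texture sharing, 0 for anything else (including 0
 * and any combination of bits). */
int interop_mem_flags_valid(mem_flags flags)
{
    return flags == CL_MEM_READ_WRITE ||
           flags == CL_MEM_WRITE_ONLY ||
           flags == CL_MEM_READ_ONLY;
}

/* Returns a printable name for tracing, or "(invalid)" for other values. */
const char *interop_mem_flags_name(mem_flags flags)
{
    if (flags == CL_MEM_READ_WRITE)
        return "CL_MEM_READ_WRITE";
    if (flags == CL_MEM_WRITE_ONLY)
        return "CL_MEM_WRITE_ONLY";
    if (flags == CL_MEM_READ_ONLY)
        return "CL_MEM_READ_ONLY";
    return "(invalid)";
}
```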

> BTW: does the interop work for NV or Intel?

The D3D11 interop works on Intel, though not directly in the ffmpeg utility without a little change because it requires the textures to be created with D3D11_RESOURCE_MISC_SHARED (as described in the extension document <https://www.khronos.org/registry/OpenCL/extensions/intel/cl_intel_d3d11_nv12_media_sharing.txt>).  Intel doesn't care whether clGetDeviceIDsFromD3D11KHR() is used or not (though I've fixed that anyway), but it does require the CL_CONTEXT_D3D11_DEVICE_KHR option to clCreateContext() (it fails with CL_INVALID_CONTEXT if that isn't set).

It can't work on Nvidia because they don't offer any way to share NV12 textures.

The DXVA2/D3D9 interop works correctly on both AMD and Intel with only the common standard extension.  The one Nvidia device I can find easily doesn't have cl_khr_dx9_media_sharing at all, so that doesn't work.


- Mark

>> -----Original Message-----
>> From: ffmpeg-devel <ffmpeg-devel-bounces at ffmpeg.org> On Behalf Of
>> Mark Thompson
>> Sent: November 25, 2018 5:22 PM
>> To: ffmpeg-devel at ffmpeg.org
>> Subject: Re: [FFmpeg-devel] [INFO]AMD D3D11 to OpenCL interop extension
>> for NV12 and P010 textures - split planes
>> On 25/11/2018 21:28, Mironov, Mikhail wrote:
>>> It seems that the failure is not in the new extension but earlier, in the
>>> interop from D3D11 to OCL. It can happen in two cases: the OCL device/context
>>> is created without a D3D11 device, or the format of the texture is not supported.
>>> NV12 is supported. I went through the latest ffmpeg snapshot and found that
>>> the function opencl_enumerate_d3d11_devices() looks correct, and a pointer to
>>> it is set in the OpenCLDeviceSelector::enumerate_devices member, but I
>>> cannot find a call to selector->enumerate_devices(). Instead
>>> opencl_enumerate_devices() is called directly. So my guess is that the
>>> OCL device is not created from D3D11.
>> Hmm, right - patch just sent to fix the selection call.
>> It doesn't actually make any difference to this case, though, since the filter
>> made it choose the right device anyway and
>> CL_CONTEXT_D3D11_DEVICE_KHR was always set when deriving from
>> D3D11.  (It could only have made a difference if there were other conflicting
>> D3D11 devices it could have picked incorrectly.)
>>> Just in case, an OCL device creation sample:
>>> https://github.com/GPUOpen-LibrariesAndSDKs/AMF/blob/master/amf/public/samples/CPPSamples/common/DeviceOpenCL.cpp
>>> Regarding the new split extension: here is a working snippet:
>>> cl_mem clImage2D = 0;
>>> cl_mem clImages[AMF_SURFACE_MAX_PLANES];
>>> // index can be nonzero if the texture is allocated as an array.
>>> clImage2D = clCreateFromD3D11Texture2DKHR(m_clContext, memflags,
>>>                                           pTexture, index, &clStatus);
>> Where is the comment about index being nonzero coming from there?
>> Other callers to this definitely start from a zero index.  (I tried adding one to
>> my index values but it didn't change the result.)
>>> for (int i = 0; i < planesNumber; i++)
>>> {
>>>     clImages[i] = clGetPlaneFromImageAMD(m_clContext, clImage2D,
>>>                                          (cl_uint)i, &clStatus);
>>> }
>>> // don't forget to release clImages[i] and clImage2D
>> Otherwise, that agrees with how I read the extension document.
