[FFmpeg-devel] [PATCH v3 3/8] lavu: add a Vulkan hwcontext

Rostislav Pehlivanov atomnuker at gmail.com
Thu May 31 13:48:36 EEST 2018


On 27 May 2018 at 18:15, Mark Thompson <sw at jkqxz.net> wrote:

> On 22/05/18 03:46, Rostislav Pehlivanov wrote:
> > This commit adds a Vulkan hwcontext, currently capable of mapping DRM and
> > VAAPI frames but additional functionality can be added later to support
> > importing of D3D11 surfaces as well as exporting to various other APIs.
>
> Have you investigated the D3D11 interop at all?  Seeing that working (even
> if it isn't included here) would be nice to make sure there aren't any
> gotchas later.
>
> > This context requires the newest stable version of the Vulkan API,
> > and once the new extension for DRM surfaces makes it in will also require
> > it (in order to properly and fully import them).
> >
> > It makes use of every part of the Vulkan spec in order to ensure fastest
> > possible uploading, downloading and mapping of frames. On AMD, it will
> > also make use of mapping host memory frames in order to upload
> > very efficiently and with minimal CPU to hardware.
> >
> > To be useful for non-RGB images an implementation with the YUV images
> > extension is needed. All current implementations support that with the
> > exception of AMD, though support is coming soon for Mesa.
>
> Neither AMD nor Intel on Windows seem to support it (vulkaninfo with both:
> <https://0x0.st/s212.txt> (I realise that won't show the relevant
> formats, but it also doesn't work)).
>

Yes, AMD doesn't support it on Windows, Intel does, however, on a 620 and
most others. Maybe you have old drivers?

https://vulkan.gpuinfo.org/displayreport.php?id=3318#extensions



>
> >
> > Signed-off-by: Rostislav Pehlivanov <atomnuker at gmail.com>
> > ---
> >  configure                      |   10 +
> >  doc/APIchanges                 |    3 +
> >  libavutil/Makefile             |    3 +
> >  libavutil/hwcontext.c          |    4 +
> >  libavutil/hwcontext.h          |    1 +
> >  libavutil/hwcontext_internal.h |    1 +
> >  libavutil/hwcontext_vulkan.c   | 2013 ++++++++++++++++++++++++++++++++
> >  libavutil/hwcontext_vulkan.h   |  133 +++
> >  libavutil/pixdesc.c            |    4 +
> >  libavutil/pixfmt.h             |    4 +
> >  libavutil/version.h            |    4 +-
> >  11 files changed, 2178 insertions(+), 2 deletions(-)
> >  create mode 100644 libavutil/hwcontext_vulkan.c
> >  create mode 100644 libavutil/hwcontext_vulkan.h
> >
> > diff --git a/configure b/configure
> > index 09ff0c55e2..5f4407b753 100755
> > --- a/configure
> > +++ b/configure
> > @@ -300,6 +300,7 @@ External library support:
> >    --enable-opengl          enable OpenGL rendering [no]
> >    --enable-openssl         enable openssl, needed for https support
> >                             if gnutls, libtls or mbedtls is not used [no]
> > +  --enable-vulkan          enable Vulkan code [no]
>
> Ordering (and in list below).
>

What do you mean? Its in alphabet order.



> >    --disable-sndio          disable sndio support [autodetect]
> >    --disable-schannel       disable SChannel SSP, needed for TLS support
> on
> >                             Windows if openssl and gnutls are not used
> [autodetect]
> > @@ -1767,6 +1768,7 @@ HWACCEL_LIBRARY_LIST="
> >      mmal
> >      omx
> >      opencl
> > +    vulkan
> >  "
> >
> >  DOCUMENT_LIST="
> > @@ -2223,6 +2225,7 @@ HAVE_LIST="
> >      opencl_dxva2
> >      opencl_vaapi_beignet
> >      opencl_vaapi_intel_media
> > +    vulkan_drm_mod
> >      perl
> >      pod2man
> >      texi2html
> > @@ -6349,6 +6352,13 @@ enabled vdpau &&
> >
> >  enabled crystalhd && check_lib crystalhd "stdint.h
> libcrystalhd/libcrystalhd_if.h" DtsCrystalHDVersion -lcrystalhd
> >
> > +enabled vulkan &&
> > +    require_pkg_config vulkan "vulkan >= 1.1.73" "vulkan/vulkan.h"
> vkCreateInstance
> > +
> > +if enabled_all vulkan libdrm ; then
> > +    check_cpp_condition vulkan_drm_mod vulkan/vulkan.h "defined
> VK_EXT_IMAGE_DRM_FORMAT_MODIFIER_EXTENSION_NAME"
> > +fi
> > +
> >  if enabled x86; then
> >      case $target_os in
> >          mingw32*|mingw64*|win32|win64|linux|cygwin*)
> > diff --git a/doc/APIchanges b/doc/APIchanges
> > index efe15ba4e0..1b37f58ca7 100644
> > --- a/doc/APIchanges
> > +++ b/doc/APIchanges
> > @@ -15,6 +15,9 @@ libavutil:     2017-10-21
> >
> >  API changes, most recent first:
> >
> > +2018-04-xx - xxxxxxxxxx - lavu 56.19.100 - hwcontext.h
> > +  Add AV_HWDEVICE_TYPE_VULKAN and implementation.
>
> This should mention AV_PIX_FMT_VULKAN as well.
>

Fixed.



> > +
> >  2018-05-xx - xxxxxxxxxx - lavf 58.15.100 - avformat.h
> >    Add pmt_version field to AVProgram
> >
> > ...
> > diff --git a/libavutil/hwcontext_vulkan.c b/libavutil/hwcontext_vulkan.c
> > new file mode 100644
> > index 0000000000..db0a5b7e61
> > --- /dev/null
> > +++ b/libavutil/hwcontext_vulkan.c
> > @@ -0,0 +1,2013 @@
> > +/*
> > + * Vulkan hwcontext
> > + * Copyright (c) 2018 Rostislav Pehlivanov <atomnuker at gmail.com>
> > + *
> > + * This file is part of FFmpeg.
> > + *
> > + * FFmpeg is free software; you can redistribute it and/or
> > + * modify it under the terms of the GNU Lesser General Public
> > + * License as published by the Free Software Foundation; either
> > + * version 2.1 of the License, or (at your option) any later version.
> > + *
> > + * FFmpeg is distributed in the hope that it will be useful,
> > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > + * Lesser General Public License for more details.
> > + *
> > + * You should have received a copy of the GNU Lesser General Public
> > + * License along with FFmpeg; if not, write to the Free Software
> > + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
> 02110-1301 USA
> > + */
> > +
> > +#include "config.h"
> > +#include "pixdesc.h"
> > +#include "avstring.h"
> > +#include "hwcontext.h"
> > +#include "hwcontext_internal.h"
> > +#include "hwcontext_vulkan.h"
> > +
> > +#if CONFIG_LIBDRM
> > +#include <unistd.h> /* lseek */
> > +#include <xf86drm.h>
> > +#include <drm_fourcc.h>
> > +#include "hwcontext_drm.h"
> > +#if CONFIG_VAAPI
> > +#include <va/va_drmcommon.h>
> > +#include "hwcontext_vaapi.h"
> > +#endif
> > +#endif
> > +
> > +typedef struct VulkanDevicePriv {
> > +    /* Properties */
> > +    VkPhysicalDeviceProperties props;
> > +    VkPhysicalDeviceMemoryProperties mprops;
> > +
> > +    /* Debug callback */
> > +    VkDebugUtilsMessengerEXT debug_ctx;
> > +
> > +    /* Image uploading */
> > +    VkCommandPool cmd_pool;
> > +    VkCommandBuffer cmd_buf;
> > +    VkQueue cmd_queue;
> > +    VkFence cmd_fence;
> > +
> > +    /* Extensions */
> > +    uint64_t extensions;
> > +
> > +    /* Settings */
> > +    int use_linear_images;
> > +    int use_disjoint_images;
> > +} VulkanDevicePriv;
> > +
> > +#define VK_LOAD_PFN(inst, name) PFN_##name pfn_##name = (PFN_##name)
>        \
> > +
> vkGetInstanceProcAddr(inst, #name)
> > +
> > +#define DEFAULT_USAGE_FLAGS (VK_IMAGE_USAGE_SAMPLED_BIT      |
>        \
> > +                             VK_IMAGE_USAGE_STORAGE_BIT      |
>        \
> > +                             VK_IMAGE_USAGE_TRANSFER_SRC_BIT |
>          \
> > +                             VK_IMAGE_USAGE_TRANSFER_DST_BIT)
> > +
> > +#define ADD_VAL_TO_LIST(list, count, val)
>         \
> > +    do {
>        \
> > +        list = av_realloc_array(list, sizeof(*list), ++count);
>        \
> > +        if (!list) {
>        \
> > +            err = AVERROR(ENOMEM);
>        \
> > +            goto end;
>         \
> > +        }
>         \
> > +        list[count - 1] = val;
>        \
> > +    } while(0)
> > +
> > +static const VkFormat vk_format_map[AV_PIX_FMT_NB] = {
> > +    /* Gray */
> > +    [AV_PIX_FMT_GRAY8]     = VK_FORMAT_R8_UNORM,
> > +    [AV_PIX_FMT_GRAY10]    = VK_FORMAT_R10X6_UNORM_PACK16,
> > +    [AV_PIX_FMT_GRAY12]    = VK_FORMAT_R12X4_UNORM_PACK16,
>
> Aren't GRAY10 and GRAY12 packed in the low bits rather than the high bits?
>

Nope:
"VK_FORMAT_R10X6_UNORM_PACK16 specifies a one-component, 16-bit unsigned
normalized format that has a single 10-bit R component in the top 10 bits
of a 16-bit word, with the bottom 6 bits set to 0."
"VK_FORMAT_R12X4_UNORM_PACK16 specifies a one-component, 16-bit unsigned
normalized format that has a single 12-bit R component in the top 12 bits
of a 16-bit word, with the bottom 4 bits set to 0."

We put padding bits in the top so I've removed these 2.



> > +    [AV_PIX_FMT_GRAY16]    = VK_FORMAT_R16_UNORM,
> > +
> > +    /* Interleaved */
> > +    [AV_PIX_FMT_NV12]      = VK_FORMAT_G8_B8R8_2PLANE_420_UNORM,
> > +    [AV_PIX_FMT_P010]      = VK_FORMAT_G10X6_B10X6R10X6_2PL
> ANE_420_UNORM_3PACK16,
> > +    [AV_PIX_FMT_P016]      = VK_FORMAT_G16_B16R16_2PLANE_420_UNORM,
> > +    [AV_PIX_FMT_NV16]      = VK_FORMAT_G16_B16R16_2PLANE_422_UNORM,
> > +    [AV_PIX_FMT_UYVY422]   = VK_FORMAT_B16G16R16G16_422_UNORM,
> > +    [AV_PIX_FMT_YVYU422]   = VK_FORMAT_G16B16G16R16_422_UNORM,
>
> This should be AV_PIX_FMT_YUYV422?
>

Seems so, I don't know, the spec is extremely vague:
"VK_FORMAT_G16B16G16R16_422_UNORM specifies a four-component, 64-bit format
containing a pair of G components, an R component, and a B component,
collectively encoding a 2×1 rectangle of unsigned normalized RGB texel
data. One G value is present at each *i* coordinate, with the B and R
values shared across both G values and thus recorded at half the horizontal
resolution of the image. This format has a 16-bit G component for the even
*i* coordinate in the word in bytes 0..1, a 16-bit B component in the word
in bytes 2..3, a 16-bit G component for the odd *i* coordinate in the word
in bytes 4..5, and a 16-bit R component in the word in bytes 6..7. Images
in this format *must* be defined with a width that is a multiple of two.
For the purposes of the constraints on copy extents, this format is treated
as a compressed format with a 2×1 compressed texel block."



> Changing that makes it accept a DRM object for this format
> (DRM_FORMAT_YUYV), though with anv we then get an assertion in the driver:
>
> $ gdb --args ./ffmpeg_g -v 55 -y -hwaccel vaapi -hwaccel_output_format
> vaapi -hwaccel_device /dev/dri/renderD128 -i in.mp4 -an -vf
> 'scale_vaapi=format=yuyv422,hwmap=derive_device=vulkan,scale
> _vulkan=1280:720,hwmap=derive_device=vaapi:reverse=1' -c:v h264_vaapi
> out.mp4
> ...
> [Parsed_scale_vaapi_0 @ 0x555558dd4d40] Filter output: vaapi_vld,
> 1920x1080 (2000).
> [hwmap @ 0x555558dd9380] Filter input: vaapi_vld, 1920x1080 (2000).
> ffmpeg_g: ../../../src/intel/vulkan/anv_image.c:599: anv_image_create:
> Assertion `format != NULL' failed.
>
> Thread 1 "ffmpeg_g" received signal SIGABRT, Aborted.
> __GI_raise (sig=sig at entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
> 51      ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
> (gdb) bt
> #0  __GI_raise (sig=sig at entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
> #1  0x00007ffff3127231 in __GI_abort () at abort.c:79
> #2  0x00007ffff311e9da in __assert_fail_base (fmt=0x7ffff3271d48
> "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion at entry
> =0x7fffafd06a51 "format != NULL", file=file at entry=0x7fffafd05d38
> "../../../src/intel/vulkan/anv_image.c", line=line at entry=599,
>     function=function at entry=0x7fffafd07000 <__PRETTY_FUNCTION__.66498>
> "anv_image_create") at assert.c:92
> #3  0x00007ffff311ea52 in __GI___assert_fail (assertion=assertion at entry
> =0x7fffafd06a51 "format != NULL", file=file at entry=0x7fffafd05d38
> "../../../src/intel/vulkan/anv_image.c", line=line at entry=599,
> function=function at entry=0x7fffafd07000 <__PRETTY_FUNCTION__.66498>
> "anv_image_create") at assert.c:101
> #4  0x00007fffaf928190 in anv_image_create (_device=<optimized out>,
> create_info=create_info at entry=0x7fffffffced0, alloc=<optimized out>,
> pImage=<optimized out>) at ../../../src/intel/vulkan/anv_image.c:599
> #5  0x00007fffaf9282b2 in anv_CreateImage (device=<optimized out>,
> pCreateInfo=<optimized out>, pAllocator=<optimized out>, pImage=<optimized
> out>) at ../../../src/intel/vulkan/anv_image.c:641
> #6  0x00007ffff66776c4 in vkCreateImage (device=0x5555592a64e0,
> pCreateInfo=0x7fffffffcf70, pAllocator=0x0, pImage=0x5555592b9f40) at
> /home/mrt/video/vulkan/loader/loader/trampoline.c:1328
> #7  0x0000555556a009c7 in create_frame (hwfc=0x5555591e7380,
> frame=0x7fffffffd1a8, tiling=VK_IMAGE_TILING_OPTIMAL, usage=15,
> disjoint=0, create_pnext=0x7fffffffd140, alloc_pnext=0x7fffffffd080,
> alloc_pnext_stride=24) at src/libavutil/hwcontext_vulkan.c:1130
> #8  0x0000555556a0199b in vulkan_map_from_drm_frame_desc
> (hwfc=0x5555591e7380, f=0x7fffffffd1a8, desc=0x5555592b97c0) at
> src/libavutil/hwcontext_vulkan.c:1480
> #9  0x0000555556a019f8 in vulkan_map_from_drm (hwfc=0x5555591e7380,
> dst=0x5555592ab440, src=0x5555592b9580, flags=3) at
> src/libavutil/hwcontext_vulkan.c:1502
> #10 0x0000555556a01b50 in vulkan_map_from_vaapi (dst_fc=0x5555591e7380,
> dst=0x5555592ab440, src=0x5555592ab700, flags=3) at
> src/libavutil/hwcontext_vulkan.c:1550
> #11 0x0000555556a01c27 in vulkan_map_to (hwfc=0x5555591e7380,
> dst=0x5555592ab440, src=0x5555592ab700, flags=3) at
> src/libavutil/hwcontext_vulkan.c:1579
> #12 0x00005555569f32c7 in av_hwframe_map (dst=0x5555592ab440,
> src=0x5555592ab700, flags=3) at src/libavutil/hwcontext.c:792
> #13 0x0000555555798c2c in hwmap_filter_frame (link=0x555558dd9a00,
> input=0x5555592ab700) at src/libavfilter/vf_hwmap.c:339
> #14 0x00005555556af6c9 in ff_filter_frame_framed (link=0x555558dd9a00,
> frame=0x5555592ab700) at src/libavfilter/avfilter.c:1071
> #15 0x00005555556aff52 in ff_filter_frame_to_filter (link=0x555558dd9a00)
> at src/libavfilter/avfilter.c:1219
> #16 0x00005555556b014e in ff_filter_activate_default
> (filter=0x555558dd9580) at src/libavfilter/avfilter.c:1268
> #17 0x00005555556b0372 in ff_filter_activate (filter=0x555558dd9580) at
> src/libavfilter/avfilter.c:1429
> #18 0x00005555556b5036 in ff_filter_graph_run_once (graph=0x555558dd9440)
> at src/libavfilter/avfiltergraph.c:1454
> #19 0x00005555556b6466 in push_frame (graph=0x555558dd9440) at
> src/libavfilter/buffersrc.c:181
> #20 0x00005555556b6778 in av_buffersrc_add_frame_internal
> (ctx=0x555558ddc3c0, frame=0x55555842b080, flags=4) at
> src/libavfilter/buffersrc.c:255
> #21 0x00005555556b63ed in av_buffersrc_add_frame_flags
> (ctx=0x555558ddc3c0, frame=0x55555842b080, flags=4) at
> src/libavfilter/buffersrc.c:164
> #22 0x0000555555679212 in ifilter_send_frame (ifilter=0x5555581f2580,
> frame=0x55555842b080) at src/fftools/ffmpeg.c:2190
> #23 0x00005555556794f2 in send_frame_to_filters (ist=0x5555581f4340,
> decoded_frame=0x55555842b080) at src/fftools/ffmpeg.c:2264
> #24 0x000055555567a2ac in decode_video (ist=0x5555581f4340,
> pkt=0x7fffffffd800, got_output=0x7fffffffd7f4, duration_pts=0x7fffffffd7f8,
> eof=0, decode_failed=0x7fffffffd7f0) at src/fftools/ffmpeg.c:2465
> #25 0x000055555567ac47 in process_input_packet (ist=0x5555581f4340,
> pkt=0x7fffffffd9c0, no_eof=0) at src/fftools/ffmpeg.c:2619
> #26 0x0000555555681bb5 in process_input (file_index=0) at
> src/fftools/ffmpeg.c:4457
> #27 0x00005555556820c4 in transcode_step () at src/fftools/ffmpeg.c:4577
> #28 0x00005555556821f1 in transcode () at src/fftools/ffmpeg.c:4631
> #29 0x0000555555682a81 in main (argc=18, argv=0x7fffffffe3c8) at
> src/fftools/ffmpeg.c:4838
>
>
> (For the above case you also need this VAAPI patch to give you composed
> layers:
>
> diff --git a/libavutil/hwcontext_vaapi.c b/libavutil/hwcontext_vaapi.c
> index a2387d4fc4..8e6abdc6ca 100644
> --- a/libavutil/hwcontext_vaapi.c
> +++ b/libavutil/hwcontext_vaapi.c
> @@ -1104,7 +1104,7 @@ static int vaapi_map_to_drm_esh(AVHWFramesContext
> *hwfc, AVFrame *dst,
>
>      surface_id = (VASurfaceID)(uintptr_t)src->data[3];
>
> -    export_flags = VA_EXPORT_SURFACE_SEPARATE_LAYERS;
> +    export_flags = VA_EXPORT_SURFACE_COMPOSED_LAYERS;
>      if (flags & AV_HWFRAME_MAP_READ)
>          export_flags |= VA_EXPORT_SURFACE_READ_ONLY;
>      if (flags & AV_HWFRAME_MAP_WRITE)
>
> )
>
>
Maybe hwcontext_vaapi should be changed to give you composed layers if the
pixfmt demands it?


> +
> > +    /* 420 */
> > +    [AV_PIX_FMT_YUV420P]   = VK_FORMAT_G8_B8_R8_3PLANE_420_UNORM,
> > +    [AV_PIX_FMT_YUV420P16] = VK_FORMAT_G16_B16_R16_3PLANE_420_UNORM,
> > +
> > +    /* 422 */
> > +    [AV_PIX_FMT_YUV422P]   = VK_FORMAT_G8_B8_R8_3PLANE_422_UNORM,
> > +    [AV_PIX_FMT_YUV422P16] = VK_FORMAT_G16_B16_R16_3PLANE_422_UNORM,
> > +
> > +    /* 444 */
> > +    [AV_PIX_FMT_YUV444P]   = VK_FORMAT_G8_B8_R8_3PLANE_444_UNORM,
> > +    [AV_PIX_FMT_YUV444P16] = VK_FORMAT_G16_B16_R16_3PLANE_444_UNORM,
> > +
> > +    /* RGB */
> > +    [AV_PIX_FMT_ABGR]      = VK_FORMAT_A8B8G8R8_UNORM_PACK32,
> > +    [AV_PIX_FMT_BGRA]      = VK_FORMAT_B8G8R8A8_UNORM,
> > +    [AV_PIX_FMT_RGBA]      = VK_FORMAT_R8G8B8A8_UNORM,
> > +    [AV_PIX_FMT_RGB24]     = VK_FORMAT_R8G8B8_UNORM,
> > +    [AV_PIX_FMT_BGR24]     = VK_FORMAT_B8G8R8_UNORM,
> > +    [AV_PIX_FMT_RGB48]     = VK_FORMAT_R16G16B16_UNORM,
> > +    [AV_PIX_FMT_RGBA64]    = VK_FORMAT_R16G16B16A16_UNORM,
> > +    [AV_PIX_FMT_RGB565]    = VK_FORMAT_R5G6B5_UNORM_PACK16,
> > +    [AV_PIX_FMT_BGR565]    = VK_FORMAT_B5G6R5_UNORM_PACK16,
> > +    [AV_PIX_FMT_BGR0]      = VK_FORMAT_B8G8R8A8_UNORM,
> > +    [AV_PIX_FMT_0BGR]      = VK_FORMAT_A8B8G8R8_UNORM_PACK32,
> > +    [AV_PIX_FMT_RGB0]      = VK_FORMAT_R8G8B8A8_UNORM,
> > +};
> > +
> > +enum VulkanExtensions {
> > +    EXT_DEDICATED_ALLOC        = 1LL <<  0, /*
> VK_KHR_dedicated_allocation */
> > +    EXT_IMAGE_FORMAT_LIST      = 1LL <<  1, /* VK_KHR_image_format_list
> */
> > +    EXT_EXTERNAL_MEMORY        = 1LL <<  2, /* VK_KHR_external_memory */
> > +    EXT_EXTERNAL_HOST_MEMORY   = 1LL <<  3, /*
> VK_EXT_external_memory_host */
> > +    EXT_EXTERNAL_FD_MEMORY     = 1LL <<  4, /*
> VK_KHR_external_memory_fd */
> > +    EXT_EXTERNAL_DMABUF_MEMORY = 1LL <<  5, /*
> VK_EXT_external_memory_dma_buf */
> > +    EXT_DRM_MODIFIER_FLAGS     = 1LL <<  6, /*
> VK_EXT_image_drm_format_modifier */
> > +    EXT_YUV_IMAGES             = 1LL <<  7, /*
> VK_KHR_sampler_ycbcr_conversion */
> > +
> > +    EXT_OPTIONAL               = 1LL << 62,
> > +    EXT_REQUIRED               = 1LL << 63,
>
> That's signed overflow -> undefined behaviour.  Since you want a uint64_t,
> use UINT64_C().
>
>
Changed to 1ULL. We discussed it last time, UINT64_C does exactly that.


> +};
> > +
> > ...
> > +
> > +static VkBool32 vk_dbg_callback(VkDebugUtilsMessageSeverityFlagBitsEXT
> severity,
> > +                                VkDebugUtilsMessageTypeFlagsEXT
> messageType,
> > +                                const VkDebugUtilsMessengerCallbackDataEXT
> *data,
> > +                                void *priv)
> > +{
> > +    int l;
> > +    AVHWDeviceContext *ctx = priv;
> > +
> > +    switch (severity) {
> > +    case VK_DEBUG_UTILS_MESSAGE_SEVERITY_VERBOSE_BIT_EXT: l =
> AV_LOG_VERBOSE; break;
> > +    case VK_DEBUG_UTILS_MESSAGE_SEVERITY_INFO_BIT_EXT:    l =
> AV_LOG_INFO;    break;
> > +    case VK_DEBUG_UTILS_MESSAGE_SEVERITY_WARNING_BIT_EXT: l =
> AV_LOG_WARNING; break;
> > +    case VK_DEBUG_UTILS_MESSAGE_SEVERITY_ERROR_BIT_EXT:   l =
> AV_LOG_ERROR;   break;
> > +    default:                                              l =
> AV_LOG_DEBUG;   break;
> > +    };
>
> Stray semicolon.
>

Fixed.



> > +
> > +    av_log(ctx, l, "%s\n", data->pMessage);
> > +    for (int i = 0; i < data->cmdBufLabelCount; i++)
> > +        av_log(ctx, l, "\t%i: %s\n", i, data->pCmdBufLabels[i].pLabelN
> ame);
> > +
> > +    return 0;
> > +}
> > +
> > ...
> > +
> > +typedef struct VulkanDeviceSelection {
> > +    const char *name; /* Will use this first unless NULL */
> > +    uint32_t pci_device; /* Will use this second unless 0x0 */
> > +    uint32_t vendor_id; /* Last resort to find something deterministic
> */
> > +    int index; /* Finally fall back to index */
> > +} VulkanDeviceSelection;
> > +
> > +/* Finds a device */
> > +static int find_device(AVHWDeviceContext *ctx, VulkanDeviceSelection
> *select)
> > +{
> > +    int err = 0;
> > +    uint32_t num;
> > +    VkResult ret;
> > +    VkPhysicalDevice *devices = NULL;
> > +    VkPhysicalDeviceProperties *prop = NULL;
> > +    VkPhysicalDevice choice = VK_NULL_HANDLE;
> > +    AVVulkanDeviceContext *hwctx = ctx->hwctx;
> > +    static const char *dev_types[] = {
> > +        [VK_PHYSICAL_DEVICE_TYPE_INTEGRATED_GPU] = "integrated",
> > +        [VK_PHYSICAL_DEVICE_TYPE_DISCRETE_GPU]   = "discrete",
> > +        [VK_PHYSICAL_DEVICE_TYPE_VIRTUAL_GPU]    = "virtual",
> > +        [VK_PHYSICAL_DEVICE_TYPE_CPU]            = "software",
> > +        [VK_PHYSICAL_DEVICE_TYPE_OTHER]          = "unknown",
> > +    };
> > +
> > +    ret = vkEnumeratePhysicalDevices(hwctx->inst, &num, NULL);
> > +    if (ret != VK_SUCCESS || !num) {
> > +        av_log(ctx, AV_LOG_ERROR, "No devices found: %s!\n",
> vk_ret2str(ret));
> > +        return AVERROR_EXTERNAL;
>
> AVERROR(ENODEV) might be clearer, and in similar "no device" cases below
> too.
>

Fixed, also in other places in that function.



> > +    }
> > +
> > +    devices = av_malloc_array(num, sizeof(VkPhysicalDevice));
> > +    if (!devices)
> > +        return AVERROR(ENOMEM);
> > +
> > +    ret = vkEnumeratePhysicalDevices(hwctx->inst, &num, devices);
> > +    if (ret != VK_SUCCESS) {
> > +        av_log(ctx, AV_LOG_ERROR, "Failed enumerating devices: %s\n",
> > +               vk_ret2str(ret));
> > +        err = AVERROR_EXTERNAL;
> > +        goto end;
> > +    }
> > +
> > +    prop = av_malloc_array(num, sizeof(VkPhysicalDeviceProperties));
> > +    if (!prop) {
> > +        err = AVERROR(ENOMEM);
> > +        goto end;
> > +    }
> > +
> > +    av_log(ctx, AV_LOG_VERBOSE, "GPU listing:\n");
> > +    for (int i = 0; i < num; i++) {
> > +        vkGetPhysicalDeviceProperties(devices[i], &prop[i]);
> > +        av_log(ctx, AV_LOG_VERBOSE, "    %d: %s (%s) (0x%x)\n", i,
> prop[i].deviceName,
>                                                          ^ "%#x" (and
> below)
>
> > +               dev_types[prop[i].deviceType], prop[i].deviceID);
>
> dev_types would feel safer as a function, I think?  (If a later Vulkan
> version adds a new device type then you can crash if you see it.)
>

Done.



> > +    }
> > +
> > +    if (select->name) {
> > +        av_log(ctx, AV_LOG_VERBOSE, "Requested device: %s\n",
> select->name);
> > +        for (int i = 0; i < num; i++) {
> > +            if (strcmp(select->name, prop[i].deviceName) == 0) {
>
> Might it be nicer to use strstr() rather than strcmp() here?
>
> The requirement to put e.g. "AMD RADV POLARIS11 (LLVM 6.0.0)" is pretty
> annoying, especially when that string might change between versions.  If
> you know you have an AMD card and an Intel card, then matching "AMD" seems
> pretty safe.
>
>
Done. Sadly its not case-independent but oh well.


> +                choice = devices[i];
> > +                goto end;
> > +             }
> > +        }
> > +        av_log(ctx, AV_LOG_ERROR, "Unable to find device \"%s\"!\n",
> > +               select->name);
> > +        err = AVERROR_UNKNOWN;
> > +        goto end;
> > +    } else if (select->pci_device) {
> > +        av_log(ctx, AV_LOG_VERBOSE, "Requested device: 0x%x\n",
> select->pci_device);
> > +        for (int i = 0; i < num; i++) {
> > +            if (select->pci_device == prop[i].deviceID) {
> > +                choice = devices[i];
> > +                goto end;
> > +            }
> > +        }
> > +        av_log(ctx, AV_LOG_ERROR, "Unable to find device with PCI ID
> 0x%x!\n",
> > +               select->pci_device);
>
> I wonder whether this should have some magic if you have multiple of the
> same graphics card (something with opts, maybe?).  That will be a common
> case in compute, though I don't know if it matters here.
>

We could fix it later if the API introduces something better or at least a
way to translate UUIDs.



> > +        err = AVERROR(EINVAL);
> > +        goto end;
> > +    } else if (select->vendor_id) {
> > +        av_log(ctx, AV_LOG_VERBOSE, "Requested vendor: 0x%x\n",
> select->vendor_id);
> > +        for (int i = 0; i < num; i++) {
> > +            if (select->vendor_id == prop[i].vendorID) {
> > +                choice = devices[i];
> > +                goto end;
> > +            }
> > +        }
> > +        av_log(ctx, AV_LOG_ERROR, "Unable to find device with Vendor ID
> 0x%x!\n",
> > +               select->vendor_id);
> > +        err = AVERROR_UNKNOWN;
> > +        goto end;
> > +    } else {
> > +        if (select->index < num) {
> > +            choice = devices[select->index];
> > +            goto end;
> > +        }
> > +        av_log(ctx, AV_LOG_ERROR, "Unable to find device with index
> %i!\n",
> > +               select->index);
> > +        err = AVERROR_UNKNOWN;
> > +        goto end;
> > +    }
> > +
> > +end:
> > +    av_free(devices);
> > +    av_free(prop);
> > +    hwctx->phys_dev = choice;
> > +
> > +    return err;
> > +}
> > +
> > +static int search_queue_families(AVHWDeviceContext *ctx,
> VkDeviceCreateInfo *cd)
> > +{
> > +    uint32_t num;
> > +    VkQueueFamilyProperties *qs = NULL;
> > +    AVVulkanDeviceContext *hwctx = ctx->hwctx;
> > +    int graph_index = -1, comp_index = -1, tx_index = -1;
> > +    VkDeviceQueueCreateInfo *pc = (VkDeviceQueueCreateInfo
> *)cd->pQueueCreateInfos;
> > +
> > +    /* First get the number of queue families */
> > +    vkGetPhysicalDeviceQueueFamilyProperties(hwctx->phys_dev, &num,
> NULL);
> > +    if (!num) {
> > +        av_log(ctx, AV_LOG_ERROR, "Failed to get queues!\n");
> > +        return AVERROR_EXTERNAL;
> > +    }
> > +
> > +    /* Then allocate memory */
> > +    qs = av_malloc_array(num, sizeof(VkQueueFamilyProperties));
> > +    if (!qs)
> > +        return AVERROR(ENOMEM);
> > +
> > +    /* Finally retrieve the queue families */
> > +    vkGetPhysicalDeviceQueueFamilyProperties(hwctx->phys_dev, &num,
> qs);
> > +
> > +#define SEARCH_FLAGS(expr, out)
>         \
> > +    for (int i = 0; i < num; i++) {
>             \
>
> Adding the "int " pushed the "\" out of alignment :P
>

Fixed.



> > +        const VkQueueFlagBits flags = qs[i].queueFlags;
>         \
> > +        if (expr) {
>         \
> > +            out = i;
>        \
> > +            break;
>        \
> > +        }
>         \
> > +    }
> > +
> > +    if (!hwctx->queue_family_index)
>
> I don't quite understand what this test is doing.  You search for the
> queues to use on device create (not external init), so it should always be
> unset when you get here?
>

Right, I didn't know about this until I used the API a week ago to write a
demo client for a wayland surface capture (which imported DMABUFs, mapped
them to whatever and encoded them) protocol.
Fixed, as well as in other places. Also I verify the queue index in the
init function now.


> > +        SEARCH_FLAGS(flags & VK_QUEUE_GRAPHICS_BIT, graph_index)
> > +
> > +    if (!hwctx->queue_family_comp_index)
> > +        SEARCH_FLAGS((flags &  VK_QUEUE_COMPUTE_BIT) && (i !=
> graph_index),
> > +                     comp_index)
> > +
> > +    if (!hwctx->queue_family_tx_index)
> > +        SEARCH_FLAGS((flags & VK_QUEUE_TRANSFER_BIT) && (i !=
> graph_index) &&
> > +                     (i != comp_index), tx_index)
> > +
> > +#undef SEARCH_FLAGS
> > +#define QF_FLAGS(flags)
>         \
> > +    ((flags) & VK_QUEUE_GRAPHICS_BIT      ) ? "(graphics) " : "",
>         \
> > +    ((flags) & VK_QUEUE_COMPUTE_BIT       ) ? "(compute) "  : "",
>         \
> > +    ((flags) & VK_QUEUE_TRANSFER_BIT      ) ? "(transfer) " : "",
>         \
> > +    ((flags) & VK_QUEUE_SPARSE_BINDING_BIT) ? "(sparse) "   : ""
> > +
> > +    av_log(ctx, AV_LOG_VERBOSE, "Using queue family %i for graphics, "
> > +           "flags: %s%s%s%s\n", graph_index,
> QF_FLAGS(qs[graph_index].queueFlags));
> > +
> > +    hwctx->queue_family_index      = graph_index;
> > +    hwctx->queue_family_tx_index   = graph_index;
> > +    hwctx->queue_family_comp_index = graph_index;
> > +
> > +    pc[cd->queueCreateInfoCount++].queueFamilyIndex = graph_index;
> > +
> > +    if (comp_index != -1) {
> > +        av_log(ctx, AV_LOG_VERBOSE, "Using queue family %i for compute,
> "
> > +               "flags: %s%s%s%s\n", comp_index,
> QF_FLAGS(qs[comp_index].queueFlags));
> > +        hwctx->queue_family_tx_index                    = comp_index;
> > +        hwctx->queue_family_comp_index                  = comp_index;
> > +        pc[cd->queueCreateInfoCount++].queueFamilyIndex = comp_index;
> > +    }
> > +
> > +    if (tx_index != -1) {
> > +        av_log(ctx, AV_LOG_VERBOSE, "Using queue family %i for
> transfers, "
> > +               "flags: %s%s%s%s\n", tx_index,
> QF_FLAGS(qs[tx_index].queueFlags));
> > +        hwctx->queue_family_tx_index                    = tx_index;
> > +        pc[cd->queueCreateInfoCount++].queueFamilyIndex = tx_index;
> > +    }
> > +
> > +#undef PRINT_QF_FLAGS
>
> "QF_FLAGS".
>

Fixed.



> > +
> > +    av_free(qs);
> > +
> > +    return 0;
> > +}
> > +
> > +static int create_exec_ctx(AVHWDeviceContext *ctx)
> > +{
> > +    VkResult ret;
> > +    AVVulkanDeviceContext *hwctx = ctx->hwctx;
> > +    VulkanDevicePriv *p = ctx->internal->priv;
> > +
> > +    VkCommandPoolCreateInfo cqueue_create = {
> > +        .sType              = VK_STRUCTURE_TYPE_COMMAND_POOL
> _CREATE_INFO,
> > +        .flags              = VK_COMMAND_POOL_CREATE_RESET_C
> OMMAND_BUFFER_BIT,
> > +        .queueFamilyIndex   = hwctx->queue_family_tx_index,
> > +    };
> > +    VkCommandBufferAllocateInfo cbuf_create = {
> > +        .sType              = VK_STRUCTURE_TYPE_COMMAND_BUFF
> ER_ALLOCATE_INFO,
> > +        .level              = VK_COMMAND_BUFFER_LEVEL_PRIMARY,
> > +        .commandBufferCount = 1,
> > +    };
> > +    VkFenceCreateInfo fence_spawn = { VK_STRUCTURE_TYPE_FENCE_CREATE_INFO
> };
> > +
> > +    ret = vkCreateCommandPool(hwctx->act_dev, &cqueue_create,
> > +                              hwctx->alloc, &p->cmd_pool);
> > +    if (ret != VK_SUCCESS) {
> > +        av_log(ctx, AV_LOG_ERROR, "Command pool creation failure: %s\n",
> > +               vk_ret2str(ret));
> > +        return 1;
>
> These failures are going to return a nonnegative number from
> device_init(); I don't think that's wanted.
>

Fixed, replaced with AVERROR_EXTERNAL.



> > +    }
> > +
> > +    cbuf_create.commandPool = p->cmd_pool;
> > +
> > +    ret = vkAllocateCommandBuffers(hwctx->act_dev, &cbuf_create,
> &p->cmd_buf);
> > +    if (ret != VK_SUCCESS) {
> > +        av_log(ctx, AV_LOG_ERROR, "Command buffer alloc failure: %s\n",
> > +               vk_ret2str(ret));
> > +        return 1;
> > +    }
> > +
> > +    ret = vkCreateFence(hwctx->act_dev, &fence_spawn,
> > +                        hwctx->alloc, &p->cmd_fence);
> > +    if (ret != VK_SUCCESS) {
> > +        av_log(ctx, AV_LOG_ERROR, "Failed to create frame fence: %s\n",
> > +               vk_ret2str(ret));
> > +        return 1;
> > +    }
> > +
> > +    vkGetDeviceQueue(hwctx->act_dev, hwctx->queue_family_tx_index, 0,
> > +                     &p->cmd_queue);
> > +
> > +    return 0;
> > +}
> > +
> > +static void free_exec_ctx(AVHWDeviceContext *ctx)
> > +{
> > +    AVVulkanDeviceContext *hwctx = ctx->hwctx;
> > +    VulkanDevicePriv *p = ctx->internal->priv;
> > +
> > +    if (!p)
>
> This can never be true - priv is set before the free function.
>

Fixed.



> > +        return;
> > +
> > +    if (p->cmd_fence != VK_NULL_HANDLE)
>
> Since we're depending on VK_NULL_HANDLE being zero for correct
> initialisation, maybe just treat these as pointers and write "if
> (p->cmd_fence)", etc.
>

Done (and in other places in this file).



> > +        vkDestroyFence(hwctx->act_dev, p->cmd_fence, hwctx->alloc);
> > +    if (p->cmd_buf != VK_NULL_HANDLE)
> > +        vkFreeCommandBuffers(hwctx->act_dev, p->cmd_pool, 1,
> &p->cmd_buf);
> > +    if (p->cmd_pool != VK_NULL_HANDLE)
> > +        vkDestroyCommandPool(hwctx->act_dev, p->cmd_pool,
> hwctx->alloc);
> > +}
> > +
> > +static void vulkan_device_free(AVHWDeviceContext *ctx)
> > +{
> > +    VulkanDevicePriv *p = ctx->internal->priv;
> > +    AVVulkanDeviceContext *hwctx = ctx->hwctx;
> > +
> > +    free_exec_ctx(ctx);
> > +
> > +    vkDestroyDevice(hwctx->act_dev, hwctx->alloc);
> > +
> > +    if (p && p->debug_ctx != VK_NULL_HANDLE) {
> > +        VK_LOAD_PFN(hwctx->inst, vkDestroyDebugUtilsMessengerEXT);
> > +        pfn_vkDestroyDebugUtilsMessengerEXT(hwctx->inst, p->debug_ctx,
> > +                                            hwctx->alloc);
> > +    }
> > +
> > +    vkDestroyInstance(hwctx->inst, hwctx->alloc);
> > +}
> > +
> > +static int vulkan_device_create_internal(AVHWDeviceContext *ctx,
> > +                                         VulkanDeviceSelection
> *dev_select,
> > +                                         AVDictionary *opts, int flags)
> > +{
> > +    int err = 0;
> > +    VkResult ret;
> > +    AVDictionaryEntry *opt_d;
> > +    AVVulkanDeviceContext *hwctx = ctx->hwctx;
> > +    VkDeviceQueueCreateInfo queue_create_info[3] = {
> > +        {   .sType            = VK_STRUCTURE_TYPE_DEVICE_QUEUE
> _CREATE_INFO,
> > +            .pQueuePriorities = (float []){ 1.0f },
> > +            .queueCount       = 1, },
> > +        {   .sType            = VK_STRUCTURE_TYPE_DEVICE_QUEUE
> _CREATE_INFO,
> > +            .pQueuePriorities = (float []){ 1.0f },
> > +            .queueCount       = 1, },
> > +        {   .sType            = VK_STRUCTURE_TYPE_DEVICE_QUEUE
> _CREATE_INFO,
> > +            .pQueuePriorities = (float []){ 1.0f },
> > +            .queueCount       = 1, },
> > +    };
> > +
> > +    VkDeviceCreateInfo dev_info = {
> > +        .sType                = VK_STRUCTURE_TYPE_DEVICE_CREATE_INFO,
> > +        .pQueueCreateInfos    = queue_create_info,
> > +        .queueCreateInfoCount = 0,
> > +    };
> > +
> > +    VulkanDevicePriv *p = av_mallocz(sizeof(*p));
> > +    if (!p) {
> > +        err = AVERROR(ENOMEM);
> > +        goto fail;
> > +    }
>
> This is already allocated inside av_hwdevice_ctx_alloc(), you've
> overwriting it here and leaking the original.
>

Fixed.



> > +
> > +    ctx->internal->priv = p;
> > +    ctx->free           = vulkan_device_free;
> > +
> > +    /* Create an instance if not given one */
> > +    if (!hwctx->inst && (err = create_instance(ctx, opts)))
> > +        goto fail;
> > +
> > +    /* Find a device (if not given one) */
> > +    if (!hwctx->phys_dev && (err = find_device(ctx, dev_select)))
> > +        goto fail;
> > +
> > +    vkGetPhysicalDeviceProperties(hwctx->phys_dev, &p->props);
> > +    av_log(ctx, AV_LOG_VERBOSE, "Using device: %s\n",
> p->props.deviceName);
> > +    av_log(ctx, AV_LOG_VERBOSE, "Alignments:\n");
> > +    av_log(ctx, AV_LOG_VERBOSE, "    optimalBufferCopyOffsetAlignment:
>  %li\n",
> > +           p->props.limits.optimalBufferCopyOffsetAlignment);
> > +    av_log(ctx, AV_LOG_VERBOSE, "    optimalBufferCopyRowPitchAlignment:
> %li\n",
> > +           p->props.limits.optimalBufferCopyRowPitchAlignment);
> > +    av_log(ctx, AV_LOG_VERBOSE, "    minMemoryMapAlignment:
>   %li\n",
> > +           p->props.limits.minMemoryMapAlignment);
> > +
> > +    /* Search queue family */
> > +    if ((err = search_queue_families(ctx, &dev_info)))
> > +        goto fail;
> > +
> > +    if (!hwctx->act_dev) {
> > +        err = check_extensions(ctx, 1, &dev_info.ppEnabledExtensionNa
> mes,
> > +                               &dev_info.enabledExtensionCount, 0);
> > +        if (err)
> > +            goto fail;
> > +
> > +        ret = vkCreateDevice(hwctx->phys_dev, &dev_info,
> > +                             hwctx->alloc, &hwctx->act_dev);
> > +        if (ret != VK_SUCCESS) {
> > +            av_log(ctx, AV_LOG_ERROR, "Device creation failure: %s\n",
> > +                   vk_ret2str(ret));
> > +            err = AVERROR_EXTERNAL;
> > +            goto fail;
> > +        }
> > +
> > +        av_free((void *)dev_info.ppEnabledExtensionNames);
> > +    }
> > +
> > +    /* Tiled images setting, use them by default */
> > +    opt_d = av_dict_get(opts, "linear_images", NULL, 0);
> > +    if (opt_d)
> > +        p->use_linear_images = strtol(opt_d->value, NULL, 10);
> > +
> > +    /* Disjoint images setting, don't use them by default */
> > +    opt_d = av_dict_get(opts, "disjoint_images", NULL, 0);
> > +    if (opt_d)
> > +        p->use_disjoint_images = strtol(opt_d->value, NULL, 10);
> > +
> > +    return 0;
> > +
> > +fail:
> > +    av_freep(&ctx->internal->priv);
>
> I don't think you want to free this here, it's managed by the hwcontext
> layer.
>

Fixed.



> > +    return err;
> > +}
> > +
> > +static int vulkan_device_init(AVHWDeviceContext *ctx)
> > +{
> > +    int err;
> > +    AVVulkanDeviceContext *hwctx = ctx->hwctx;
> > +    VulkanDevicePriv *p = ctx->internal->priv;
> > +
> > +    /* Create exec context - if there's something invalid this will
> error out */
> > +    err = create_exec_ctx(ctx);
> > +    if (err)
> > +        return err;
> > +
> > +    /* Get device capabilities */
> > +    vkGetPhysicalDeviceMemoryProperties(hwctx->phys_dev, &p->mprops);
> > +
> > +    return 0;
> > +}
> > +
> > +static int vulkan_device_create(AVHWDeviceContext *ctx, const char
> *device,
> > +                                AVDictionary *opts, int flags)
> > +{
> > +    VulkanDeviceSelection dev_select = { 0 };
> > +    if (device && device[0]) {
> > +        if (av_isdigit(device[0]))
>
> 3dfx probably wouldn't appreciate this test, though I admit they are
> unlikely to add Vulkan support to their cards.
>

> > +            dev_select.index = strtol(device, NULL, 10);
>
> Might be nicer to always call strtol and then check whether *end is zero,
> just in case of a future problem like that.
>

Done, replaced with:
        char *end = NULL;
        dev_select.index = strtol(device, &end, 10);
        if (end == device) {
            dev_select.index = 0;
            dev_select.name  = device;
        }


>
> > +        else
> > +            dev_select.name = device;
> > +    }
> > +
> > +    return vulkan_device_create_internal(ctx, &dev_select, opts,
> flags);
> > +}
> > +
> > +static int vulkan_device_derive(AVHWDeviceContext *ctx,
> > +                                AVHWDeviceContext *src_ctx, int flags)
> > +{
> > +    VulkanDeviceSelection dev_select = { 0 };
> > +
> > +    switch(src_ctx->type) {
> > +#if CONFIG_LIBDRM
> > +#if CONFIG_VAAPI
> > +    case AV_HWDEVICE_TYPE_VAAPI: {
> > +        AVVAAPIDeviceContext *src_hwctx = src_ctx->hwctx;
> > +        const char *vendor = vaQueryVendorString(src_hwctx->display);
> > +        if (!vendor) {
> > +            av_log(ctx, AV_LOG_ERROR, "Unable to get device info from
> vaapi!\n");
> > +            return AVERROR_EXTERNAL;
> > +        }
> > +
> > +        if (strstr(vendor, "Intel"))
> > +            dev_select.vendor_id = 0x8086;
> > +        if (strstr(vendor, "AMD"))
> > +            dev_select.vendor_id = 0x1002;
> > +
> > +        return vulkan_device_create_internal(ctx, &dev_select, NULL,
> flags);
>
> Did you think about making an addition to VAAPI which could do this in a
> more sensible way?
>

No, not yet. If anything better comes up we can replace it.



> > +    }
> > +#endif
> > +    case AV_HWDEVICE_TYPE_DRM: {
> > +        AVDRMDeviceContext *src_hwctx = src_ctx->hwctx;
> > +
> > +        drmDevice *drm_dev_info;
> > +        int err = drmGetDevice(src_hwctx->fd, &drm_dev_info);
> > +        if (err) {
> > +            av_log(ctx, AV_LOG_ERROR, "Unable to get device info from
> drm fd!\n");
> > +            return AVERROR_EXTERNAL;
> > +        }
> > +
> > +        dev_select.pci_device = drm_dev_info->deviceinfo.pci->
> device_id;
>
> Not all devices are PCI, check bustype before using this.  I don't know
> what information you can use in other cases, though (GPUs on mobile will
> just be opaque platform devices).
>

Done.



> The drmDevice structure needs to be freed, too (drmFreeDevice).
>

Fixed.



> > +
> > +        return vulkan_device_create_internal(ctx, &dev_select, NULL,
> flags);
> > +    }
> > +#endif
> > +    default:
> > +        return AVERROR(ENOSYS);
> > +    }
> > +}
> > +
> > +static int vulkan_frames_get_constraints(AVHWDeviceContext *ctx,
> > +                                         const void *hwconfig,
> > +                                         AVHWFramesConstraints
> *constraints)
> > +{
> > +    int count = 0;
> > +    AVVulkanDeviceContext *hwctx = ctx->hwctx;
> > +    VulkanDevicePriv *p = ctx->internal->priv;
> > +
> > +    for (enum AVPixelFormat i = 0; i < AV_PIX_FMT_NB; i++)
>
> This iteration feels dubious, maybe it would be better to use
> av_pix_fmt_desc_next()?
>

Its fine, this is lavu, there's no way for AV_PIX_FMT_NB to be different.



> > +        count += vkfmt_is_supported(hwctx, i, p->use_linear_images);
> > +
> > +    constraints->valid_sw_formats = av_malloc_array(count + 1,
> > +                                                    sizeof(enum
> AVPixelFormat));
> > +    if (!constraints->valid_sw_formats)
> > +        return AVERROR(ENOMEM);
> > +
> > +    count = 0;
> > +    for (enum AVPixelFormat i = 0; i < AV_PIX_FMT_NB; i++)
> > +        if (vkfmt_is_supported(hwctx, i, p->use_linear_images))
> > +            constraints->valid_sw_formats[count++] = i;
> > +    constraints->valid_sw_formats[count++] = AV_PIX_FMT_NONE;
> > +
> > +    constraints->min_width  = 0;
> > +    constraints->min_height = 0;
>
> Not directly related to this, but I was trying because of it: I note that
> ANV returns:
>
> [AVHWDeviceContext @ 0x555558792a40] Image creation failure:
> VK_ERROR_OUT_OF_DEVICE_MEMORY
>
> with a 1x1 YUV420P image, which is a pretty opaque failure (works for 2x2
> or 3x3).  What are the requirements there?  Who should be checking it?
>

There are no minimum dimension requirements exposed by the API. My guess is
the implementation is rejecting it for some reason.



> > +    constraints->max_width  = p->props.limits.maxImageDimension2D;
> > +    constraints->max_height = p->props.limits.maxImageDimension2D;
> > +
> > +    constraints->valid_hw_formats = av_malloc_array(2, sizeof(enum
> AVPixelFormat));
> > +    if (!constraints->valid_hw_formats)
> > +        return AVERROR(ENOMEM);
> > +
> > +    constraints->valid_hw_formats[0] = AV_PIX_FMT_VULKAN;
> > +    constraints->valid_hw_formats[1] = AV_PIX_FMT_NONE;
> > +
> > +    return 0;
> > +}
> > +
> > ...
> > +
> > +static int alloc_bind_mem(AVHWDeviceContext *ctx, AVVkFrame *f,
> > +                          void *alloc_pnext, size_t alloc_pnext_stride)
> > +{
> > +    int err;
> > +    VkResult ret;
> > +    VkBindImageMemoryInfo bind_info[AV_NUM_DATA_POINTERS] = { { 0 } };
> > +    VkBindImagePlaneMemoryInfo bind_p_info[AV_NUM_DATA_POINTERS] = { {
> 0 } };
> > +
> > +    AVVulkanDeviceContext *hwctx = ctx->hwctx;
> > +    VulkanDevicePriv *p = ctx->internal->priv;
> > +
> > +    VK_LOAD_PFN(hwctx->inst, vkBindImageMemory2KHR);
> > +    VK_LOAD_PFN(hwctx->inst, vkGetImageMemoryRequirements2KHR);
>
> The presence of the relevant extension presumably means that these
> necessarily succeed?
>

Yep.



>
> > +
> > +    for (int i = 0; i < f->mem_count; i++) {
> > +        int use_ded_mem;
> > +        VkImagePlaneMemoryRequirementsInfo plane_req = {
> > +            .sType       = VK_STRUCTURE_TYPE_IMAGE_PLANE_
> MEMORY_REQUIREMENTS_INFO,
> > +            .planeAspect = i == 0 ? VK_IMAGE_ASPECT_PLANE_0_BIT :
> > +                           i == 1 ? VK_IMAGE_ASPECT_PLANE_1_BIT :
> > +                                    VK_IMAGE_ASPECT_PLANE_2_BIT,
> > +        };
> > +        VkImageMemoryRequirementsInfo2 req_desc = {
> > +            .sType = VK_STRUCTURE_TYPE_IMAGE_MEMORY
> _REQUIREMENTS_INFO_2,
> > +            .pNext = f->mem_count > 1 ? &plane_req : NULL,
> > +            .image = f->img,
> > +        };
> > +        VkMemoryDedicatedAllocateInfo ded_alloc = {
> > +            .sType = VK_STRUCTURE_TYPE_MEMORY_DEDICATED_ALLOCATE_INFO,
> > +            .pNext = (void *)(((uint8_t *)alloc_pnext) +
> i*alloc_pnext_stride),
> > +        };
> > +        VkMemoryDedicatedRequirements ded_req = {
> > +            .sType = VK_STRUCTURE_TYPE_MEMORY_DEDICATED_REQUIREMENTS,
> > +        };
> > +        VkMemoryRequirements2 req = {
> > +            .sType = VK_STRUCTURE_TYPE_MEMORY_REQUIREMENTS_2,
> > +            .pNext = (p->extensions & EXT_DEDICATED_ALLOC) ? &ded_req :
> NULL,
> > +        };
> > +
> > +        pfn_vkGetImageMemoryRequirements2KHR(hwctx->act_dev,
> &req_desc, &req);
> > +
> > +        /* In case the implementation prefers/requires dedicated
> allocation */
> > +        use_ded_mem = ded_req.prefersDedicatedAllocation |
> > +                      ded_req.requiresDedicatedAllocation;
> > +        if (use_ded_mem)
> > +            ded_alloc.image = f->img;
> > +
> > +        /* Allocate memory */
> > +        if ((err = alloc_mem(ctx, &req.memoryRequirements,
> > +                             f->tiling == VK_IMAGE_TILING_LINEAR ?
> > +                             VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT :
> > +                             VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT,
> > +                             use_ded_mem ? &ded_alloc : (void
> *)ded_alloc.pNext,
> > +                             &f->flags, &f->mem[i])))
> > +            return err;
> > +
> > +        if (f->mem_count > 1) {
> > +            bind_p_info[i].sType = VK_STRUCTURE_TYPE_BIND_IMAGE_P
> LANE_MEMORY_INFO;
> > +            bind_p_info[i].planeAspect = plane_req.planeAspect;
> > +            bind_info[i].pNext = &bind_p_info[i];
> > +        }
> > +
> > +        bind_info[i].sType  = VK_STRUCTURE_TYPE_BIND_IMAGE_MEMORY_INFO;
> > +        bind_info[i].image  = f->img;
> > +        bind_info[i].memory = f->mem[i];
> > +    }
> > +
> > +    /* Bind the allocated memory to the image */
> > +    ret = pfn_vkBindImageMemory2KHR(hwctx->act_dev, f->mem_count,
> bind_info);
> > +    if (ret != VK_SUCCESS) {
> > +        av_log(ctx, AV_LOG_ERROR, "Failed to bind memory: %s\n",
> > +               vk_ret2str(ret));
> > +        return AVERROR_EXTERNAL;
> > +    }
>
> This looks like it can leak some allocated memory if something goes wrong
> during bind or when allocating a plane after the first.  Or is there some
> magic which means it doesn't?
>
>
There's some magic which means it doesn't:

av_log(ctx, AV_LOG_ERROR, "Failed to bind memory: %s\n",
               vk_ret2str(ret));
        return AVERROR_EXTERNAL;

which goes to

 if ((err = alloc_bind_mem(ctx, f, alloc_pnext, alloc_pnext_stride)))
        goto fail;

which goes to
fail:
    vulkan_frame_free(hwctx, (uint8_t *)f);
    return err;



> > +
> > +    return 0;
> > +}
> > +
> > ...
> > +
> > +static AVBufferRef *vulkan_pool_alloc(void *opaque, int size)
> > +{
> > +    int err;
> > +    AVVkFrame *f;
> > +    AVBufferRef *avbuf = NULL;
> > +    AVHWFramesContext *hwfc = opaque;
> > +    AVVulkanFramesContext *hwctx = hwfc->hwctx;
> > +    VkExportMemoryAllocateInfo einfo[AV_NUM_DATA_POINTERS];
> > +    VkExternalMemoryHandleTypeFlags e = 0x0;
> > +
> > +    try_export_flags(hwfc, &e, VK_EXTERNAL_MEMORY_HANDLE_TYPE
> _DMA_BUF_BIT_EXT);
>
> The intent of this is to allocate memory which is dma_buf inside the
> kernel and can therefore be exported as DRM objects?
>
> Have you tried making a map_from which uses that?  (That would allow hwmap
> Vulkan->VAAPI without reverse mapping, I guess.)
>

Yes, it works. I'm not sure why though. The spec says ownership is
transferred and that destroying the image and memory would do nothing,
you'd need to close the exported FDs. And I think that means closing the
FDs without destroying the image and memory would also make the memory and
image invalid, but that's not the case. It works fine, I'm not leaking any
file descriptors. I think the spec just omits to tell that if you close the
FD while the image is still alive it won't destroy the image.
Anyway, its in the new patch, along with VAAPI support.
Tested with:

./ffmpeg_g -init_hw_device "vaapi=vp:/dev/dri/renderD129" -i ~/dumper.mkv
-loglevel verbose -filter_hw_device vp -vf
hwupload,hwmap=derive_device=vulkan,chromaticaberration_vulkan,hwmap=reverse=1:derive_device=vaapi,format=vaapi
-c:v hevc_vaapi -f null -2



> > +
> > +    for (int i = 0; i < av_pix_fmt_count_planes(hwfc->sw_format); i++)
> {
> > +        einfo[i].sType       = VK_STRUCTURE_TYPE_EXPORT_MEMOR
> Y_ALLOCATE_INFO;
> > +        einfo[i].pNext       = hwctx->alloc_pnext[i];
> > +        einfo[i].handleTypes = e;
> > +    }
> > +
> > +    err = create_frame(hwfc, &f, hwctx->tiling, hwctx->usage,
> > +                       hwctx->disjoint, hwctx->create_pnext,
> > +                       einfo, sizeof(*einfo));
> > +    if (err)
> > +        return NULL;
> > +
> > +    avbuf = av_buffer_create((uint8_t *)f, sizeof(AVVkFrame),
> > +                             vulkan_frame_free,
> hwfc->device_ctx->hwctx, 0);
> > +    if (!avbuf) {
> > +        vulkan_frame_free(hwfc->device_ctx->hwctx, (uint8_t *)f);
> > +        return NULL;
> > +    }
> > +
> > +    return avbuf;
> > +}
> > +
> > +static int vulkan_frames_init(AVHWFramesContext *hwfc)
> > +{
> > +    AVVulkanFramesContext *hwctx = hwfc->hwctx;
> > +    VulkanDevicePriv *p = hwfc->device_ctx->internal->priv;
> > +
> > +    if (hwfc->pool)
> > +        return 0;
> > +
> > +    /* Default pool flags */
> > +    hwctx->tiling = hwctx->tiling ? hwctx->tiling :
> p->use_linear_images ?
> > +                    VK_IMAGE_TILING_LINEAR : VK_IMAGE_TILING_OPTIMAL;
> > +
> > +    hwctx->usage |= DEFAULT_USAGE_FLAGS;
> > +
> > +    hwctx->disjoint = hwctx->disjoint ? hwctx->disjoint :
> p->use_disjoint_images;
> > +
> > +    hwfc->internal->pool_internal = av_buffer_pool_init2(sizeof(AV
> VkFrame),
> > +                                                         hwfc,
> vulkan_pool_alloc,
> > +                                                         NULL);
> > +    if (!hwfc->internal->pool_internal)
> > +        return AVERROR(ENOMEM);
>
> This doesn't actually check anything about the parameters - e.g. I can
> make frames context with a crazy unsupported sw_format and it will return
> success.
>
> Is it sensible to, say, test-allocate a single frame to make sure it
> actually works?
>

Done.



> > +
> > +    return 0;
> > +}
> > +
> > +static int vulkan_get_buffer(AVHWFramesContext *hwfc, AVFrame *frame)
> > +{
> > +    frame->buf[0] = av_buffer_pool_get(hwfc->pool);
> > +    if (!frame->buf[0])
> > +        return AVERROR(ENOMEM);
> > +
> > +    frame->data[0] = frame->buf[0]->data;
> > +    frame->format  = AV_PIX_FMT_VULKAN;
> > +    frame->width   = hwfc->width;
> > +    frame->height  = hwfc->height;
> > +
> > +    return 0;
> > +}
> > +
> > +static int vulkan_transfer_get_formats(AVHWFramesContext *hwfc,
> > +                                       enum AVHWFrameTransferDirection
> dir,
> > +                                       enum AVPixelFormat **formats)
> > +{
> > +    int count = 0;
> > +    enum AVPixelFormat *pix_fmts = NULL;
> > +    const AVPixFmtDescriptor *desc = av_pix_fmt_desc_get(hwfc->sw_f
> ormat);
> > +
> > +    /* All formats can be transferred to themselves */
> > +    count++;
> > +
> > +    /* All formats with a luma can have only that channel transferred */
> > +    count += !(desc->flags & AV_PIX_FMT_FLAG_RGB);
>
> In what cases is this acutally expected to work?
>

Works here.

./ffmpeg_g -init_hw_device "vulkan=vk:Intel" -i ~/dumper.mkv -loglevel
verbose -filter_hw_device vk -vf
format=yuv420p,hwupload,hwdownload,format=gray8
-f null -

Latest mesa git master.



> $ gdb --args ./ffmpeg_g -v 55 -y -i in.mp4 -an -init_hw_device
> vulkan=amd:0 -init_hw_device vulkan=intel:1 -filter_hw_device intel -vf
> 'format=yuv420p,hwupload,hwdownload,format=gray8' -c:v libx264 -frames:v
> 1 out.mp4
> ...
> [hwupload @ 0x5555597bd1c0] Surface format is yuv420p.
> [swscaler @ 0x5555597c2380] deprecated pixel format used, make sure you
> did set range correctly
> [auto_scaler_0 @ 0x5555597c1480] w:1920 h:1080 fmt:gray sar:1/1 -> w:1920
> h:1080 fmt:yuvj444p sar:1/1 flags:0x4
> ffmpeg_g: ../../../src/intel/vulkan/anv_image.c:806:
> anv_layout_to_aux_usage: Assertion `_mesa_bitcount(aspect) == 1 && (aspect
> & image->aspects)' failed.
>
> Thread 1 "ffmpeg_g" received signal SIGABRT, Aborted.
> __GI_raise (sig=sig at entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
> 51      ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
> (gdb) bt
> #0  __GI_raise (sig=sig at entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
> #1  0x00007ffff3127231 in __GI_abort () at abort.c:79
> #2  0x00007ffff311e9da in __assert_fail_base (fmt=0x7ffff3271d48
> "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion at entry
> =0x7ffff21c72f0 "_mesa_bitcount(aspect) == 1 && (aspect &
> image->aspects)", file=file at entry=0x7ffff21c6d38
> "../../../src/intel/vulkan/anv_image.c", line=line at entry=806,
>     function=function at entry=0x7ffff21c7e60 <__PRETTY_FUNCTION__.66594>
> "anv_layout_to_aux_usage") at assert.c:92
> #3  0x00007ffff311ea52 in __GI___assert_fail (assertion=assertion at entry
> =0x7ffff21c72f0 "_mesa_bitcount(aspect) == 1 && (aspect &
> image->aspects)", file=file at entry=0x7ffff21c6d38
> "../../../src/intel/vulkan/anv_image.c", line=line at entry=806,
>     function=function at entry=0x7ffff21c7e60 <__PRETTY_FUNCTION__.66594>
> "anv_layout_to_aux_usage") at assert.c:101
> #4  0x00007ffff1de9ba8 in anv_layout_to_aux_usage (devinfo=devinfo at entry
> =0x555558b6a8f8, image=image at entry=0x5555597db110, aspect=aspect at entry
> =VK_IMAGE_ASPECT_COLOR_BIT, layout=layout at entry=VK_IMAGE_LAYOUT_TRANSFER_SRC_OPTIMAL)
> at ../../../src/intel/vulkan/anv_image.c:806
> #5  0x00007ffff1dd9ed7 in get_blorp_surf_for_anv_image
> (device=0x555558b6a8b0, image=image at entry=0x5555597db110,
> aspect=aspect at entry=1, layout=layout at entry=VK_IMAGE_LAYOUT_TRANSFER_SRC_OPTIMAL,
> aux_usage=aux_usage at entry=ISL_AUX_USAGE_NONE, blorp_surf=blorp_surf at entry
> =0x7fffffffce60)
>     at ../../../src/intel/vulkan/anv_blorp.c:204
> #6  0x00007ffff1dda209 in copy_buffer_to_image (cmd_buffer=0x555558aaba70,
> anv_buffer=0x555558670dd0, anv_image=0x5555597db110,
> image_layout=VK_IMAGE_LAYOUT_TRANSFER_SRC_OPTIMAL, regionCount=<optimized
> out>, pRegions=<optimized out>, buffer_to_image=false) at
> ../../../src/intel/vulkan/anv_blorp.c:382
> #7  0x00007ffff1ddab56 in anv_CmdCopyImageToBuffer
> (commandBuffer=<optimized out>, srcImage=<optimized out>,
> srcImageLayout=<optimized out>, dstBuffer=<optimized out>,
> regionCount=<optimized out>, pRegions=<optimized out>) at
> ../../../src/intel/vulkan/anv_blorp.c:475
> #8  0x00007ffff667887d in vkCmdCopyImageToBuffer
> (commandBuffer=0x555558aaba70, srcImage=0x5555597db110,
> srcImageLayout=VK_IMAGE_LAYOUT_TRANSFER_SRC_OPTIMAL,
> dstBuffer=0x555558670dd0, regionCount=1, pRegions=0x7fffffffd060) at
> /home/mrt/video/vulkan/loader/loader/trampoline.c:1923
> #9  0x0000555556a0281d in transfer_image_buf (ctx=0x555558792a40,
> frame=0x5555597c0a00, buffer=0x7fffffffd1f0, stride=0x5555597e4780, w=1920,
> h=1080, pix_fmt=AV_PIX_FMT_GRAY8, to_buf=1) at
> src/libavutil/hwcontext_vulkan.c:1810
> #10 0x0000555556a03158 in vulkan_transfer_data_from (hwfc=0x5555597c0640,
> dst=0x5555597e4740, src=0x5555597bf8c0) at src/libavutil/hwcontext_vulkan
> .c:1963
> #11 0x00005555569f28d5 in av_hwframe_transfer_data (dst=0x5555597e4740,
> src=0x5555597bf8c0, flags=0) at src/libavutil/hwcontext.c:454
> #12 0x000055555579819c in hwdownload_filter_frame (link=0x5555597bdd40,
> input=0x5555597bf8c0) at src/libavfilter/vf_hwdownload.c:153
> #13 0x00005555556af6c9 in ff_filter_frame_framed (link=0x5555597bdd40,
> frame=0x5555597bf8c0) at src/libavfilter/avfilter.c:1071
> #14 0x00005555556aff52 in ff_filter_frame_to_filter (link=0x5555597bdd40)
> at src/libavfilter/avfilter.c:1219
> #15 0x00005555556b014e in ff_filter_activate_default
> (filter=0x5555597bbd00) at src/libavfilter/avfilter.c:1268
> #16 0x00005555556b0372 in ff_filter_activate (filter=0x5555597bbd00) at
> src/libavfilter/avfilter.c:1429
> #17 0x00005555556b5036 in ff_filter_graph_run_once (graph=0x5555597bd900)
> at src/libavfilter/avfiltergraph.c:1454
> #18 0x00005555556b6466 in push_frame (graph=0x5555597bd900) at
> src/libavfilter/buffersrc.c:181
> #19 0x00005555556b6778 in av_buffersrc_add_frame_internal
> (ctx=0x5555597be100, frame=0x555558e10300, flags=4) at
> src/libavfilter/buffersrc.c:255
> #20 0x00005555556b63ed in av_buffersrc_add_frame_flags
> (ctx=0x5555597be100, frame=0x555558e10300, flags=4) at
> src/libavfilter/buffersrc.c:164
> #21 0x0000555555679212 in ifilter_send_frame (ifilter=0x555558b9da00,
> frame=0x555558e10300) at src/fftools/ffmpeg.c:2190
> #22 0x00005555556794f2 in send_frame_to_filters (ist=0x555558d944c0,
> decoded_frame=0x555558e10300) at src/fftools/ffmpeg.c:2264
> #23 0x000055555567a2ac in decode_video (ist=0x555558d944c0,
> pkt=0x7fffffffd820, got_output=0x7fffffffd814, duration_pts=0x7fffffffd818,
> eof=0, decode_failed=0x7fffffffd810) at src/fftools/ffmpeg.c:2465
> #24 0x000055555567ac47 in process_input_packet (ist=0x555558d944c0,
> pkt=0x7fffffffd9e0, no_eof=0) at src/fftools/ffmpeg.c:2619
> #25 0x0000555555681bb5 in process_input (file_index=0) at
> src/fftools/ffmpeg.c:4457
> #26 0x00005555556820c4 in transcode_step () at src/fftools/ffmpeg.c:4577
> #27 0x00005555556821f1 in transcode () at src/fftools/ffmpeg.c:4631
> #28 0x0000555555682a81 in main (argc=20, argv=0x7fffffffe3e8) at
> src/fftools/ffmpeg.c:4838
>
> > +
> > +    pix_fmts = av_malloc((count + 1) * sizeof(*pix_fmts));
> > +    if (!pix_fmts)
> > +        return AVERROR(ENOMEM);
> > +
> > +    count = 0;
> > +    pix_fmts[count++] = hwfc->sw_format;
> > +    if (!(desc->flags & AV_PIX_FMT_FLAG_RGB)) {
> > +        switch (desc->comp[0].depth) {
> > +        case  8: pix_fmts[count++] =  AV_PIX_FMT_GRAY8; break;
> > +        case 10: pix_fmts[count++] = AV_PIX_FMT_GRAY10; break;
> > +        case 12: pix_fmts[count++] = AV_PIX_FMT_GRAY12; break;
> > +        case 16: pix_fmts[count++] = AV_PIX_FMT_GRAY16; break;
> > +        }
>
> Tbh I'm not convinced that offering the luma-only option as well is going
> to cause anything other than confusion.  Do you have any use-cases in mind
> for it?
>

Not that I can think of, but may be useful to someone. I'll leave it in -
its a feature after all and we can expose it.



> > +    }
> > +    pix_fmts[count++] = AV_PIX_FMT_NONE;
> > +
> > +    *formats = pix_fmts;
> > +
> > +    return 0;
> > +}
> > +
> > +typedef struct VulkanMapping {
> > +    AVVkFrame *frame;
> > +    int flags;
> > +} VulkanMapping;
> > +
> > +static void vulkan_unmap_frame(AVHWFramesContext *hwfc,
> HWMapDescriptor *hwmap)
> > +{
> > +    VulkanMapping *map = hwmap->priv;
> > +    AVVulkanDeviceContext *hwctx = hwfc->device_ctx->hwctx;
> > +
> > +    /* Check if buffer needs flushing */
> > +    if ((map->flags & AV_HWFRAME_MAP_WRITE) &&
> > +        !(map->frame->flags & VK_MEMORY_PROPERTY_HOST_COHERENT_BIT)) {
> > +        VkResult ret;
> > +        VkMappedMemoryRange flush_ranges[AV_NUM_DATA_POINTERS] = { { 0
> } };
> > +
> > +        for (int i = 0; i < map->frame->mem_count; i++) {
> > +            flush_ranges[i].sType  = VK_STRUCTURE_TYPE_MAPPED_MEMOR
> Y_RANGE;
> > +            flush_ranges[i].memory = map->frame->mem[i];
> > +            flush_ranges[i].size   = VK_WHOLE_SIZE;
> > +        }
> > +
> > +        ret = vkFlushMappedMemoryRanges(hwctx->act_dev,
> map->frame->mem_count,
> > +                                        flush_ranges);
> > +        if (ret != VK_SUCCESS) {
> > +            av_log(hwfc, AV_LOG_ERROR, "Failed to flush memory: %s\n",
> > +                   vk_ret2str(ret));
> > +        }
> > +    }
> > +
> > +    for (int i = 0; i < map->frame->mem_count; i++)
> > +        vkUnmapMemory(hwctx->act_dev, map->frame->mem[i]);
> > +
> > +    av_free(map);
> > +}
> > +
> > +static int vulkan_map_frame(AVHWFramesContext *hwfc, AVFrame *dst,
> > +                            const AVFrame *src, int flags)
> > +{
> > +    int err;
> > +    VkResult ret;
> > +    AVVkFrame *f = (AVVkFrame *)src->data[0];
> > +    AVVulkanDeviceContext *hwctx = hwfc->device_ctx->hwctx;
> > +    const int planes = av_pix_fmt_count_planes(hwfc->sw_format);
> > +
> > +    VulkanMapping *map = av_mallocz(sizeof(VulkanMapping));
> > +    if (!map)
> > +        return AVERROR(EINVAL);
> > +
> > +    if (src->format != AV_PIX_FMT_VULKAN) {
> > +        av_log(hwfc, AV_LOG_ERROR, "Cannot map from pixel format %s!\n",
> > +               av_get_pix_fmt_name(src->format));
> > +        err = AVERROR(EINVAL);
> > +        goto fail;
> > +    }
> > +
> > +    if (!(f->flags & VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT) ||
> > +        !(f->tiling == VK_IMAGE_TILING_LINEAR)) {
> > +        av_log(hwfc, AV_LOG_ERROR, "Unable to map frame, not host
> visible "
> > +               "and linear!\n");
>
> Is this a requirement?  Some devices have magic MMU hardware which can
> linear-map tiled memory.
>

The specifications mention nothing of those, so I guess not. If its
modified to specify what happens we can change it.



> > +        err = AVERROR(EINVAL);
> > +        goto fail;
> > +    }
> > +
> > +    dst->width  = src->width;
> > +    dst->height = src->height;
> > +
> > +    for (int i = 0; i < f->mem_count; i++) {
> > +        ret = vkMapMemory(hwctx->act_dev, f->mem[i], 0,
> > +                          VK_WHOLE_SIZE, 0, (void **)&dst->data[i]);
> > +        if (ret != VK_SUCCESS) {
> > +            av_log(hwfc, AV_LOG_ERROR, "Failed to map image memory:
> %s\n",
> > +                vk_ret2str(ret));
> > +            err = AVERROR_EXTERNAL;
> > +            goto fail;
> > +        }
> > +    }
> > +
> > +    /* For non disjoint memory duplicate them */
> > +    if (f->mem_count == 1)
> > +        for (int i = 1; i < planes; i++)
> > +            dst->data[i] = dst->data[0];
> > +
> > +    /* Check if the memory contents matter */
> > +    if (((flags & AV_HWFRAME_MAP_READ) || !(flags &
> AV_HWFRAME_MAP_OVERWRITE)) &&
> > +        !(f->flags & VK_MEMORY_PROPERTY_HOST_COHERENT_BIT)) {
> > +        VkMappedMemoryRange map_mem_ranges[AV_NUM_DATA_POINTERS] = { {
> 0 } };
> > +        for (int i = 0; i < f->mem_count; i++) {
> > +            map_mem_ranges[i].sType = VK_STRUCTURE_TYPE_MAPPED_MEMOR
> Y_RANGE;
> > +            map_mem_ranges[i].size = VK_WHOLE_SIZE;
> > +            map_mem_ranges[i].memory = f->mem[i];
> > +        }
> > +
> > +        ret = vkInvalidateMappedMemoryRanges(hwctx->act_dev,
> f->mem_count,
> > +                                             map_mem_ranges);
> > +        if (ret != VK_SUCCESS) {
> > +            av_log(hwfc, AV_LOG_ERROR, "Failed to invalidate memory:
> %s\n",
> > +                   vk_ret2str(ret));
> > +            err = AVERROR_EXTERNAL;
> > +            goto fail;
> > +        }
> > +    }
> > +
> > +    for (int i = 0; i < planes; i++) {
> > +        VkImageSubresource sub = {
> > +            .aspectMask = planes < 2 ? VK_IMAGE_ASPECT_COLOR_BIT :
> > +                              i == 0 ? VK_IMAGE_ASPECT_PLANE_0_BIT :
> > +                              i == 1 ? VK_IMAGE_ASPECT_PLANE_1_BIT :
> > +                                       VK_IMAGE_ASPECT_PLANE_2_BIT,
> > +        };
> > +        VkSubresourceLayout layout;
> > +        vkGetImageSubresourceLayout(hwctx->act_dev, f->img, &sub,
> &layout);
> > +        dst->data[i]    += layout.offset;
> > +        dst->linesize[i] = layout.rowPitch;
> > +    }
> > +
> > +    map->frame = f;
> > +    map->flags = flags;
> > +
> > +    err = ff_hwframe_map_create(src->hw_frames_ctx, dst, src,
> > +                                &vulkan_unmap_frame, map);
> > +    if (err < 0)
> > +        goto fail;
> > +
> > +    return 0;
> > +
> > +fail:
> > +    for (int i = 0; i < f->mem_count; i++)
> > +        vkUnmapMemory(hwctx->act_dev, f->mem[i]);
>
> Unmap isn't valid on memory which isn't currently mapped; this needs to
> track how many have actually been mapped.
>

Done.



> > +
> > +    av_free(map);
> > +    return err;
> > +}
> > +
> > +#if CONFIG_LIBDRM
> > +static void vulkan_unmap_from(AVHWFramesContext *hwfc, HWMapDescriptor
> *hwmap)
> > +{
> > +    VulkanMapping *map = hwmap->priv;
> > +    AVVulkanDeviceContext *hwctx = hwfc->device_ctx->hwctx;
> > +
> > +    vkDestroyImage(hwctx->act_dev, map->frame->img, hwctx->alloc);
> > +    for (int i = 0; i < map->frame->mem_count; i++)
> > +        vkFreeMemory(hwctx->act_dev, map->frame->mem[i], hwctx->alloc);
> > +
> > +    av_freep(&map->frame);
> > +}
> > +
> > +static int vulkan_map_from_drm_frame_desc(AVHWFramesContext *hwfc,
> AVVkFrame **f,
> > +                                          AVDRMFrameDescriptor *desc)
> > +{
> > +    int err = 0;
> > +
> > +    /* Destination frame */
> > +#if HAVE_VULKAN_DRM_MOD
> > +    uint64_t modifier_buf[AV_NUM_DATA_POINTERS];
> > +    VkImageDrmFormatModifierListCreateInfoEXT drm_mod = {
> > +        .sType = VK_STRUCTURE_TYPE_PHYSICAL_DEV
> ICE_IMAGE_DRM_FORMAT_MODIFIER_INFO_EXT,
> > +    };
> > +#endif
> > +    VkExternalMemoryImageCreateInfo ext_info = {
> > +        .sType       = VK_STRUCTURE_TYPE_EXTERNAL_MEM
> ORY_IMAGE_CREATE_INFO,
> > +#if HAVE_VULKAN_DRM_MOD
> > +        .pNext       = &drm_mod,
> > +#endif
> > +        .handleTypes = VK_EXTERNAL_MEMORY_HANDLE_TYPE_DMA_BUF_BIT_EXT,
> > +    };
> > +    VkImportMemoryFdInfoKHR import_desc[AV_NUM_DATA_POINTERS];
> > +
> > +    if ((desc->nb_objects > 1) &&
> > +        (desc->nb_objects != av_pix_fmt_count_planes(hwfc->format))) {
>
> "hwfc->sw_format"
>

Fixed.



> > +        av_log(hwfc, AV_LOG_ERROR, "Number of DRM objects doesn't match
> "
> > +               "plane count!\n");
> > +        return AVERROR(EINVAL);
> > +    }
> > +
> > +    for (int i = 0; i < desc->nb_objects; i++) {
> > +        import_desc[i].sType      = VK_STRUCTURE_TYPE_IMPORT_MEMOR
> Y_FD_INFO_KHR;
> > +        import_desc[i].pNext      = NULL;
> > +        import_desc[i].handleType = ext_info.handleTypes;
> > +        import_desc[i].fd         = desc->objects[i].fd;
> > +#if HAVE_VULKAN_DRM_MOD
> > +        modifier_buf[i]           = desc->objects[i].format_modifier;
>
> I think you want to give it the modifier structure only if the modifier
> isn't DRM_FORMAT_MOD_INVALID.  Not passing a modifier asks the driver to
> use internal magic if it can (e.g. dri_bo_get_tiling()), passing a modifier
> must alway use what you give it.
>

Done (I also check if DRM_FORMAT_MOD_INVALID exists and if not define it,
like in hwcontext_vaapi.c).



> > +#endif
> > +    }
> > +#if HAVE_VULKAN_DRM_MOD
> > +    drm_mod.pDrmFormatModifiers    = modifier_buf;
> > +    drm_mod.drmFormatModifierCount = desc->nb_objects;
> > +#endif
> > +
> > +    err = create_frame(hwfc, f,
> > +#if HAVE_VULKAN_DRM_MOD
> > +                       VK_IMAGE_TILING_DRM_FORMAT_MODIFIER_EXT,
> > +#else
> > +                       desc->objects[0].format_modifier ==
> DRM_FORMAT_MOD_LINEAR ?
> > +                       VK_IMAGE_TILING_LINEAR : VK_IMAGE_TILING_OPTIMAL,
> > +#endif
> > +                       DEFAULT_USAGE_FLAGS, desc->nb_objects > 1,
> &ext_info,
> > +                       import_desc, sizeof(*import_desc));
> > +    if (err < 0)
> > +        return err;
>
> You do need to look at the layer information in the DRM descriptor here.
>

> E.g. mapping from VAAPI/NV12 works on Intel, but on AMD I get (with the
> above plane-count bug fixed):
>

It should error out that the mapping failed. It's a driver bug.
I'd also really rather not check the layers struct, there's just too many
ways you can represent pixfmts.



> $ gdb --args ./ffmpeg_g -y -hwaccel vaapi -hwaccel_output_format vaapi
> -hwaccel_device /dev/dri/renderD129 -i in.mp4 -an -vf
> 'hwmap=derive_device=vulkan,scale_vulkan=1280:720,hwmap=derive_device=vaapi:reverse=1'
> -c:v h264_vaapi -frames:v 1 out.mp4
> ...
> [Parsed_hwmap_2 @ 0x555558f3ff40] Configure hwmap vulkan -> vaapi_vld.
> [AVHWFramesContext @ 0x55555952be00] Created surface 0x24.
> [AVHWFramesContext @ 0x55555952be00] Direct mapping disabled: deriving
> image does not work: 6 (invalid VASurfaceID).
> [hwmap @ 0x5555584578c0] Filter input: vaapi_vld, 1920x1080 (2000).
>
> Thread 1 "ffmpeg_g" received signal SIGSEGV, Segmentation fault.
> vk_format_get_nr_components (format=<optimized out>) at
> ../../../../src/amd/vulkan/vk_format.h:535
> 535             return desc->nr_channels;
> (gdb) bt
> #0  vk_format_get_nr_components (format=<optimized out>) at
> ../../../../src/amd/vulkan/vk_format.h:535
> #1  radv_image_create (_device=0x5555593e26a0,
> create_info=create_info at entry=0x7fffffffcf00, alloc=<optimized out>,
> pImage=<optimized out>) at ../../../../src/amd/vulkan/radv_image.c:943
> #2  0x00007fffac3ed92e in radv_CreateImage (device=<optimized out>,
> pCreateInfo=<optimized out>, pAllocator=<optimized out>, pImage=<optimized
> out>) at ../../../../src/amd/vulkan/radv_image.c:1261
> #3  0x00007ffff66776c4 in vkCreateImage (device=0x5555593e26a0,
> pCreateInfo=0x7fffffffcf90, pAllocator=0x0, pImage=0x55555946f300) at
> /home/mrt/video/vulkan/loader/loader/trampoline.c:1328
> #4  0x0000555556a00a07 in create_frame (hwfc=0x555559568c80,
> frame=0x7fffffffd1c8, tiling=VK_IMAGE_TILING_LINEAR, usage=15, disjoint=1,
> create_pnext=0x7fffffffd160, alloc_pnext=0x7fffffffd0a0,
> alloc_pnext_stride=24) at src/libavutil/hwcontext_vulkan.c:1130
> #5  0x0000555556a019db in vulkan_map_from_drm_frame_desc
> (hwfc=0x555559568c80, f=0x7fffffffd1c8, desc=0x5555595abd80) at
> src/libavutil/hwcontext_vulkan.c:1480
> #6  0x0000555556a01a38 in vulkan_map_from_drm (hwfc=0x555559568c80,
> dst=0x555559322000, src=0x555559322300, flags=3) at
> src/libavutil/hwcontext_vulkan.c:1502
> #7  0x0000555556a01b90 in vulkan_map_from_vaapi (dst_fc=0x555559568c80,
> dst=0x555559322000, src=0x555559321d40, flags=3) at
> src/libavutil/hwcontext_vulkan.c:1550
> #8  0x0000555556a01c67 in vulkan_map_to (hwfc=0x555559568c80,
> dst=0x555559322000, src=0x555559321d40, flags=3) at
> src/libavutil/hwcontext_vulkan.c:1579
> #9  0x00005555569f3307 in av_hwframe_map (dst=0x555559322000,
> src=0x555559321d40, flags=3) at src/libavutil/hwcontext.c:792
> #10 0x0000555555798c69 in hwmap_filter_frame (link=0x555558f41180,
> input=0x555559321d40) at src/libavfilter/vf_hwmap.c:339
> #11 0x00005555556af6c9 in ff_filter_frame_framed (link=0x555558f41180,
> frame=0x555559321d40) at src/libavfilter/avfilter.c:1071
> #12 0x00005555556aff52 in ff_filter_frame_to_filter (link=0x555558f41180)
> at src/libavfilter/avfilter.c:1219
> #13 0x00005555556b014e in ff_filter_activate_default
> (filter=0x5555582649c0) at src/libavfilter/avfilter.c:1268
> #14 0x00005555556b0372 in ff_filter_activate (filter=0x5555582649c0) at
> src/libavfilter/avfilter.c:1429
> #15 0x00005555556b5036 in ff_filter_graph_run_once (graph=0x555558f3e500)
> at src/libavfilter/avfiltergraph.c:1454
> #16 0x00005555556b6466 in push_frame (graph=0x555558f3e500) at
> src/libavfilter/buffersrc.c:181
> #17 0x00005555556b6778 in av_buffersrc_add_frame_internal
> (ctx=0x555558f3e600, frame=0x555558592c40, flags=4) at
> src/libavfilter/buffersrc.c:255
> #18 0x00005555556b63ed in av_buffersrc_add_frame_flags
> (ctx=0x555558f3e600, frame=0x555558592c40, flags=4) at
> src/libavfilter/buffersrc.c:164
> #19 0x0000555555679212 in ifilter_send_frame (ifilter=0x5555581f1b40,
> frame=0x555558592c40) at src/fftools/ffmpeg.c:2190
> #20 0x00005555556794f2 in send_frame_to_filters (ist=0x555558352240,
> decoded_frame=0x555558592c40) at src/fftools/ffmpeg.c:2264
> #21 0x000055555567a2ac in decode_video (ist=0x555558352240,
> pkt=0x7fffffffd820, got_output=0x7fffffffd814, duration_pts=0x7fffffffd818,
> eof=0, decode_failed=0x7fffffffd810) at src/fftools/ffmpeg.c:2465
> #22 0x000055555567ac47 in process_input_packet (ist=0x555558352240,
> pkt=0x7fffffffd9e0, no_eof=0) at src/fftools/ffmpeg.c:2619
> #23 0x0000555555681bb5 in process_input (file_index=0) at
> src/fftools/ffmpeg.c:4457
> #24 0x00005555556820c4 in transcode_step () at src/fftools/ffmpeg.c:4577
> #25 0x00005555556821f1 in transcode () at src/fftools/ffmpeg.c:4631
> #26 0x0000555555682a81 in main (argc=20, argv=0x7fffffffe3e8) at
> src/fftools/ffmpeg.c:4838
>
> > +
> > +    return 0;
> > +}
> > +
> > ...
> > +
> > +typedef struct ImageBuffer {
> > +    VkBuffer buf;
> > +    VkDeviceMemory mem;
> > +    VkMemoryPropertyFlagBits flags;
> > +} ImageBuffer;
> > +
> > +static int create_buf(AVHWDeviceContext *ctx, ImageBuffer *buf, size_t
> size,
> > +                      VkBufferUsageFlags usage,
> VkMemoryPropertyFlagBits flags,
> > +                      void *create_pnext, void *alloc_pnext)
> > +{
> > +    int err;
> > +    VkResult ret;
> > +    VkMemoryRequirements req;
> > +    AVVulkanDeviceContext *hwctx = ctx->hwctx;
> > +
> > +    VkBufferCreateInfo buf_spawn = {
> > +        .sType       = VK_STRUCTURE_TYPE_BUFFER_CREATE_INFO,
> > +        .pNext       = create_pnext,
> > +        .usage       = usage,
> > +        .sharingMode = VK_SHARING_MODE_EXCLUSIVE,
> > +        .size        = size, /* Gets FFALIGNED during alloc if host
> visible
> > +                                but should be ok */
> > +    };
> > +
> > +    ret = vkCreateBuffer(hwctx->act_dev, &buf_spawn, NULL, &buf->buf);
> > +    if (ret != VK_SUCCESS) {
> > +        av_log(ctx, AV_LOG_ERROR, "Failed to create buffer: %s\n",
> > +               vk_ret2str(ret));
> > +        return AVERROR_EXTERNAL;
> > +    }
> > +
> > +    vkGetBufferMemoryRequirements(hwctx->act_dev, buf->buf, &req);
> > +
> > +    err = alloc_mem(ctx, &req, flags, alloc_pnext, &buf->flags,
> &buf->mem);
> > +    if (err)
> > +        return err;
> > +
> > +    ret = vkBindBufferMemory(hwctx->act_dev, buf->buf, buf->mem, 0);
> > +    if (ret != VK_SUCCESS) {
> > +        av_log(ctx, AV_LOG_ERROR, "Failed to bind memory to buffer:
> %s\n",
> > +               vk_ret2str(ret));
> > +        return AVERROR_EXTERNAL;
> > +    }
>
> This function looks like it is missing the free cases on failure.
>

Fixed.



> > +
> > +    return 0;
> > +}
> > +
> > +static void free_buf(AVHWDeviceContext *ctx, ImageBuffer *buf)
> > +{
> > +    AVVulkanDeviceContext *hwctx = ctx->hwctx;
> > +    if (!buf)
> > +        return;
> > +
> > +    vkDestroyBuffer(hwctx->act_dev, buf->buf, hwctx->alloc);
> > +    vkFreeMemory(hwctx->act_dev, buf->mem, hwctx->alloc);
> > +}
> > +
> > +static int map_buffers(AVHWDeviceContext *ctx, ImageBuffer *buf,
> uint8_t *mem[],
> > +                       int nb_buffers, int invalidate)
> > +{
> > +    VkResult ret;
> > +    AVVulkanDeviceContext *hwctx = ctx->hwctx;
> > +    VkMappedMemoryRange invalidate_ctx[AV_NUM_DATA_POINTERS];
> > +    int invalidate_count = 0;
> > +
> > +    for (int i = 0; i < nb_buffers; i++) {
> > +        ret = vkMapMemory(hwctx->act_dev, buf[i].mem, 0,
> > +                          VK_WHOLE_SIZE, 0, (void **)&mem[i]);
> > +        if (ret != VK_SUCCESS) {
> > +            av_log(ctx, AV_LOG_ERROR, "Failed to map buffer memory:
> %s\n",
> > +                   vk_ret2str(ret));
> > +            return AVERROR_EXTERNAL;
> > +        }
> > +    }
> > +
> > +    if (!invalidate)
> > +        return 0;
> > +
> > +    for (int i = 0; i < nb_buffers; i++) {
> > +        const VkMappedMemoryRange ival_buf = {
> > +            .sType  = VK_STRUCTURE_TYPE_MAPPED_MEMORY_RANGE,
> > +            .memory = buf[i].mem,
> > +            .size   = VK_WHOLE_SIZE,
> > +        };
> > +        if (buf[i].flags & VK_MEMORY_PROPERTY_HOST_COHERENT_BIT)
> > +            continue;
> > +        invalidate_ctx[invalidate_count++] = ival_buf;
> > +    }
> > +
> > +    if (invalidate_count) {
> > +        ret = vkInvalidateMappedMemoryRanges(hwctx->act_dev,
> invalidate_count,
> > +                                             invalidate_ctx);
> > +        if (ret != VK_SUCCESS) {
> > +            av_log(ctx, AV_LOG_ERROR, "Failed to invalidate memory:
> %s\n",
> > +                    vk_ret2str(ret));
> > +            return AVERROR_EXTERNAL;
> > +        }
> > +    }
>
> Missing unmap cases?
>

I'd rather not error out here, I've changed this to a warning and made the
function return 0.



> > +
> > +    return 0;
> > +}
> > +
> > ...
> > +
> > +static int transfer_image_buf(AVHWDeviceContext *ctx, AVVkFrame *frame,
> > +                              ImageBuffer *buffer, const int *stride,
> int w,
> > +                              int h, enum AVPixelFormat pix_fmt, int
> to_buf)
> > +{
> > +    VkResult ret;
> > +    AVVulkanDeviceContext *hwctx = ctx->hwctx;
> > +    VulkanDevicePriv *s = ctx->internal->priv;
> > +
> > +    const int planes = av_pix_fmt_count_planes(pix_fmt);
> > +    const AVPixFmtDescriptor *desc = av_pix_fmt_desc_get(pix_fmt);
> > +
> > +    VkCommandBufferBeginInfo cmd_start = {
> > +        .sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_BEGIN_INFO,
> > +        .flags = VK_COMMAND_BUFFER_USAGE_ONE_TIME_SUBMIT_BIT,
> > +    };
> > +
> > +    VkSubmitInfo s_info = {
> > +        .sType                = VK_STRUCTURE_TYPE_SUBMIT_INFO,
> > +        .commandBufferCount   = 1,
> > +        .pCommandBuffers      = &s->cmd_buf,
> > +    };
> > +
> > +    vkBeginCommandBuffer(s->cmd_buf, &cmd_start);
>
> Return value needs to be checked.
>

Done.



> > +
> > +    { /* Change the image layout to something more optimal for
> transfers */
> > +        VkImageMemoryBarrier bar = {
> > +            .sType = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER,
> > +            .srcAccessMask = 0,
> > +            .dstAccessMask = to_buf ? VK_ACCESS_TRANSFER_READ_BIT :
> > +                                      VK_ACCESS_TRANSFER_WRITE_BIT,
> > +            .oldLayout = frame->layout,
> > +            .newLayout = to_buf ? VK_IMAGE_LAYOUT_TRANSFER_SRC_OPTIMAL
> :
> > +                                  VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL,
> > +            .srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED,
> > +            .dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED,
> > +            .image = frame->img,
> > +            .subresourceRange.levelCount = 1,
> > +            .subresourceRange.layerCount = 1,
> > +        };
> > +
> > +        if (planes == 1) {
> > +            bar.subresourceRange.aspectMask  =
> VK_IMAGE_ASPECT_COLOR_BIT;
> > +        } else {
> > +            bar.subresourceRange.aspectMask  =
> VK_IMAGE_ASPECT_PLANE_0_BIT;
> > +            bar.subresourceRange.aspectMask |=
> VK_IMAGE_ASPECT_PLANE_1_BIT;
> > +            if (planes > 2)
> > +                bar.subresourceRange.aspectMask |=
> VK_IMAGE_ASPECT_PLANE_2_BIT;
> > +        }
> > +
> > +        vkCmdPipelineBarrier(s->cmd_buf, VK_PIPELINE_STAGE_TOP_OF_PIPE_
> BIT,
> > +                             VK_PIPELINE_STAGE_TRANSFER_BIT,
> > +                             0, 0, NULL, 0, NULL, 1, &bar);
> > +
> > +        /* Update to the new layout */
> > +        frame->layout = bar.newLayout;
> > +        frame->access = bar.dstAccessMask;
> > +    }
> > +
> > +    /* Schedule a copy for each plane */
> > +    for (int i = 0; i < planes; i++) {
> > +        VkImageSubresourceLayers sub = {
> > +            .aspectMask = planes < 2 ? VK_IMAGE_ASPECT_COLOR_BIT :
> > +                              i == 0 ? VK_IMAGE_ASPECT_PLANE_0_BIT :
> > +                              i == 1 ? VK_IMAGE_ASPECT_PLANE_1_BIT :
> > +                                       VK_IMAGE_ASPECT_PLANE_2_BIT,
> > +            .layerCount = 1,
> > +        };
> > +        const int p_w = i > 0 ? AV_CEIL_RSHIFT(w, desc->log2_chroma_w)
> : w;
> > +        const int p_h = i > 0 ? AV_CEIL_RSHIFT(h, desc->log2_chroma_h)
> : h;
> > +        VkBufferImageCopy buf_reg = {
> > +            .bufferOffset = 0,
> > +            /* Buffer stride isn't in bytes, it's in samples, the
> implementation
> > +             * uses the image's VkFormat to know how many bytes per
> sample
> > +             * the buffer has. So we have to convert by dividing.
> Stupid. */
> > +            .bufferRowLength = stride[i] / desc->comp[i].step,
>
> comp[i] isn't necessarily plane[i], but I think it happens to work anyway
> for all of the supported formats.
>
> More generally, this tricky code using some properties like the
> log2_chroma values will fail for formats like YUVA420P, so if you are
> intending to add them later (for overlay) it might be sensible to ensure
> that this doesn't make too many assumptions now.
>

I'd rather change this later. I don't think alpha planes will be supported,
for overlay with transparency its better to use RGBA.
I've added a note to say it won't work with planar alpha formats.



> > +            .bufferImageHeight = p_h,
> > +            .imageSubresource = sub,
> > +            .imageOffset = { 0 },
> > +            .imageExtent = { p_w, p_h, 1, },
> > +        };
> > +        if (to_buf)
> > +            vkCmdCopyImageToBuffer(s->cmd_buf, frame->img,
> frame->layout,
> > +                                   buffer[i].buf, 1, &buf_reg);
> > +        else
> > +            vkCmdCopyBufferToImage(s->cmd_buf, buffer[i].buf,
> frame->img,
> > +                                   frame->layout, 1, &buf_reg);
> > +    }
> > +
> > +    vkEndCommandBuffer(s->cmd_buf);
>
> Can also fail.
>

Fixed.



>
> > +
> > +    ret = vkQueueSubmit(s->cmd_queue, 1, &s_info, s->cmd_fence);
> > +    if (ret != VK_SUCCESS) {
> > +        av_log(ctx, AV_LOG_ERROR, "Unable to submit command buffer:
> %s\n",
> > +               vk_ret2str(ret));
> > +        return AVERROR_EXTERNAL;
> > +    } else {
> > +        vkWaitForFences(hwctx->act_dev, 1, &s->cmd_fence, VK_TRUE,
> UINT64_MAX);
> > +        vkResetFences(hwctx->act_dev, 1, &s->cmd_fence);
> > +    }
> > +
> > +    return 0;
> > +}
> > +
> > ...
> > +
> > +const HWContextType ff_hwcontext_type_vulkan = {
> > +    .type                   = AV_HWDEVICE_TYPE_VULKAN,
> > +    .name                   = "Vulkan",
> > +
> > +    .device_hwctx_size      = sizeof(AVVulkanDeviceContext),
> > +    .device_priv_size       = sizeof(VulkanDevicePriv),
> > +    .frames_hwctx_size      = sizeof(AVVulkanFramesContext),
> > +
> > +    .device_init            = &vulkan_device_init,
> > +    .device_create          = &vulkan_device_create,
> > +    .device_derive          = &vulkan_device_derive,
> > +
> > +    .frames_get_constraints = &vulkan_frames_get_constraints,
> > +    .frames_init            = vulkan_frames_init,
> > +    .frames_get_buffer      = vulkan_get_buffer,
> > +
> > +    .transfer_get_formats   = vulkan_transfer_get_formats,
> > +    .transfer_data_to       = vulkan_transfer_data_to,
> > +    .transfer_data_from     = vulkan_transfer_data_from,
> > +
> > +    .map_to                 = vulkan_map_to,
> > +
> > +    .pix_fmts = (const enum AVPixelFormat[]) {
> > +        AV_PIX_FMT_VULKAN,
> > +        AV_PIX_FMT_NONE
> > +    },
> > +};
> > diff --git a/libavutil/hwcontext_vulkan.h b/libavutil/hwcontext_vulkan.h
> > new file mode 100644
> > index 0000000000..342c833a23
> > --- /dev/null
> > +++ b/libavutil/hwcontext_vulkan.h
> > @@ -0,0 +1,133 @@
> > +/*
> > + * Vulkan hwcontext
> > + * Copyright (c) 2018 Rostislav Pehlivanov <atomnuker at gmail.com>
> > + *
> > + * This file is part of FFmpeg.
> > + *
> > + * FFmpeg is free software; you can redistribute it and/or
> > + * modify it under the terms of the GNU Lesser General Public
> > + * License as published by the Free Software Foundation; either
> > + * version 2.1 of the License, or (at your option) any later version.
> > + *
> > + * FFmpeg is distributed in the hope that it will be useful,
> > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > + * Lesser General Public License for more details.
> > + *
> > + * You should have received a copy of the GNU Lesser General Public
> > + * License along with FFmpeg; if not, write to the Free Software
> > + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
> 02110-1301 USA
> > + */
> > +
> > +#ifndef AVUTIL_HWCONTEXT_VULKAN_H
> > +#define AVUTIL_HWCONTEXT_VULKAN_H
> > +
> > +#include <vulkan/vulkan.h>
> > +
> > +/**
> > + * @file
> > + * API-specific header for AV_HWDEVICE_TYPE_VULKAN.
> > + *
> > + * For user-allocated pools, AVHWFramesContext.pool must return
> AVBufferRefs
> > + * with the data pointer set to an AVVkFrame.
> > + */
> > +
> > +/**
> > + * Main Vulkan context, allocated as AVHWDeviceContext.hwctx.
> > + * All of these can be set before init to change what the context uses
> > + */
> > +typedef struct AVVulkanDeviceContext {
> > +    /**
> > +     * Custom memory allocator, else NULL
> > +     */
> > +    const VkAllocationCallbacks *alloc;
> > +    /**
> > +     * Instance
> > +     */
> > +    VkInstance inst;
> > +    /**
> > +     * Physical device
> > +     */
> > +    VkPhysicalDevice phys_dev;
> > +    /**
> > +     * Activated physical device
> > +     */
> > +    VkDevice act_dev;
> > +    /**
> > +     * Queue family index for graphics
> > +     */
> > +    int queue_family_index;
> > +    /**
> > +     * Queue family index for transfer ops only. By default, the
> priority order
> > +     * is dedicated transfer > dedicated compute > graphics.
> > +     */
> > +    int queue_family_tx_index;
> > +    /**
> > +     * Queue family index for compute ops. Will be equal to the graphics
> > +     * one unless a dedicated transfer queue is found.
> > +     */
> > +    int queue_family_comp_index;
> > +} AVVulkanDeviceContext;
> > +
> > +/**
> > + * Allocated as AVHWFramesContext.hwctx, used to set pool-specific
> options
> > + */
> > +typedef struct AVVulkanFramesContext {
> > +    /**
> > +     * Controls the tiling of output frames.
> > +     */
> > +    VkImageTiling tiling;
> > +    /**
> > +     * Defines extra usage of output frames. This is bitwise OR'd with
> the
> > +     * standard usage flags (SAMPLED, STORAGE, TRANSFER_SRC and
> TRANSFER_DST).
> > +     */
> > +    VkImageUsageFlagBits usage;
> > +    /**
> > +     * Set to 1 to allocate all planes separately (disjoint images)
> > +     */
> > +    int disjoint;
> > +    /**
> > +     * Extension data for image creation. By default, if the extension
> is
> > +     * available, this will be chained to a
> VkImageFormatListCreateInfoKHR.
> > +     */
> > +    void *create_pnext;
> > +    /**
> > +     * Extension data for memory allocation. If the image is disjoint,
> this
> > +     * must be one per plane, otherwise just the first entry is used.
> > +     * This will be chained to VkExportMemoryAllocateInfo, which is used
> > +     * to make all pool images exportable to other APIs.
> > +     */
> > +    void *alloc_pnext[AV_NUM_DATA_POINTERS];
> > +} AVVulkanFramesContext;
> > +
> > +/*
> > + * Frame structure, the VkFormat of the image will always match
> > + * the pool's sw_format.
> > + */
> > +typedef struct AVVkFrame {
> > +    VkImage img;
> > +    VkImageTiling tiling;
> > +    /**
> > +     * Always 1 for non-disjoint images, #planes for disjoint
> > +     */
> > +    int mem_count;
> > +    VkDeviceMemory mem[AV_NUM_DATA_POINTERS];
> > +    /**
> > +     * OR'd flags for all memory allocated
> > +     */
> > +    VkMemoryPropertyFlagBits flags;
> > +
> > +    /**
> > +     * Updated after every barrier
> > +     */
> > +    VkAccessFlagBits access;
> > +    VkImageLayout layout;
> > +} AVVkFrame;
>
> This all looks much cleaner than the previous version.
>
> > +/**
> > + * Converts AVPixelFormat to VkFormat, returns VK_FORMAT_UNDEFINED if
> unsupported
> > + * by the hwcontext
> > + */
> > +VkFormat av_vkfmt_from_pixfmt(enum AVPixelFormat p);
> > +
> > +#endif /* AVUTIL_HWCONTEXT_VULKAN_H */
> > diff --git a/libavutil/pixdesc.c b/libavutil/pixdesc.c
> > index ff5c20d50e..c3b3aaee65 100644
> > --- a/libavutil/pixdesc.c
> > +++ b/libavutil/pixdesc.c
> > @@ -1673,6 +1673,10 @@ static const AVPixFmtDescriptor
> av_pix_fmt_descriptors[AV_PIX_FMT_NB] = {
> >          .name = "videotoolbox_vld",
> >          .flags = AV_PIX_FMT_FLAG_HWACCEL,
> >      },
> > +    [AV_PIX_FMT_VULKAN] = {
> > +        .name = "vulkan",
> > +        .flags = AV_PIX_FMT_FLAG_HWACCEL,
> > +    },
>
> You've put this in a funny place in the middle?
>

I thought it was in alphabet order, fixed, its now just after OPENCL.


> >      [AV_PIX_FMT_GBRP] = {
> >          .name = "gbrp",
> >          .nb_components = 3,
> > diff --git a/libavutil/pixfmt.h b/libavutil/pixfmt.h
> > index aea008bbdc..e6991f3630 100644
> > --- a/libavutil/pixfmt.h
> > +++ b/libavutil/pixfmt.h
> > @@ -333,6 +333,10 @@ enum AVPixelFormat {
> >      AV_PIX_FMT_GRAY14BE,   ///<        Y        , 14bpp, big-endian
> >      AV_PIX_FMT_GRAY14LE,   ///<        Y        , 14bpp, little-endian
> >
> > +    /* Vulkan hardware images,
> > +     * data[0] contain an AVVkFrame */
> > +    AV_PIX_FMT_VULKAN,
> > +
> >      AV_PIX_FMT_NB         ///< number of pixel formats, DO NOT USE THIS
> if you want to link with shared libav* because the number of formats might
> differ between versions
> >  };
> >
> > diff --git a/libavutil/version.h b/libavutil/version.h
> > index 44bdebdc93..84409b1d69 100644
> > --- a/libavutil/version.h
> > +++ b/libavutil/version.h
> > @@ -79,8 +79,8 @@
> >   */
> >
> >  #define LIBAVUTIL_VERSION_MAJOR  56
> > -#define LIBAVUTIL_VERSION_MINOR  18
> > -#define LIBAVUTIL_VERSION_MICRO 102
> > +#define LIBAVUTIL_VERSION_MINOR  19
> > +#define LIBAVUTIL_VERSION_MICRO 100
> >
> >  #define LIBAVUTIL_VERSION_INT   AV_VERSION_INT(LIBAVUTIL_VERSION_MAJOR,
> \
> >                                                 LIBAVUTIL_VERSION_MINOR,
> \
> >
>
> I think I would make the pixfmt addition a patch on its own just to keep
> it separate, but that probably doesn't matter very much.
>

I'll keep it then.



> - Mark
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel at ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>


I've attached a new patch, thanks for the review.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0002-lavu-add-a-Vulkan-hwcontext.patch
Type: text/x-patch
Size: 93227 bytes
Desc: not available
URL: <http://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20180531/b821932c/attachment.bin>


More information about the ffmpeg-devel mailing list