[Libav-user] Reduce latency in av_hwframe_transfer_data

Michael Goffioul michael.goffioul at gmail.com
Sat Oct 8 15:31:15 EEST 2022


I've done some time measurements to try to identify possible bottlenecks.

The av_hwframe_transfer_data implementation basically consists of 2
steps: vaapi_transfer_data_from, then av_frame_copy. On average, about 2/3
to 3/4 of the time is spent in vaapi_transfer_data_from.

Then vaapi_transfer_data_from consists of about 4 steps:
vaSyncSurface, vaCreateImage, vaGetImage and vaMapBuffer. The average times
spent on each step are as follows, when decoding a UHD stream (
https://storage.googleapis.com/wvmedia/clear/h264/tears/tears_uhd.mpd) on
my Baytrail platform:

vaSyncSurface: 7ms
vaCreateImage: 0.05ms
vaGetImage: 4ms
vaMapBuffer: 14ms

These are average numbers and some steps have large deviations. But one
that is consistently long is vaMapBuffer.
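
For anyone who wants to reproduce this kind of breakdown, here is a minimal
sketch of a per-step timer. It is not the code I used; StepTimer, now_ms and
the step_timer_* functions are hypothetical names, and it assumes a POSIX
system where clock_gettime(CLOCK_MONOTONIC) is available.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical helper: running statistics for one timed step. */
typedef struct StepTimer {
    const char   *name;      /* label printed in the report */
    double        total_ms;  /* sum of all samples */
    double        max_ms;    /* largest single sample */
    unsigned long count;     /* number of samples */
} StepTimer;

/* Monotonic wall-clock time in milliseconds. */
static double now_ms(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec * 1000.0 + ts.tv_nsec / 1e6;
}

/* Record one measurement for a step. */
static void step_timer_add(StepTimer *t, double elapsed_ms)
{
    assert(elapsed_ms >= 0.0);
    t->total_ms += elapsed_ms;
    if (elapsed_ms > t->max_ms)
        t->max_ms = elapsed_ms;
    t->count++;
}

/* Mean of all samples, or 0 if nothing was recorded. */
static double step_timer_mean(const StepTimer *t)
{
    return t->count ? t->total_ms / t->count : 0.0;
}

/* Print mean and worst case, since some steps deviate a lot. */
static void step_timer_report(const StepTimer *t)
{
    printf("%s: mean %.2f ms, max %.2f ms over %lu calls\n",
           t->name, step_timer_mean(t), t->max_ms, t->count);
}
```

The idea is to take now_ms() before and after each call (vaSyncSurface,
vaCreateImage, vaGetImage, vaMapBuffer), feed the difference to
step_timer_add, and print the report every few hundred frames. Tracking the
maximum next to the mean helps tell a step that is consistently slow from
one that only spikes occasionally.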

I've then done some timing inside intel-vaapi-driver for the buffer mapping
step. On my platform, this mapping basically consists of the i965 driver
calling dri_bo_map, which translates into drm_intel_gem_bo_map (in
libdrm_intel). And within that function, most of the time (about 99%) is
spent in drmIoctl(DRM_IOCTL_I915_GEM_SET_DOMAIN).

The vaSyncSurface call also introduces quite a significant latency. After
some Googling, I found this reference [1], which states that on Intel
platforms vaSyncSurface is unnecessary if it's followed by a vaMapBuffer.
However this is in the context of encoding, so it might not really apply to
my use case. Nevertheless, I've tried to remove the vaSyncSurface step from
FFMPEG, and it did decrease the overall latency, without any visible effect
on screen. It doesn't feel like the right thing to do, though.

What I've also tried is to map the hwframe directly with
AV_PIX_FMT_DRM_PRIME, which is fast. But the content is Y-tiled, so that's
not really helping if I need to do the detiling in software.
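
To give an idea of what software detiling would involve, here is a rough
sketch. It rests on several assumptions that I have not verified for the
buffers exported on this platform: the classic Intel Y-major layout (4 KiB
tiles of 128 bytes x 32 rows, each tile made of eight 16-byte-wide columns
stored one after the other), no bit-6 address swizzling, and a pitch and
row count that are multiples of the tile dimensions. The names ytile_offset
and ytile_to_linear are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed Y-major tile geometry (not confirmed for this driver). */
enum {
    TILE_W    = 128,             /* tile width in bytes */
    TILE_H    = 32,              /* tile height in rows */
    OWORD     = 16,              /* width of one column inside a tile */
    TILE_SIZE = TILE_W * TILE_H  /* 4096 bytes */
};

/* Byte offset in the tiled buffer of linear position (x, y).
 * x is in bytes, y in rows, pitch is the row pitch in bytes. */
static size_t ytile_offset(size_t x, size_t y, size_t pitch)
{
    size_t tiles_per_row = pitch / TILE_W;
    size_t tile_index    = (y / TILE_H) * tiles_per_row + x / TILE_W;
    size_t xt = x % TILE_W;  /* position inside the tile */
    size_t yt = y % TILE_H;

    return tile_index * TILE_SIZE
         + (xt / OWORD) * (OWORD * TILE_H)  /* which 16-byte column */
         + yt * OWORD                       /* row within that column */
         + xt % OWORD;                      /* byte within the row */
}

/* Copy one tiled plane into a linear plane with the same pitch. */
static void ytile_to_linear(uint8_t *dst, const uint8_t *src,
                            size_t pitch, size_t rows)
{
    assert(pitch % TILE_W == 0 && rows % TILE_H == 0);

    for (size_t y = 0; y < rows; y++)
        for (size_t x = 0; x < pitch; x += OWORD)
            /* 16 contiguous bytes in the tile are contiguous in the row. */
            memcpy(dst + y * pitch + x,
                   src + ytile_offset(x, y, pitch), OWORD);
}
```

Even in this simplified form, every 16-byte chunk is a separate scattered
read from the mapped buffer, so for a UHD frame the cost on the CPU would
likely eat up whatever the fast mapping saved.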

I'm not entirely sure where to go from here, so I would appreciate some
feedback or hints from the devs. I'm also not sure whether this is the
right place for this kind of question; let me know if there's a better place
for that type of support.

Thanks,
Michael.

[1]
https://chromium.googlesource.com/chromium/src/+/main/media/gpu/vaapi/vaapi_wrapper.cc#2828

On Thu, Oct 6, 2022 at 10:20 PM Michael Goffioul <michael.goffioul at gmail.com>
wrote:

> Hi,
>
> I'm using FFMPEG (5.1.2) on an Android (x86) platform running on an Intel
> Baytrail chipset. Video decoding uses VA-API for mpeg2 and h264. The
> implementation follows the same pattern as the hw_decode.c example.
>
> Everything works fine, but when timing the various parts of the decoding
> loop (send packet, receive frame, hw-transfer, scale/pixfmt conversion), it
> appears that most of the time is spent in av_hwframe_transfer_data. E.g.
> these are typical numbers for various resolutions:
> - 1280x720: ~10ms
> - 1920x1080: ~20ms
> - 3840x1714: ~40ms
>
> My question is then whether there's anything I can do or look into in order to
> reduce that latency? Is it at all possible, or is it just the way it is and
> I can't squeeze more performance out of that platform?
>
> Not sure whether it's helpful, but here's some VA-API related log output
> from FFMPEG:
> 10-06 22:18:00.162  3805  3853 I FFMPEG  : [AVHWDeviceContext @
> 0x760040312900] Opened VA display via Android device android.
> 10-06 22:18:00.162  3805  3853 I FFMPEG  : [AVHWDeviceContext @
> 0x760040312900] libva: VA-API version 1.4.0
> 10-06 22:18:00.162  3805  3853 I FFMPEG  : [AVHWDeviceContext @
> 0x760040312900] libva: va_getDriverName() returns 0
> 10-06 22:18:00.162  3805  3853 I FFMPEG  : [AVHWDeviceContext @
> 0x760040312900] libva: Trying to open /vendor/lib64/dri/i965_drv_video.so
> 10-06 22:18:00.163  3805  3853 I FFMPEG  : [AVHWDeviceContext @
> 0x760040312900] libva: Found init function __vaDriverInit_1_4
> 10-06 22:18:00.165  3805  3853 I FFMPEG  : [AVHWDeviceContext @
> 0x760040312900] libva: va_openDriver() returns 0
> 10-06 22:18:00.165  3805  3853 I FFMPEG  : [AVHWDeviceContext @
> 0x760040312900] Initialised VAAPI connection: version 1.4
> 10-06 22:18:00.165  3805  3853 I FFMPEG  : [AVHWDeviceContext @
> 0x760040312900] Format 0x32315659 -> yuv420p.
> 10-06 22:18:00.165  3805  3853 I FFMPEG  : [AVHWDeviceContext @
> 0x760040312900] Format 0x30323449 -> yuv420p.
> 10-06 22:18:00.165  3805  3853 I FFMPEG  : [AVHWDeviceContext @
> 0x760040312900] Format 0x3231564e -> nv12.
> 10-06 22:18:00.165  3805  3853 I FFMPEG  : [AVHWDeviceContext @
> 0x760040312900] Format 0x32595559 -> yuyv422.
> 10-06 22:18:00.165  3805  3853 I FFMPEG  : [AVHWDeviceContext @
> 0x760040312900] Format 0x59565955 -> uyvy422.
> 10-06 22:18:00.165  3805  3853 I FFMPEG  : [AVHWDeviceContext @
> 0x760040312900] Format 0x48323234 -> yuv422p.
> 10-06 22:18:00.165  3805  3853 I FFMPEG  : [AVHWDeviceContext @
> 0x760040312900] Format 0x58424752 -> rgb0.
> 10-06 22:18:00.165  3805  3853 I FFMPEG  : [AVHWDeviceContext @
> 0x760040312900] Format 0x58524742 -> bgr0.
> 10-06 22:18:00.165  3805  3853 I FFMPEG  : [AVHWDeviceContext @
> 0x760040312900] Format 0x30313050 -> p010le.
> 10-06 22:18:00.165  3805  3853 I FFMPEG  : [AVHWDeviceContext @
> 0x760040312900] VAAPI driver: Intel i965 driver for Intel(R) Bay Trail -
> 2.4.0.pre1 (a9d2c1f).
> 10-06 22:18:00.165  3805  3853 I FFMPEG  : [AVHWDeviceContext @
> 0x760040312900] Driver not found in known nonstandard list, using standard
> behaviour.
> 10-06 22:18:00.165  3805  3853 I HWACCEL : hw codec h264 enabled:
> s=0x7600e0316700 pix_fmts=44
> 10-06 22:18:00.165  3805  3853 I FFMPEG  : [h264 @ 0x7600e0316700] Format
> vaapi chosen by get_format().
> 10-06 22:18:00.165  3805  3853 I FFMPEG  : [h264 @ 0x7600e0316700] Format
> vaapi requires hwaccel initialisation.
> 10-06 22:18:00.165  3805  3853 I FFMPEG  : [h264 @ 0x7600e0316700]
> Considering format 0x3231564e -> nv12.
> 10-06 22:18:00.166  3805  3853 I FFMPEG  : [h264 @ 0x7600e0316700] Picked
> nv12 (0x3231564e) as best match for yuv420p.
> 10-06 22:18:00.166  3805  3853 I FFMPEG  : [AVHWFramesContext @
> 0x760060312bc0] Created surface 0x4000000.
> 10-06 22:18:00.166  3805  3853 I FFMPEG  : [AVHWFramesContext @
> 0x760060312bc0] Direct mapping disabled: derived image format 3231564e does
> not match expected format 32315659.
> 10-06 22:18:00.166  3805  3853 I FFMPEG  : [AVHWFramesContext @
> 0x760060312bc0] Created surface 0x4000001.
> 10-06 22:18:00.166  3805  3853 I FFMPEG  : [AVHWFramesContext @
> 0x760060312bc0] Created surface 0x4000002.
> 10-06 22:18:00.166  3805  3853 I FFMPEG  : [AVHWFramesContext @
> 0x760060312bc0] Created surface 0x4000003.
> 10-06 22:18:00.166  3805  3853 I FFMPEG  : [AVHWFramesContext @
> 0x760060312bc0] Created surface 0x4000004.
> 10-06 22:18:00.166  3805  3853 I FFMPEG  : [AVHWFramesContext @
> 0x760060312bc0] Created surface 0x4000005.
> 10-06 22:18:00.166  3805  3853 I FFMPEG  : [AVHWFramesContext @
> 0x760060312bc0] Created surface 0x4000006.
> 10-06 22:18:00.166  3805  3853 I FFMPEG  : [AVHWFramesContext @
> 0x760060312bc0] Created surface 0x4000007.
>
> Thanks,
> Michael.
>
>
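
A note on reading the log above: the hex values in the "Format 0x... -> ..."
lines are VA-API fourcc codes, i.e. four ASCII characters packed with the
first character in the lowest byte. The small sketch below decodes them;
fourcc_to_str is a made-up helper name, not an FFmpeg or libva function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Unpack a fourcc into a NUL-terminated 4-character string.
 * out must have room for at least 5 bytes. */
static void fourcc_to_str(uint32_t fourcc, char *out)
{
    assert(out != NULL);
    for (int i = 0; i < 4; i++)
        out[i] = (char)((fourcc >> (8 * i)) & 0xff);  /* lowest byte first */
    out[4] = '\0';
}
```

With this, 0x3231564e decodes to "NV12" and 0x32315659 to "YV12", so the
"Direct mapping disabled" line says that the derived image is NV12 while
YV12 was expected.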