[FFmpeg-devel] [PATCH] libavfilter OpenCL unsharpen optimization

Michael Niedermayer michaelni at gmx.at
Tue Feb 10 02:39:52 CET 2015


On Mon, Feb 09, 2015 at 03:41:24PM -0800, Alexey Titov wrote:
> ---
>  libavfilter/unsharp.h               |   4 ++
>  libavfilter/unsharp_opencl.c        |  81 +++++++++++++++++-------
>  libavfilter/unsharp_opencl_kernel.h | 122 ++++++++++++++++++++++++++----------
>  3 files changed, 150 insertions(+), 57 deletions(-)

How much faster is it?
Please add benchmark scores to the commit message.


> 
> diff --git a/libavfilter/unsharp.h b/libavfilter/unsharp.h
> index c2aed64..fc651c0 100644
> --- a/libavfilter/unsharp.h
> +++ b/libavfilter/unsharp.h
> @@ -41,6 +41,10 @@ typedef struct {
>      cl_kernel kernel_chroma;
>      cl_mem cl_luma_mask;
>      cl_mem cl_chroma_mask;
> +    cl_mem cl_luma_mask_x;
> +    cl_mem cl_chroma_mask_x;
> +    cl_mem cl_luma_mask_y;
> +    cl_mem cl_chroma_mask_y;
>      int in_plane_size[8];
>      int out_plane_size[8];
>      int plane_num;
> diff --git a/libavfilter/unsharp_opencl.c b/libavfilter/unsharp_opencl.c
> index 5c6b5ef..4adad63 100644
> --- a/libavfilter/unsharp_opencl.c
> +++ b/libavfilter/unsharp_opencl.c
> @@ -87,52 +87,52 @@ end:
>      return ret;
>  }
>  
> -static int compute_mask_matrix(cl_mem cl_mask_matrix, int step_x, int step_y)
> +static int copy_separable_masks(cl_mem cl_mask_x, cl_mem cl_mask_y, int step_x, int step_y)
>  {
> -    int i, j, ret = 0;
> -    uint32_t *mask_matrix, *mask_x, *mask_y;
> -    size_t size_matrix = sizeof(uint32_t) * (2 * step_x + 1) * (2 * step_y + 1);
> +    int ret = 0;
> +    uint32_t *mask_x, *mask_y;
> +    size_t size_mask_x = sizeof(uint32_t) * (2 * step_x + 1);
> +    size_t size_mask_y = sizeof(uint32_t) * (2 * step_y + 1);

> -    mask_x = av_mallocz_array(2 * step_x + 1, sizeof(uint32_t));
> +    mask_x = av_mallocz_array((2 * step_x + 1), sizeof(uint32_t));

unneeded change


>      if (!mask_x) {
>          ret = AVERROR(ENOMEM);
>          goto end;
>      }

> -    mask_y = av_mallocz_array(2 * step_y + 1, sizeof(uint32_t));
> +    mask_y = av_mallocz_array((2 * step_y + 1), sizeof(uint32_t));

unneeded change
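
For context on what the switch from compute_mask_matrix to
copy_separable_masks relies on, here is a minimal sketch. It assumes the 2D
mask is the outer product of mask_x and mask_y; the quoted hunk does not
show that. Under that assumption, a horizontal pass followed by a vertical
pass gives the same result as one pass with the full matrix. The names
conv_2d, conv_separable, separable_matches_2d and clampi, the edge clamping,
the mask values, and the test image are all hypothetical and not taken from
the patch.

```c
#include <assert.h>
#include <stdint.h>

#define W 8   /* hypothetical test image width  */
#define H 6   /* hypothetical test image height */

/* Clamp a coordinate to [0, n-1]; edge replication is an assumption. */
static int clampi(int v, int n)
{
    return v < 0 ? 0 : (v >= n ? n - 1 : v);
}

/* One pass with the full matrix, mask[j][i] = mask_y[j] * mask_x[i]. */
static void conv_2d(const uint8_t *src, uint32_t *dst,
                    const uint32_t *mask_x, int step_x,
                    const uint32_t *mask_y, int step_y)
{
    for (int y = 0; y < H; y++)
        for (int x = 0; x < W; x++) {
            uint32_t sum = 0;
            for (int j = -step_y; j <= step_y; j++)
                for (int i = -step_x; i <= step_x; i++)
                    sum += mask_y[j + step_y] * mask_x[i + step_x] *
                           src[clampi(y + j, H) * W + clampi(x + i, W)];
            dst[y * W + x] = sum;
        }
}

/* Two passes: horizontal into tmp, then vertical over tmp. */
static void conv_separable(const uint8_t *src, uint32_t *dst,
                           const uint32_t *mask_x, int step_x,
                           const uint32_t *mask_y, int step_y)
{
    uint32_t tmp[W * H];

    assert(step_x >= 0 && step_y >= 0);

    for (int y = 0; y < H; y++)
        for (int x = 0; x < W; x++) {
            uint32_t sum = 0;
            for (int i = -step_x; i <= step_x; i++)
                sum += mask_x[i + step_x] * src[y * W + clampi(x + i, W)];
            tmp[y * W + x] = sum;
        }

    for (int y = 0; y < H; y++)
        for (int x = 0; x < W; x++) {
            uint32_t sum = 0;
            for (int j = -step_y; j <= step_y; j++)
                sum += mask_y[j + step_y] * tmp[clampi(y + j, H) * W + x];
            dst[y * W + x] = sum;
        }
}

/* Returns 1 if both approaches agree on every pixel, 0 otherwise. */
int separable_matches_2d(void)
{
    static const uint32_t mask_x[] = { 1, 2, 1 };       /* step_x = 1 */
    static const uint32_t mask_y[] = { 1, 4, 6, 4, 1 }; /* step_y = 2 */
    uint8_t  src[W * H];
    uint32_t full[W * H], sep[W * H];

    /* Deterministic pseudo-image, values in 0..255. */
    for (int k = 0; k < W * H; k++)
        src[k] = (uint8_t)((k * 37 + 11) % 256);

    conv_2d(src, full, mask_x, 1, mask_y, 2);
    conv_separable(src, sep, mask_x, 1, mask_y, 2);

    for (int k = 0; k < W * H; k++)
        if (full[k] != sep[k])
            return 0;
    return 1;
}
```

Calling separable_matches_2d() from a small test program checks that the two
outputs agree. Per pixel, the two-pass form needs (2 * step_x + 1) +
(2 * step_y + 1) multiplies instead of (2 * step_x + 1) * (2 * step_y + 1).
Whether that becomes a real speedup on a given OpenCL device is what the
requested benchmark numbers would show.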


[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Answering whether a program halts or runs forever is,
on a Turing machine, in general impossible (Turing's halting problem).
On any real computer, always possible as a real computer has a finite number
of states N, and will either halt in less than N cycles or never halt.