[FFmpeg-devel] [PATCH v2] cpu: add a function for querying maximum required data alignment

Michael Niedermayer michael at niedermayer.cc
Mon Sep 4 22:18:35 EEST 2017


On Sat, Sep 02, 2017 at 09:47:38PM -0300, James Almer wrote:
> From: Anton Khirnov <anton at khirnov.net>
> 
> (cherry picked from commit e6bff23f1e11aefb16a2b5d6ee72bf7469c5a66e)
> Signed-off-by: James Almer <jamrial at gmail.com>
> ---
> This is (afaics) the last API introduced to libav before the major bump.
> 
> Now checking all the x86 flags that would require alignment of 16 bytes
> or more.
> 
>  doc/APIchanges      |  3 +++
>  libavutil/cpu.c     | 35 +++++++++++++++++++++++++++++++++++
>  libavutil/cpu.h     | 13 +++++++++++++
>  libavutil/version.h |  2 +-
>  4 files changed, 52 insertions(+), 1 deletion(-)
> 
> diff --git a/doc/APIchanges b/doc/APIchanges
> index 4effbf9364..6a57c210a9 100644
> --- a/doc/APIchanges
> +++ b/doc/APIchanges
> @@ -15,6 +15,9 @@ libavutil:     2015-08-28
>  
>  API changes, most recent first:
>  
> +2017-09-xx - xxxxxxx - lavu 55.75.100 / lavu 55.31.0 - cpu.h
> +  Add av_cpu_max_align() for querying maximum required data alignment.
> +
>  2017-09-01 - xxxxxxx - lavf 57.81.100 - avio.h
>    Add avio_read_partial().
>  
> diff --git a/libavutil/cpu.c b/libavutil/cpu.c
> index a22da0fa8c..4f04da2460 100644
> --- a/libavutil/cpu.c
> +++ b/libavutil/cpu.c
> @@ -16,9 +16,11 @@
>   * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
>   */
>  
> +#include <stddef.h>
>  #include <stdint.h>
>  #include <stdatomic.h>
>  
> +#include "attributes.h"
>  #include "cpu.h"
>  #include "cpu_internal.h"
>  #include "config.h"
> @@ -299,3 +301,36 @@ int av_cpu_count(void)
>  
>      return nb_cpus;
>  }
> +
> +size_t av_cpu_max_align(void)
> +{
> +    int av_unused flags = av_get_cpu_flags();
> +
> +#if ARCH_ARM || ARCH_AARCH64
> +    if (flags & AV_CPU_FLAG_NEON)
> +        return 16;
> +#elif ARCH_PPC
> +    if (flags & AV_CPU_FLAG_ALTIVEC)
> +        return 16;
> +#elif ARCH_X86
> +    if (flags & (AV_CPU_FLAG_AVX2     |
> +                 AV_CPU_FLAG_AVX      |
> +                 AV_CPU_FLAG_FMA4     |
> +                 AV_CPU_FLAG_FMA3))
> +        return 32;
> +    if (flags & (AV_CPU_FLAG_XOP      |
> +                 AV_CPU_FLAG_AESNI    |
> +                 AV_CPU_FLAG_SSE42    |
> +                 AV_CPU_FLAG_SSE4     |
> +                 AV_CPU_FLAG_SSSE3    |
> +                 AV_CPU_FLAG_SSE3     |
> +                 AV_CPU_FLAG_SSE2     |
> +                 AV_CPU_FLAG_SSE      |
> +                 AV_CPU_FLAG_AVXSLOW  |
> +                 AV_CPU_FLAG_SSE3SLOW |
> +                 AV_CPU_FLAG_SSE2SLOW))
> +        return 16;
> +#endif
> +
> +    return 8;
> +}
> diff --git a/libavutil/cpu.h b/libavutil/cpu.h
> index de05593446..9e5d40affe 100644
> --- a/libavutil/cpu.h
> +++ b/libavutil/cpu.h
> @@ -21,6 +21,8 @@
>  #ifndef AVUTIL_CPU_H
>  #define AVUTIL_CPU_H
>  
> +#include <stddef.h>
> +
>  #include "attributes.h"
>  
>  #define AV_CPU_FLAG_FORCE    0x80000000 /* force usage of selected flags (OR) */


> @@ -113,4 +115,15 @@ int av_parse_cpu_caps(unsigned *flags, const char *s);
>   */
>  int av_cpu_count(void);
>  
> +/**
> + * Get the maximum data alignment that may be required by FFmpeg.
> + *
> + * Note that this is affected by the build configuration and the CPU flags mask,
> + * so e.g. if the CPU supports AVX, but libavutil has been built with
> + * --disable-avx or the AV_CPU_FLAG_AVX flag has been disabled through
> + *  av_set_cpu_flags_mask(), then this function will behave as if AVX is not
> + *  present.
> + */
> +size_t av_cpu_max_align(void);

This might interact badly with runtime changes to the CPU flags or mask.

If it's used to choose the alignment for allocated frames, and the CPU
flags are changed after some frames have been allocated, there could
still be frames in queues that do not have sufficient alignment for the
new flags.
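
To make the concern concrete, here is a minimal standalone sketch. It does
not use libavutil: demo_cpu_flags, demo_max_align(), DemoFrame and the
DEMO_FLAG_* values are hypothetical stand-ins that only mimic the shape of
the proposed function, where the returned alignment depends on the flags in
effect at call time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; these are NOT the real AV_CPU_FLAG_* values. */
#define DEMO_FLAG_SSE2 0x1
#define DEMO_FLAG_AVX  0x2

/* Hypothetical runtime flag state, standing in for the flags/mask. */
static int demo_cpu_flags = DEMO_FLAG_SSE2;

/* Same shape as the proposed function: the result follows the current flags. */
static size_t demo_max_align(void)
{
    if (demo_cpu_flags & DEMO_FLAG_AVX)
        return 32;
    if (demo_cpu_flags & DEMO_FLAG_SSE2)
        return 16;
    return 8;
}

/* A buffer that remembers the alignment it was allocated with. */
typedef struct DemoFrame {
    void  *data;
    size_t align;
} DemoFrame;

/* Allocate using whatever alignment the flags demand right now. */
static int demo_frame_alloc(DemoFrame *f, size_t size)
{
    size_t align, padded;

    assert(f != NULL);
    align  = demo_max_align();
    /* aligned_alloc() requires the size to be a multiple of the alignment. */
    padded = (size + align - 1) / align * align;

    f->data  = aligned_alloc(align, padded);
    f->align = f->data ? align : 0;
    return f->data ? 0 : -1;
}

/* 1 if the frame was allocated with at least the alignment required now. */
static int demo_frame_is_sufficient(const DemoFrame *f)
{
    return f->align >= demo_max_align();
}

static void demo_frame_free(DemoFrame *f)
{
    free(f->data);
    f->data  = NULL;
    f->align = 0;
}
```

Allocating a frame while only DEMO_FLAG_SSE2 is set records an alignment of
16. If DEMO_FLAG_AVX is enabled afterwards, demo_max_align() returns 32 and
demo_frame_is_sufficient() reports 0 for the frame that is already queued.
The sketch compares the recorded alignment rather than the pointer value,
since an allocator may happen to return a more strictly aligned address
than was requested.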


[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Its not that you shouldnt use gotos but rather that you should write
readable code and code with gotos often but not always is less readable