[FFmpeg-devel] [PATCH 1/3] avutil: add staticpool

Muhammad Faiz mfcc64 at gmail.com
Sun Jan 21 05:24:21 EET 2018


On Sat, Jan 20, 2018 at 5:22 PM, wm4 <nfxjfg at googlemail.com> wrote:
> On Sat, 20 Jan 2018 11:29:13 +0700
> Muhammad Faiz <mfcc64 at gmail.com> wrote:
>
>> Help avoiding malloc-free cycles when allocating-freeing common
>> structures.
>>
>> Signed-off-by: Muhammad Faiz <mfcc64 at gmail.com>
>> ---
>>  libavutil/staticpool.h | 117 +++++++++++++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 117 insertions(+)
>>  create mode 100644 libavutil/staticpool.h
>>
>> diff --git a/libavutil/staticpool.h b/libavutil/staticpool.h
>> new file mode 100644
>> index 0000000000..9c9b2784bc
>> --- /dev/null
>> +++ b/libavutil/staticpool.h
>> @@ -0,0 +1,117 @@
>> +/*
>> + * This file is part of FFmpeg.
>> + *
>> + * FFmpeg is free software; you can redistribute it and/or
>> + * modify it under the terms of the GNU Lesser General Public
>> + * License as published by the Free Software Foundation; either
>> + * version 2.1 of the License, or (at your option) any later version.
>> + *
>> + * FFmpeg is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>> + * Lesser General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU Lesser General Public
>> + * License along with FFmpeg; if not, write to the Free Software
>> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
>> + */
>> +
>> +#ifndef AVUTIL_STATICPOOL_H
>> +#define AVUTIL_STATICPOOL_H
>> +
>> +#include <stdatomic.h>
>> +#include "avassert.h"
>> +#include "mem.h"
>> +
>> +/**
>> + * FF_STATICPOOL allocate memory without av_malloc if possible
>> + * @param size must be 2^n between 64 and 4096
>> + */
>> +#define FF_STATICPOOL_DECLARE(type, size)                                       \
>> +typedef struct type##_StaticPoolWrapper {                                       \
>> +    type        buf;                                                            \
>> +    unsigned    index;                                                          \
>> +    atomic_uint next;                                                           \
>> +} type##_StaticPoolWrapper;                                                     \
>> +                                                                                \
>> +static atomic_uint              type##_staticpool_next;                         \
>> +static atomic_uint              type##_staticpool_last;                         \
>> +static type##_StaticPoolWrapper type##_staticpool_table[size];                  \
>> +                                                                                \
>> +static type *type##_staticpool_malloc(void)                                     \
>> +{                                                                               \
>> +    unsigned val, index, serial, new_val;                                       \
>> +                                                                                \
>> +    av_assert0((size) >= 64 && (size) <= 4096 && !((size) & ((size) - 1)));     \
>> +                                                                                \
>> +    /* use serial, avoid spinlock */                                            \
>> +    /* acquire, so we don't get stalled table[index].next */                    \
>> +    val = atomic_load_explicit(&type##_staticpool_next, memory_order_acquire);  \
>> +    do {                                                                        \
>> +        index  = val & ((size) - 1);                                            \
>> +        serial = val & ~((size) - 1);                                           \
>> +        new_val = atomic_load_explicit(&type##_staticpool_table[index].next, memory_order_relaxed) | (serial + (size)); \
>> +    } while (!atomic_compare_exchange_strong_explicit(&type##_staticpool_next, &val, new_val, \
>> +                                                      memory_order_acquire, memory_order_acquire)); \
>> +                                                                                \
>> +    index = val & ((size) - 1);                                                 \
>> +    if (index)                                                                  \
>> +        return &type##_staticpool_table[index].buf;                             \
>> +                                                                                \
>> +    index = atomic_fetch_add_explicit(&type##_staticpool_last, 1, memory_order_relaxed) + 1; \
>> +    if (index < (size)) {                                                       \
>> +        type##_staticpool_table[index].index = index;                           \
>> +        return &type##_staticpool_table[index].buf;                             \
>> +    }                                                                           \
>> +                                                                                \
>> +    atomic_fetch_add_explicit(&type##_staticpool_last, -1, memory_order_relaxed); \
>> +    return av_malloc(sizeof(type));                                             \
>> +}                                                                               \
>> +                                                                                \
>> +static inline type *type##_staticpool_mallocz(void)                             \
>> +{                                                                               \
>> +    type *ptr = type##_staticpool_malloc();                                     \
>> +    if (ptr)                                                                    \
>> +        memset(ptr, 0, sizeof(*ptr));                                           \
>> +    return ptr;                                                                 \
>> +}                                                                               \
>> +                                                                                \
>> +static void type##_staticpool_free(type *ptr)                                   \
>> +{                                                                               \
>> +    type##_StaticPoolWrapper *entry = (type##_StaticPoolWrapper *) ptr;         \
>> +    unsigned val, serial, index, new_val;                                       \
>> +                                                                                \
>> +    if ((uintptr_t)ptr <= (uintptr_t)(type##_staticpool_table) ||               \
>> +        (uintptr_t)ptr >= (uintptr_t)(type##_staticpool_table + size)) {        \
>> +        av_free(ptr);                                                           \
>> +        return;                                                                 \
>> +    }                                                                           \
>> +                                                                                \
>> +    if (CONFIG_MEMORY_POISONING)                                                \
>> +        memset(&entry->buf, FF_MEMORY_POISON, sizeof(entry->buf));              \
>> +                                                                                \
>> +    val = atomic_load_explicit(&type##_staticpool_next, memory_order_relaxed);  \
>> +    do {                                                                        \
>> +        index  = val & ((size) - 1);                                            \
>> +        serial = val & ~((size) - 1);                                           \
>> +        atomic_store_explicit(&entry->next, index, memory_order_relaxed);       \
>> +        new_val = entry->index | (serial + (size));                             \
>> +    } while (!atomic_compare_exchange_strong_explicit(&type##_staticpool_next, &val, new_val, \
>> +                                                      memory_order_release, memory_order_relaxed)); \
>> +}                                                                               \
>> +                                                                                \
>> +static inline void type##_staticpool_freep(type **ptr)                          \
>> +{                                                                               \
>> +    if (ptr) {                                                                  \
>> +        type##_staticpool_free(*ptr);                                           \
>> +        *ptr = NULL;                                                            \
>> +    }                                                                           \
>> +}                                                                               \
>> +/* FF_STATICPOOL_DECLARE */
>> +
>> +#define FF_STATICPOOL_MALLOC(type) type##_staticpool_malloc()
>> +#define FF_STATICPOOL_MALLOCZ(type) type##_staticpool_mallocz()
>> +#define FF_STATICPOOL_FREE(type, v) type##_staticpool_free(v)
>> +#define FF_STATICPOOL_FREEP(type, v) type##_staticpool_freep(v)
>> +
>> +#endif
>
> I'm against this. Global mutable data is always bad. If you have 2
> uses ffmpeg in the same process, their efficiency might get halved
> because the available pool size is shared by both.

The performance will suffer if number of pool usage is exceeding
number of available pool. In that case, the performance will be equal
with av_malloc.
Just for example, assuming that one usage uses 100 pools, it will
require 10 usages to make the performance suffer.


>                                          The macro mess is
> very ugly. Can we please not define pages of code as macros.

I can write this on some part without macros. The macros mainly is for
hiding the static structs and variables.


>                                          Last but
> not least this is the job of the memory manager, which already does
> fancy things like per thread pools.

Do you mean that AVBufferPool is pointless?
I'm waiting anybody to give benchmark result between staticpool and
newer version of glibc that uses per thread pools. I will drop this
patch if staticpool is slower or on par with glibc.


>                                          I don't trust the atomics use
> either, I'm don't want to have to debug that ever.

Of course, using atomics is more complicated that using mutex (with
benefits that it will be faster when properly used).
But it is not a valid reason to avoid using atomic because it is more
complicated.

Thank's.


More information about the ffmpeg-devel mailing list