[FFmpeg-devel] [PATCH] cpu: Limit the number of auto threads in 32 bit builds

Andreas Rheinhardt andreas.rheinhardt at outlook.com
Mon Sep 5 14:54:20 EEST 2022


Martin Storsjö:
> On Mon, 5 Sep 2022, Andreas Rheinhardt wrote:
> 
>> Martin Storsjö:
>>> Limit the returned value from av_cpu_count to sensible amounts
>>> in 32 bit builds.
>>>
>>> This chosen limit, 64, is somewhat arbitrary - a 32 bit process
>>> is capable of creating much more than 64 threads. But in many
>>> cases, multiple parts of the encoding pipeline (decoder, filters,
>>> encoders) all create a pool of threads, auto sized according to the
>>> number of cores.
>>>
>>> In one failing test, the process had managed to create 506 threads
>>> before a pthread_create call failed.
>>>
>>> In the current set of fate tests, the filter-lavd-scalenorm test
>>> seems to be the limiting factor; in a 32 bit build (arm linux,
>>> running on an aarch64 kernel), it starts failing with an auto
>>> thread count somewhere around 85. Therefore, pick the maximum
>>> with some margin below this.
>>>
>>> This fixes running fate without any manually set number of threads
>>> in 32 bit builds on machines with huge numbers of cores.
>>> ---
>>>  libavutil/cpu.c | 6 ++++++
>>>  1 file changed, 6 insertions(+)
>>>
>>> diff --git a/libavutil/cpu.c b/libavutil/cpu.c
>>> index 0035e927a5..094bd71d3d 100644
>>> --- a/libavutil/cpu.c
>>> +++ b/libavutil/cpu.c
>>> @@ -233,6 +233,12 @@ int av_cpu_count(void)
>>>      nb_cpus = sysinfo.dwNumberOfProcessors;
>>>  #endif
>>>
>>> +#if SIZE_MAX <= UINT32_MAX
>>> +    // Avoid running out of memory/address space in 32 bit builds, by
>>> +    // limiting the number of auto threads.
>>> +    nb_cpus = FFMIN(nb_cpus, 64);
>>> +#endif
>>> +
>>>      if (!atomic_exchange_explicit(&printed, 1, memory_order_relaxed))
>>>          av_log(NULL, AV_LOG_DEBUG, "detected %d logical cores\n",
>>> nb_cpus);
>>>
>>
>> I don't think we should be lying in libavutil/cpu.c.
> 
>> We should instead limit the number of threads in the functions that
>> actually create threads based upon the return value of av_cpu_count();
>> e.g. both frame_thread_encoder.c (limit 64) as well as pthread_frame.c
>> and pthread_slice.c (limit 16) already limit these numbers.
>> lavu/slicethread.c doesn't seem to do so.
> 
> Right, that's probably the more sensible thing to do.
> 
> For something like lavu/slicethread.c, it probably makes sense with a
> low-ish limit, like 16 or 32 - splitting up an individual frame in more
> slices than that probably doesn't make sense as automatic default, right?
> 

Yes. E.g. MAX_AUTO_THREADS in pthread_slice.c is 16.

- Andreas


More information about the ffmpeg-devel mailing list