[FFmpeg-devel] [PATCH] cpu: Limit the number of auto threads in 32 bit builds
Martin Storsjö
martin at martin.st
Mon Sep 5 14:50:21 EEST 2022
On Mon, 5 Sep 2022, Andreas Rheinhardt wrote:
> Martin Storsjö:
>> Limit the returned value from av_cpu_count to sensible amounts
>> in 32 bit builds.
>>
>> This chosen limit, 64, is somewhat arbitrary - a 32 bit process
>> is capable of creating much more than 64 threads. But in many
>> cases, multiple parts of the encoding pipeline (decoder, filters,
>> encoders) all create a pool of threads, auto sized according to the
>> number of cores.
>>
>> In one failing test, the process had managed to create 506 threads
>> before a pthread_create call failed.
>>
>> In the current set of fate tests, the filter-lavd-scalenorm test
>> seems to be the limiting factor; in a 32 bit build (arm linux,
>> running on an aarch64 kernel), it starts failing with an auto
>> thread count somewhere around 85. Therefore, pick the maximum
>> with some margin below this.
>>
>> This fixes running fate without any manually set number of threads
>> in 32 bit builds on machines with huge numbers of cores.
>> ---
>> libavutil/cpu.c | 6 ++++++
>> 1 file changed, 6 insertions(+)
>>
>> diff --git a/libavutil/cpu.c b/libavutil/cpu.c
>> index 0035e927a5..094bd71d3d 100644
>> --- a/libavutil/cpu.c
>> +++ b/libavutil/cpu.c
>> @@ -233,6 +233,12 @@ int av_cpu_count(void)
>> nb_cpus = sysinfo.dwNumberOfProcessors;
>> #endif
>>
>> +#if SIZE_MAX <= UINT32_MAX
>> + // Avoid running out of memory/address space in 32 bit builds, by
>> + // limiting the number of auto threads.
>> + nb_cpus = FFMIN(nb_cpus, 64);
>> +#endif
>> +
>> if (!atomic_exchange_explicit(&printed, 1, memory_order_relaxed))
>> av_log(NULL, AV_LOG_DEBUG, "detected %d logical cores\n", nb_cpus);
>>
>
> I don't think we should be lying in libavutil/cpu.c.
> We should instead limit the number of threads in the functions that
> actually create threads based upon the return value of av_cpu_count();
> e.g. both frame_thread_encoder.c (limit 64) as well as pthread_frame.c
> and pthread_slice.c (limit 16) already limit these numbers.
> lavu/slicethread.c doesn't seem to do so.
Right, that's probably the more sensible thing to do.
For something like lavu/slicethread.c, it probably makes sense with a
low-ish limit, like 16 or 32 - splitting up an individual frame in more
slices than that probably doesn't make sense as automatic default, right?
// Martin
More information about the ffmpeg-devel
mailing list