[FFmpeg-devel] [PATCH] aarch64: Use cntvct_el0 as timer register on Android

Zhao Zhili quinkblack at foxmail.com
Fri Jun 14 12:11:48 EEST 2024



> On Jun 13, 2024, at 20:54, Martin Storsjö <martin at martin.st> wrote:
> 
> On Fri, 7 Jun 2024, Martin Storsjö wrote:
> 
>> The default timer register pmccntr_el0 usually requires enabling
>> access with e.g. a kernel module.
>> ---
>> cntvct_el0 has significantly better resolution than
>> av_gettime_relative (while the unscaled nanosecond output of
>> clock_gettime is much higher resolution).
>> 
>> In one tested case, the cntvct_el0 timer has a frequency of 25 MHz
>> (readable via the register cntfrq_el0).
>> ---
>> libavutil/aarch64/timer.h | 9 +++++++++
>> 1 file changed, 9 insertions(+)
>> 
>> diff --git a/libavutil/aarch64/timer.h b/libavutil/aarch64/timer.h
>> index fadc9568f8..966f17081a 100644
>> --- a/libavutil/aarch64/timer.h
>> +++ b/libavutil/aarch64/timer.h
>> @@ -33,7 +33,16 @@ static inline uint64_t read_time(void)
>>    uint64_t cycle_counter;
>>    __asm__ volatile(
>>        "isb                   \t\n"
>> +#if defined(__ANDROID__)
>> +        // cntvct_el0 has lower resolution than pmccntr_el0, but is usually
>> +        // accessible from user space by default.
>> +        "mrs %0, cntvct_el0        "
>> +#else
>> +        // pmccntr_el0 has higher resolution, but is usually not accessible
>> +        // from user space by default (but access can be enabled with a custom
>> +        // kernel module).
>>        "mrs %0, pmccntr_el0       "
>> +#endif
>>        : "=r"(cycle_counter) :: "memory" );
> 
> Zhao, does this implementation seem useful to you? Does it give you better (more accurate, less noisy?) benchmarking numbers on Android, than the fallback based on clock_gettime?

Hi Martin, this works on both Android and macOS, so maybe you can enable it for macOS too.

I have compared the results of this implementation with mach_absolute_time, and it looks like
this implementation has a smaller deviation than mach_absolute_time. I guess the result
would be the same when compared to clock_gettime.
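
For reference, here is a minimal sketch of how the raw counter could be read and
scaled to nanoseconds for such a comparison. It is not the code from the patch:
the sketch_* names are hypothetical, and the non-AArch64 branch (clock_gettime
with CLOCK_MONOTONIC) is only an assumed fallback so that the snippet builds
elsewhere. At the 25 MHz frequency mentioned in the patch, one tick is 40 ns.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L /* for clock_gettime in strict C modes */
#endif
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: read a raw tick count.
 * On AArch64 (GCC/Clang) this reads the virtual counter cntvct_el0,
 * as in the patch. Elsewhere it falls back to clock_gettime, which is
 * an assumption made only to keep the sketch portable. */
static inline uint64_t sketch_read_ticks(void)
{
#if defined(__aarch64__) && defined(__GNUC__)
    uint64_t ticks;
    __asm__ volatile("isb\n\tmrs %0, cntvct_el0" : "=r"(ticks) :: "memory");
    return ticks;
#else
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000u + (uint64_t)ts.tv_nsec;
#endif
}

/* Hypothetical helper: counter frequency in Hz.
 * On AArch64 this is read from cntfrq_el0; the fallback counts
 * nanoseconds, so its frequency is 1 GHz. */
static inline uint64_t sketch_tick_frequency(void)
{
#if defined(__aarch64__) && defined(__GNUC__)
    uint64_t freq;
    __asm__ volatile("mrs %0, cntfrq_el0" : "=r"(freq));
    return freq;
#else
    return 1000000000u;
#endif
}

/* Convert ticks to nanoseconds. Whole seconds and the remainder are
 * scaled separately so the multiplication does not overflow for
 * frequencies up to 1 GHz. Returns 0 if freq is 0. */
static inline uint64_t sketch_ticks_to_ns(uint64_t ticks, uint64_t freq)
{
    if (freq == 0)
        return 0;
    return (ticks / freq) * 1000000000u
         + (ticks % freq) * 1000000000u / freq;
}
```

A measurement would take sketch_read_ticks() before and after the code under
test and pass the difference, together with sketch_tick_frequency(), to
sketch_ticks_to_ns().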

We have Linux perf on Android, and kperf on macOS. If I understand correctly, Linux perf has the
benefit of reducing interference from other processes on the statistical results. I'm not sure about
the benefit of macOS kperf.

> 
> // Martin


