[FFmpeg-devel] [PATCH 4/7] checkasm: use pointers for start/stop functions
Lynne
dev at lynne.ee
Sat Jul 15 20:43:26 EEST 2023
Jul 15, 2023, 10:26 by remi at remlab.net:
> On Saturday, 15 July 2023, 11.05.51 EEST Lynne wrote:
>
>> Jul 14, 2023, 20:29 by remi at remlab.net:
>> > This makes all calls to the bench start and stop functions via
>> > function pointers. While the primary goal is to support run-time
>> > selection of the performance measurement back-end in later commits,
>> > this has the side benefit of containing platform dependencies in
>> > checkasm.c, keeping them out of checkasm.h.
>> > ---
>> >
>> > tests/checkasm/checkasm.c | 33 ++++++++++++++++++++++++++++-----
>> > tests/checkasm/checkasm.h | 31 ++++---------------------------
>> > 2 files changed, 32 insertions(+), 32 deletions(-)
>>
>> Not sure I agree with this commit, the overhead can be detectable,
>> and we have a lot of small functions with runtime a few times that
>> of a null function call.
>>
>
> I don't think the function call is ever null. The pointers are left NULL only
> if none of the backends initialise. But then, checkasm will bail out and exit
> before we try to benchmark anything anyway.
>
> As for the real functions, they always do *something*. None of them "just
> return 0".
>
I meant a no-op function call to measure the overhead of function
calls themselves, complete with all the ABI stuff.

>> Can you store the function pointers out of the loop to reduce
>> the derefs needed?
>>
>
> Taking just the two loads out of the loop should be feasible, but it seems
> rather vain. You will still have the overhead of the indirect function call,
> the function, and most importantly in the case of Linux perf and MacOS kperf,
> the system calls.
>
> The only ways to avoid the indirect function calls are to use IFUNC (tricky and
> not portable), or to make horrible macros to spawn one bench loop for each
> backend.
>
> In the end, I think we should rather aim for as constant time as possible,
> rather than as fast as possible, so that the nop loop can estimate the
> benchmarking overhead as well as possible. In this respect, I think it is
> actually marginally better *not* to cache the function pointers in local
> variables, which could end up spilled on the stack, or not, depending on local
> compiler optimisations for any given test case.
>
I disagree, uninlining the timer fetches adds another source of
inconsistency. It may be messy, but I think accuracy here is more
important than cleanliness, especially as it's a development tool.
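
For concreteness, here is a minimal sketch of the two loop shapes being
discussed. It is not the checkasm code: timer_fn, read_ns, fake_tick,
bench_start, bench_stop, nop, bench_reload and bench_cached are all
hypothetical names chosen for illustration, and read_ns assumes a C11
libc that provides timespec_get().

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical timer back-end signature (an assumption, not checkasm's). */
typedef uint64_t (*timer_fn)(void);

/* Back-end 1: wall-clock nanoseconds via C11 timespec_get(). */
uint64_t read_ns(void)
{
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (uint64_t)ts.tv_sec * UINT64_C(1000000000) + (uint64_t)ts.tv_nsec;
}

/* Back-end 2: deterministic fake timer, advances by one tick per call. */
uint64_t fake_tick(void)
{
    static uint64_t n;
    return n++;
}

/* Run-time selectable start/stop hooks, stored as global pointers. */
timer_fn bench_start = read_ns;
timer_fn bench_stop  = read_ns;

/* Empty function: timing it estimates the call and timer overhead. */
void nop(void)
{
}

/* Variant A: the global pointers are loaded again on every iteration. */
uint64_t bench_reload(void (*func)(void), int runs)
{
    uint64_t total = 0;

    assert(func && runs > 0);
    for (int i = 0; i < runs; i++) {
        uint64_t t = bench_start();
        func();
        total += bench_stop() - t;
    }
    return total;
}

/* Variant B: the two loads are hoisted out of the loop into locals. */
uint64_t bench_cached(void (*func)(void), int runs)
{
    const timer_fn start = bench_start;
    const timer_fn stop  = bench_stop;
    uint64_t total = 0;

    assert(func && runs > 0);
    for (int i = 0; i < runs; i++) {
        uint64_t t = start();
        func();
        total += stop() - t;
    }
    return total;
}
```

Timing nop with either variant gives an estimate of the fixed overhead,
which can then be subtracted from the figure for a real function. Both
variants still make indirect calls into the timer; only the two pointer
loads differ, so any gap between them on a given compiler is what caching
would buy.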