[FFmpeg-devel] [PATCH 2/7] checkasm: improve print format

Lynne dev at lynne.ee
Tue Aug 13 19:39:09 EEST 2024


On 13/08/2024 16:03, J. Dekker wrote:
> Port dav1d's checkasm output format to FFmpeg's checkasm, includes
> relative speedups and aligns results.
> 
> Signed-off-by: J. Dekker <jdek at itanimul.li>
> ---
>   tests/checkasm/checkasm.c | 53 +++++++++++++++++++++++++++++++++++----
>   1 file changed, 48 insertions(+), 5 deletions(-)
> 
> diff --git a/tests/checkasm/checkasm.c b/tests/checkasm/checkasm.c
> index f82ee0864f..0095758268 100644
> --- a/tests/checkasm/checkasm.c
> +++ b/tests/checkasm/checkasm.c
> @@ -18,6 +18,31 @@
>    * You should have received a copy of the GNU General Public License along
>    * with FFmpeg; if not, write to the Free Software Foundation, Inc.,
>    * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
> + *
> + * Copyright © 2018, VideoLAN and dav1d authors
> + * Copyright © 2018, Two Orioles, LLC
> + * All rights reserved.
> + *
> + * Redistribution and use in source and binary forms, with or without
> + * modification, are permitted provided that the following conditions are met:
> + *
> + * 1. Redistributions of source code must retain the above copyright notice, this
> + *    list of conditions and the following disclaimer.
> + *
> + * 2. Redistributions in binary form must reproduce the above copyright notice,
> + *    this list of conditions and the following disclaimer in the documentation
> + *    and/or other materials provided with the distribution.
> + *
> + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
> + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
> + * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
> + * DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR
> + * ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
> + * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
> + * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
> + * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
> + * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
>    */
>   
>   #include "config.h"
> @@ -575,6 +600,16 @@ static int measure_nop_time(void)
>       return nop_sum / 500;
>   }
>   
> +static inline double avg_cycles_per_call(const CheckasmPerf *const p)
> +{
> +    if (p->iterations) {
> +        const double cycles = (double)(10 * p->cycles) / p->iterations - state.nop_time;
> +        if (cycles > 0.0)
> +            return cycles / 4.0; /* 4 calls per iteration */
> +    }
> +    return 0.0;
> +}
> +
>   /* Print benchmark results */
>   static void print_benchs(CheckasmFunc *f)
>   {
> @@ -584,17 +619,25 @@ static void print_benchs(CheckasmFunc *f)
>           /* Only print functions with at least one assembly version */
>           if (f->versions.cpu || f->versions.next) {
>               CheckasmFuncVersion *v = &f->versions;
> +            const CheckasmPerf *p = &v->perf;
> +            const double baseline = avg_cycles_per_call(p);
> +            double decicycles;
>               do {
> -                CheckasmPerf *p = &v->perf;
>                   if (p->iterations) {
> -                    int decicycles = (10*p->cycles/p->iterations - state.nop_time) / 4;
> +                    p = &v->perf;
> +                    decicycles = avg_cycles_per_call(p);
>                       if (state.csv) {
>                           const char sep = state.tsv ? '\t' : ',';
> -                        printf("%s%c%s%c%d.%d\n", f->name, sep,
> +                        printf("%s%c%s%c%.1f\n", f->name, sep,
>                                  cpu_suffix(v->cpu), sep,
> -                               decicycles / 10, decicycles % 10);
> +                               decicycles / 10.0);
>                       } else {
> -                        printf("%s_%s: %d.%d\n", f->name, cpu_suffix(v->cpu), decicycles/10, decicycles%10);
> +                        const int pad_length = 10 + 50 -
> +                            printf("%s_%s:", f->name, cpu_suffix(v->cpu));
> +                        const double ratio = decicycles ?
> +                            baseline / decicycles : 0.0;
> +                        printf("%*.1f (%5.2fx)\n", FFMAX(pad_length, 0),
> +                            decicycles / 10.0, ratio);
>                       }
>                   }
>               } while ((v = v->next));

How does it improve it?

You're only interested in the last X iterations, after the cache has
fully warmed up and is out of the equation. Averaging the results of all
iterations would also benchmark the memory layout of the system, but
only the cycles are of interest.
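
To illustrate the point about warm-up, here is a minimal sketch of
averaging only the last X samples. It is not taken from checkasm: the
helper name avg_last_iterations and the per-iteration cycles array are
hypothetical, and it assumes the cycle counts were recorded one per
iteration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of checkasm: returns the mean of the
 * last `tail` entries of `cycles`, so that the earlier (cold-cache)
 * iterations do not influence the result. */
static double avg_last_iterations(const uint64_t *cycles, size_t n, size_t tail)
{
    uint64_t sum = 0;

    if (!n || !tail)
        return 0.0;
    if (tail > n)
        tail = n; /* fewer samples than requested: use all of them */

    for (size_t i = n - tail; i < n; i++)
        sum += cycles[i];

    return (double)sum / (double)tail;
}
```

With made-up samples { 900, 400, 100, 102, 98, 100 }, averaging all six
gives about 283.3 cycles, while averaging only the last four gives
100.0, which is closer to the steady-state cost of the function.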