[FFmpeg-devel] [PATCH V3 1/2] configure: sort decoder/encoder/filter/... names in alphabet order

Tue Apr 23 15:13:39 EEST 2019

>  print_in_columns() {
> -    cols=$(expr $ncols / 24)
> -    cat | tr ' ' '\n' | sort | pr -r "-$cols" -w $ncols -t
> +    col_width=24
> +    cols=$(expr $ncols / $col_width)
> +    rows=$(expr $(expr $# + $cols - 1) / $cols)
> +    for row in $(seq $rows); do
> +        index=$row
> +        line=""
> +        fmt=""
> +        for col in $(seq $cols); do
> +            if [ $index -le $# ]; then
> +                eval line='"$line "${'$index'}'
> +                fmt="$fmt%-${col_width}s"
> +            fi
> +            index=$(expr $index + $rows)
> +        done
> +        printf "$fmt\n" $line
> +    done | sed 's/ *$//'
> }

The new code is relatively slow.

On Linux it adds ~1.5s to the configure time in both bash and dash
on my system, which is roughly an additional 20% of configure run time.
On Windows I'd expect this to add a minute or more (I didn't test).

A few things to consider:

- print_in_columns iterates over a _lot_ of values - hundreds or more.

- Subshells - `$(cmd ...)` - are relatively expensive, especially in hot
  (inner) loops, and especially on Windows.

- $(expr ...) can typically be replaced with shell arithmetic - $((...)).
  Your part 2 already uses it, and regardless it's been used in configure
  before (in pushvar and popvar), so you should use it where possible.

- `for col in $(seq $cols)` need not invoke `seq` on each iteration to always
  produce the same output. You can capture its output once on startup.

- All the places which use the new print_in_columns want the result sorted,
  and each one duplicates the pipe through `tr` and `sort`. The original
  version already did `cat | tr ' ' '\n' | sort` in one place, and you can
  simply replace it with `set -- $(cat | tr ' ' '\n' | sort)` to get the
  values as positional parameters. This will also keep the interface the same
  as before and will reduce the patch size.
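
To illustrate the points above, here is a rough sketch (not tested inside
configure itself) that keeps the column-major layout of the patch. It reads
the words from stdin like the original did, does the `tr | sort` once, and
uses $((...)) with plain `while` counters instead of `expr` and `seq`. It
assumes, as in configure, that $ncols is already set by the caller and is at
least as large as col_width.

```shell
print_in_columns() {
    # Single subshell: the sorted words become the positional parameters.
    set -- $(tr ' ' '\n' | sort)
    col_width=24
    cols=$(($ncols / $col_width))
    rows=$((($# + $cols - 1) / $cols))
    row=1
    while [ $row -le $rows ]; do
        index=$row
        line=""
        fmt=""
        col=1
        # index only grows, so we can stop at the first one past $#.
        while [ $col -le $cols ] && [ $index -le $# ]; do
            eval "line=\"\$line \${$index}\""
            fmt="$fmt%-${col_width}s"
            index=$(($index + $rows))
            col=$(($col + 1))
        done
        printf "$fmt\n" $line
        row=$(($row + 1))
    done | sed 's/ *$//'
}
```

With this shape the callers can keep doing `echo $list | print_in_columns`
as before, and nothing is forked inside either loop.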

And finally, is there a good reason to sort the results in columns and not rows?
As you rightfully mentioned, outputs can span several pages, and when reading
it on screen it might be more convenient to read row by row than column by
column. This could also simplify the new code a lot.
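
As a rough sketch of how much simpler the row-by-row variant could be: POSIX
printf reuses its format string until all arguments are consumed, so building
one row's format once is enough. The name print_in_rows is only for
illustration here, and $ncols is again assumed to be set by the caller and to
be at least col_width.

```shell
print_in_rows() {
    # Sorted words become the positional parameters.
    set -- $(tr ' ' '\n' | sort)
    col_width=24
    cols=$(($ncols / $col_width))
    # Build the format for one full row, e.g. "%-24s%-24s%-24s".
    fmt=""
    col=0
    while [ $col -lt $cols ]; do
        fmt="$fmt%-${col_width}s"
        col=$(($col + 1))
    done
    # printf repeats the format per row; missing trailing args print as empty.
    if [ $# -gt 0 ]; then
        printf "$fmt\n" "$@" | sed 's/ *$//'
    fi
}
```

Here the only per-call processes are `tr`, `sort` and `sed`, regardless of how
many names are printed.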