[FFmpeg-devel] [PATCH V5 1/2] configure: sort decoder/encoder/filter/... names in alphabet order

Tue May 7 15:22:25 EEST 2019

> patch1 (awk) configure: print_in_columns: replace pr with awk version: http://ffmpeg.org/pipermail/ffmpeg-devel/2019-May/243380.html
> patch2 (shell) configure: sort decoder/encoder/filter/... names in alphabet
> order (v5 as posted in this thread)
> 
> - Why do you prefer patch1 over patch2?

In general, I agree with your reasoning, but not entirely sure about the
conclusion and its implications.

Few exceptions first:

> 1. Statistics
>     * patch1: 1 file changed, 16 insertions(+), 2 deletions(-)
>     * patch2: 1 file changed, 24 insertions(+), 2 deletions(-)

As you said later, and I agree, it's a weak argument, as both versions
are the same code and complexity, but differ mostly in style and language
syntax.

> 2. patch2 uses lots of variables (which is good in itself), but those
>   should be local and we cannot express that portably in shell.
>   In turn we have raised potential for clashes now or in the future.

This is incorrect. A subshell provides identical isolation as an external
process, it's fully portable, it can be used here, and it's not slower than
an external process, e.g. `print_in_columns() ( ... )`.

> 4. Depending on the input (stdin) unexpected things can happen in the
>   "eval" and "set" lines. One example is given in the comment.

"Unexpected" is a bit general. Specifically, the "eval" line does not
execute arbitrary inputs, and neither does the "set" line.

The only potential surprise is globbing, as documented, and the technique
of expanding variables unquoted - like here, is used throughout configure.

Look no further than 100% of the very places which pipe the input stream
into print_in_columns, and multiple times earlier than that.

It's not great of course, but it's not worse than elsewhere, and it's at
least documented. Some day maybe someone would improve it.

I generally agree with rest of your comments/arguments, including:

> 1. Statistics ... (patch2 is more verbose).
> 3. patch2 uses eval combined with non-trivial quoting, which is hard
>   to read and hard to grasp quickly. It's not wrong, but it's not as
>   easy and straight forward as in patch1.
> 5. patch2 to uses shell for the parts easily expressed in shell and
>   uses awk for the parts that are non-trivial in shell.
> 6. The awk implementation should be easier to read and understand for
>   the majority of readers/developers.

which basically say that shell is hard, and, pardon the pun, awkward.

And also I agree with:

> Shell though is not really suited for implementing all kinds of algorithms.
> OTOH shell is an excellent choice for starting processes that communicate
> via stdin/stdout/stderr and plugging them together.

and with

> Shell is best when coordinating the execution of other programs.

And similar assertions, however, let's keep the following facts in mind:

- We both agree that we should keep to shell except where an external tool
  provides some meaningful value over shell code.

- An external tool only helps by being more suitable for the task than
  shell code (readability, portability, correctness, performance, etc),
  but it doesn't help with isolation more than a subshell would.

- Shell is capable in a portable way, including for some tasks which can
  be considered "programming".

- configure is currently written in shell, including some non trivial
  functionality, and doesn't use external script engines for "scriptlets".

- Being a shell script, it already requires competence and reasonable
  understanding of shell, including quoting, expansions, etc.

- Maintaining code in more than one scripting system does carry a cost.

- Yes, shell is a lousy programming language.

I really don't care if this specific patch ends up as shell or awk. What
I do care about is being consistent, and understanding the implications.

IMHO, because I'm not convinced that the awk version provides a meaningful
additional value other than not being shell, if we use the awk version then
it sets a precedence of using awk (or maybe also other languages) for small
scripts even if they're basically the same complexity and code as they
would be in shell.

It's not necessarily a bad precedence (even if I wouldn't do that myself),
because I do agree that not being shell code does have value, but I think
it's important to acknowledge that it does set such precedence.

If there was already such practice of using awk scriptlets where possible
then I wouldn't have said anything. So I mainly care about the precedence.

I guess it comes down to a judgment call on whether or not this awk script
can be considered meaningfully more suitable for the task than shell.

I think not, but clearly it's only my subjective assessment.

Regards,
Avi