[FFmpeg-devel] Escaping from the escaping madness

Stefano Sabatini stefasab at gmail.com
Thu Nov 15 19:39:18 CET 2012


On date Thursday 2012-11-15 15:11:53 +0100, Nicolas George encoded:
> Hi.
> 
> With filters and options and various parsers, we have reached a point where
> some parts of options need to be escaped four times. This is insane and very
> user-unfriendly. We may want to look for solutions.
> 
> 1. Rework the parsing mechanism to reduce the need for escaping. For
>    example, "foo,bar\\:qux" parsed into "foo" and "bar\:qux", and then the
>    second into "bar" and "qux": it would be possible to rework the parser to
>    avoid the need for doubling the backslash.
>    I am rather against this solution, because it is probably quite complex
>    to implement, and also because it is very program-unfriendly: if you are
>    using ffmpeg directly, you can fiddle with escaping until you find the
>    right level, but if you are writing a program that calls ffmpeg or the
>    library API, you need accurate and predictable rules for escaping level
>    interactions.
> 
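
To make the cost of stacked parsers concrete, here is a minimal sketch in C of escaping a value once per parser level. It assumes a simple scheme where each level treats backslash as the escape character. The name escape_once is hypothetical and is not an FFmpeg function.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of FFmpeg.
 * Escape src for ONE parser level: put a backslash in front of every
 * backslash and every character listed in specials.
 * Returns the length the full result needs; dst is truncated (and still
 * NUL-terminated) if dst_size is too small. */
static size_t escape_once(const char *src, const char *specials,
                          char *dst, size_t dst_size)
{
    size_t n = 0;

    for (; *src; src++) {
        if (*src == '\\' || strchr(specials, *src)) {
            if (n + 1 < dst_size)
                dst[n] = '\\';
            n++;
        }
        if (n + 1 < dst_size)
            dst[n] = *src;
        n++;
    }
    if (dst_size > 0)
        dst[n < dst_size ? n : dst_size - 1] = '\0';
    return n;
}
```

Under this assumed scheme, calling escape_once on "a:b" with ":" as the special character gives a\:b, and calling it again on that result gives a\\\:b. Every level doubles the backslashes that are already there and adds one for the colon, which is why a caller has to know exactly how many levels the value will go through.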

> 2. Read from files. It solves the problem, but it is rather annoying:
>    creating temporary files and ensuring that they are cleaned up is always
>    a hassle, and also a security concern if done improperly; it also raises
>    security issues by itself ("-vf drawtext=text=@</etc/passwd" anyone?).
>    Not a very good general solution IMHO.
> 

> 3. Use environment variables. It can be a very easy and immediate fix: patch
>    av_get_token so that if the string is "optional spaces; dollar sign;
>    alphanumeric characters; optional spaces; end of string or delimiter", it
>    returns the contents of the corresponding environment variable. Then it
>    is only a matter of putting enough (2^n) backslashes in front of the
>    dollar.
> 
>    I can propose a patch rather soon if people think it is a good idea.

Yes, but this introduces another kind of annoyance. I'm not opposing
the idea, just want to point out that there is no perfect solution but
several sub-optimal solutions with strong and weak points.
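
For reference, the matching rule described in point 3 could look roughly like the C sketch below. It is not a patch to av_get_token: parse_env_reference and expand_env_token are hypothetical names, and the sketch assumes the token has already been split off, so only "end of string" is checked, not a delimiter. Accepting '_' in names and returning an empty string for an unset variable are also assumptions, not part of the proposal.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper, not an FFmpeg function.
 * If tok has the form [spaces] '$' NAME [spaces], copy NAME into name
 * and return 1; otherwise return 0. */
static int parse_env_reference(const char *tok, char *name, size_t name_size)
{
    size_t len = 0;

    while (*tok == ' ' || *tok == '\t')
        tok++;
    if (*tok != '$')
        return 0;
    tok++;
    /* Assumption: alphanumeric characters plus '_' form the name. */
    while (isalnum((unsigned char)*tok) || *tok == '_') {
        if (len + 1 >= name_size)
            return 0;               /* name does not fit */
        name[len++] = *tok++;
    }
    if (len == 0)
        return 0;                   /* a lone '$' is not a reference */
    while (*tok == ' ' || *tok == '\t')
        tok++;
    if (*tok != '\0')
        return 0;                   /* trailing text: leave token alone */
    name[len] = '\0';
    return 1;
}

/* Return the environment value if tok is a lone $NAME reference,
 * otherwise return tok unchanged. */
static const char *expand_env_token(const char *tok)
{
    char name[128];
    const char *val;

    if (!parse_env_reference(tok, name, sizeof name))
        return tok;
    val = getenv(name);
    return val ? val : "";          /* assumption: unset means empty */
}
```

With this rule a token such as "  $TEXT " is treated as a reference to the variable TEXT, while "\$TEXT" or "$TEXT and more" are left untouched.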

> 4. Use configurable and/or asymmetric quotes. In Perl, instead of "foo", you
>    can write q/foo/, and then you do not need to escape the double-quote
>    (but you need to escape the slash instead); even better, you can write
>    q{foo}, and then you only need to escape the braces if they do not nest
>    properly. Similar concern: in shell, $(cmd) is preferred to `cmd` because
>    parentheses will be matched and avoid escaping madness for nested command
>    expansions. Another example: in M4, strings are quoted using `...', and
>    can be nested if the quotes properly match.
> 
>    We could adopt a similar scheme for ffmpeg.

This is just moving the target and adding complexity.
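
For the record, the nesting behaviour that point 4 relies on can be sketched in a few lines of C. This is only an illustration under the assumption of a single-byte open/close pair that differ from each other (for example '{' and '}'); read_nested_quote is a hypothetical name, not an existing FFmpeg function.

```c
#include <stddef.h>

/* Hypothetical helper, not part of FFmpeg.
 * s must start with the open character. Copies the text between the
 * outermost open/close pair into dst, keeping properly nested inner
 * pairs as they are. Returns the length of the copied text, or -1 if
 * the quotes are unbalanced or dst is too small.
 * Assumes open != close. */
static long read_nested_quote(const char *s, char open, char close,
                              char *dst, size_t dst_size)
{
    int depth = 0;
    size_t n = 0;

    if (*s != open)
        return -1;
    for (; *s; s++) {
        if (*s == open) {
            if (depth++ == 0)
                continue;           /* skip the outermost opener */
        } else if (*s == close) {
            if (--depth == 0) {     /* matching outermost closer */
                if (n >= dst_size)
                    return -1;
                dst[n] = '\0';
                return (long)n;
            }
        }
        if (n + 1 >= dst_size)
            return -1;
        dst[n++] = *s;
    }
    return -1;                      /* ran out of input before closing */
}
```

Given "{a{b}c} tail" the helper yields a{b}c without any escaping, but an unbalanced brace inside the value would still need some escape rule, which this sketch does not define.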

>    Also, notice that we are in the XXIth century: Unicode exists, and has a
>    hoard of quote pairs that nobody uses:
> 
>    U+2039 SINGLE LEFT-POINTING ANGLE QUOTATION MARK
>    U+203A SINGLE RIGHT-POINTING ANGLE QUOTATION MARK
>    U+275B HEAVY SINGLE TURNED COMMA QUOTATION MARK ORNAMENT
>    U+275C HEAVY SINGLE COMMA QUOTATION MARK ORNAMENT
>    U+275D HEAVY DOUBLE TURNED COMMA QUOTATION MARK ORNAMENT
>    U+275E HEAVY DOUBLE COMMA QUOTATION MARK ORNAMENT
>    U+301D REVERSED DOUBLE PRIME QUOTATION MARK
>    U+301E DOUBLE PRIME QUOTATION MARK
>    
>    Or, simpler:
> 
>    U+00AB LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
>    U+00BB RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
>    U+2018 LEFT SINGLE QUOTATION MARK
>    U+2019 RIGHT SINGLE QUOTATION MARK
>    U+201C LEFT DOUBLE QUOTATION MARK
>    U+201D RIGHT DOUBLE QUOTATION MARK
> 
>    which are sometimes directly available on keyboards (at least X11/fr;
>    the first two are even in latin-1), but that some people may actually be
>    using.

Don't assume this, especially in countries where people rarely type
accented characters (English-speaking ones, for example). Also, such
quote characters are not easy to tell apart visually, so we would end up
with tons of "your quote chars are not the right ones" reports.

>    I am not sure what is the best solution to make something without
>    breaking compatibility too much, but I believe there are some solutions
>    to be found.

Regarding the filtergraph syntax, it was noted that it was *not* meant
to contain generic strings (it was meant to make it convenient to write
*simple* graphs on the command line). We could design an alternative
syntax.

For example:
drawtext(text="drawtext=this is \"text\": with a lot of (funny) chars", fontfile=FreeSerif.ttf)

This would cut down the need for escaping. Yes, it would be much more
verbose, and it would be meant to be read from a file. The alternative
verbose/escape-free syntax could coexist with the current one (we would
just need two different selectable parsers).
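
As a rough idea of what the value parser for such a syntax might do, here is a minimal C sketch. It assumes that inside double quotes only \" and \\ are escapes and every other character (colon, comma, equals sign, parentheses) is literal. That rule and the name read_quoted_value are assumptions for illustration only, nothing of this exists in libavfilter.

```c
#include <stddef.h>

/* Hypothetical helper, not part of FFmpeg.
 * s must start with a double quote. Copies the unescaped value into dst
 * and returns the number of input characters consumed, both quotes
 * included, or -1 on a missing closing quote or a too-small dst. */
static long read_quoted_value(const char *s, char *dst, size_t dst_size)
{
    const char *p = s;
    size_t n = 0;

    if (dst_size == 0 || *p++ != '"')
        return -1;
    while (*p && *p != '"') {
        /* Only \" and \\ are escapes; drop the backslash, keep the char. */
        if (*p == '\\' && (p[1] == '"' || p[1] == '\\'))
            p++;
        if (n + 1 >= dst_size)
            return -1;
        dst[n++] = *p++;
    }
    if (*p != '"')
        return -1;                  /* no closing quote */
    dst[n] = '\0';
    return (long)(p + 1 - s);
}
```

A caller would use the returned count to skip past the value and continue with the next "key=" pair. Since the quoted value is parsed exactly once, the amount of escaping no longer depends on how many separators the surrounding graph syntax uses.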
-- 
FFmpeg = Fantastic and Fabulous Maxi Powerful Enchanting God