Re: [bug-gawk] Assigning RegExp Variables on the Command Line


From: Stephane Chazelas
Subject: Re: [bug-gawk] Assigning RegExp Variables on the Command Line
Date: Mon, 7 Oct 2019 17:24:45 +0100
User-agent: NeoMutt/20171215

2019-10-07 02:58:40 -0600, address@hidden:
[...]
> I think you're overreacting.

I didn't mean to imply the feature is terribly bad and needs to
be removed, just that it is surprising and breaks users'
expectations (it broke mine, at least).

> The probabilities of such a value for a
> shell variable are from low to zero.  If you really need something like that
> you can use
> 
>       gawk -e "BEGIN { myvar = \"$shellvar\" }" ...
> 
> instead of -v.

That's a lot worse, as it can become an arbitrary command injection
vulnerability (for example with shellvar='"; system("reboot"); a="').
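
To make the injection concrete, here is a harmless sketch of the same
pattern. The payload only prints a marker instead of calling system();
the marker string is made up for illustration, and plain awk with an
inline program is used here instead of gawk -e.

```shell
# Harmless stand-in for the payload above: it closes the string literal,
# runs an extra awk statement, then reopens a string to keep the syntax valid.
shellvar='"; print "injected"; a="'

# After shell expansion the awk program reads:
#   BEGIN { myvar = ""; print "injected"; a="" }
out=$(awk "BEGIN { myvar = \"$shellvar\" }")
printf '%s\n' "$out"
```

The value is pasted into the program text, so awk parses it as code
rather than receiving it as data.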

As mentioned in most of the links I referenced, a portable (to
POSIX awks) way to pass data verbatim is to do:

ENVVAR=$shellvar awk 'BEGIN{myvar = ENVIRON["ENVVAR"]}'
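
As a minimal sketch of that approach (the sample value is only an
illustration), the string below contains backslash sequences and
should come out of awk exactly as it went in:

```shell
# Sample value with backslashes that -v would otherwise interpret.
shellvar='a\tb\\c'

# The assignment prefix puts ENVVAR in the environment of this awk call only.
out=$(ENVVAR=$shellvar awk 'BEGIN { myvar = ENVIRON["ENVVAR"]; print myvar }')
printf '%s\n' "$out"
```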

The fact that:

awk -v awkvar="$shellvar" '...'
or
awk ... awkvar="$shellvar"

cannot be used reliably because of the backslash processing is
not well known even though it affects all awk implementations,
so it would be useful to have it in the manual.
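
A small sketch of that backslash processing, comparing -v with ENVIRON
(the sample value and the variable names via_v and via_env are
assumptions for illustration):

```shell
shellvar='a\tb'   # four characters: a, backslash, t, b

# Via -v, awk turns \t into a single tab, so the value shrinks to three characters.
via_v=$(awk -v awkvar="$shellvar" 'BEGIN { print length(awkvar) }')

# Via ENVIRON, the value arrives untouched.
via_env=$(ENVVAR=$shellvar awk 'BEGIN { print length(ENVIRON["ENVVAR"]) }')

printf 'via -v: %s, via ENVIRON: %s\n' "$via_v" "$via_env"
```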

Note that I've asked the Austin Group to add a mention of that
to the POSIX specification, and it's been accepted:
http://austingroupbugs.net/bug_view_page.php?bug_id=1105#c4019

The next revision will have:

    Since <backslash> has a special meaning both in the
    <assignment> option-argument to the -v option and in the
    assignment operand, applications that need to pass strings
    to awk without special interpretation of <backslash> should
    not use these methods but should instead make use of the
    ARGV or ENVIRON array
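
The ARGV route mentioned there could look something like this sketch
(the sample value is an assumption; values starting with "-" or looking
like var=value may need extra care and are not covered here):

```shell
shellvar='a\tb'

# ARGV[1] holds the operand exactly as the shell passed it. Deleting it
# keeps awk from later treating it as an input file or an assignment.
out=$(awk 'BEGIN { myvar = ARGV[1]; delete ARGV[1]; print myvar }' "$shellvar")
printf '%s\n' "$out"
```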

Now, people who are aware of that awk caveat may be tempted to
do:

awk -v var="${var//\\/\\\\}" '...'

to work around the problem (assuming
ksh93/bash/zsh/mksh/yash/busybox-sh shells), but that no longer
works with gawk unless $POSIXLY_CORRECT is set, and IMO,
there's little chance that anyone, even someone well versed in
the specifics of gawk, will discover/realise that by themselves
at the moment by reading the manual.
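
For reference, here is a sketch of that backslash-doubling workaround
on an ordinary value. To stay runnable in a plain POSIX sh, it swaps
the ${var//\\/\\\\} expansion for an equivalent sed substitution; that
swap and the sample value are assumptions of this sketch.

```shell
var='a\tb'

# Double every backslash so that awk's escape processing restores the original.
# sed stands in for ${var//\\/\\\\} here; note that $(...) strips trailing newlines.
escaped=$(printf '%s\n' "$var" | sed 's/\\/\\\\/g')

out=$(awk -v var="$escaped" 'BEGIN { print var }')
printf '%s\n' "$out"
```

The roundtrip holds for a value like this one, but doubling backslashes
does nothing about a value of the form @/.../, which is the gawk case
at issue.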

As you say, in "normal" circumstances, $var is unlikely to
start with @/ and end in /, so though a

awk -v var="${var//\\/\\\\}" '...'

line in a script became a bug after the release of gawk 4.2,
it's a bug unlikely to be triggered.

But putting on my vulnerability-catcher hat here: as with any bug,
it becomes a bigger problem if it can be abused.

For instance, with var='@/(/', gawk crashes, so your script has
a DoS vulnerability; with var='@/3/', you get "3" or "0"
depending on whether the variable is interpreted as a string or
a number, and \ is no longer treated specially... There are
plenty of ways one could try to abuse a piece of code that uses
"-v var=attacker-provided-data", so it's important that people
realise they should not use "-v" here.

-- 
Stephane



