Misunderstood, bug or limitation of indexing ENVIRON with "\\1" in gensub() ?
From: Wolfgang Laun
Subject: Misunderstood, bug or limitation of indexing ENVIRON with "\\1" in gensub() ?
Date: Sat, 4 Apr 2020 06:41:20 +0200

If I understand correctly, you are trying to replace each placeholder such as
    %abc%
in an input line with the value of the environment variable abc.

This cannot be done with a single gsub() or gensub() call. The backreference
\\1 is only meaningful inside the replacement string itself, and awk evaluates
the arguments before the call: ENVIRON["\\1"] is looked up with the literal
key \1, which is not set, so the replacement is the empty string. What you
need is an additional evaluation of the expression "ENVIRON[" "\\1" "]" for
each match, with the result inserted in place of the %...% placeholder. If awk
had eval, you could write:

    print gensub(/%([_A-Z]+)%/, eval("ENVIRON[\"\\1\"]"), "g")   # this is not awk

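To make the order of evaluation visible, here is a small sketch in portable awk, wrapped in a shell function. The function name show_eval_order is made up for this illustration, and gsub() stands in for gensub() so that the sketch also runs under mawk. It prints the key that ENVIRON actually receives and the replacement that results from it.

```shell
# Hypothetical helper name, used only for this illustration.
show_eval_order() {
    awk '{
        key  = "\\1"            # a plain two-character string: backslash, 1
        repl = ENVIRON[key]     # evaluated now; no variable is named \1, so ""
        printf "key=[%s] repl=[%s]\n", key, repl
        gsub(/%[_A-Z]+%/, repl) # same effect as passing ENVIRON["\\1"] inline
        printf "line=[%s]\n", $0
    }'
}
```

Feeding it the line from the report, for example with `echo 'repository=%MY_URL% # %COMMENT%' | show_eval_order`, shows that the replacement is already empty before any match takes place, whatever MY_URL and COMMENT are set to.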
You might use Perl, where the substitution operator (s///) accepts the flag
'e', which requests that the replacement be evaluated as an expression whose
result becomes the inserted text, i.e., an implied eval:

    echo 'repository=%MY_URL% # %COMMENT%' | \
    COMMENT="Set repository URL" MY_URL="http://www.example.com" \
    perl -e 'while( <> ){s/%([_A-Z]+)%/$ENV{$1}/ge; print;}'

An awk version requires a user-defined function:
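
    # gawk only: the three-argument match() stores the captured groups in chunk[].
    function envsub(text,    chunk) {
        # Replace one %NAME% placeholder per iteration until none remain.
        while (match(text, /([^%]*)%([_A-Z]+)%(.*)/, chunk) != 0) {
            # Note: sub() treats "&" and "\" in the value as special.
            sub("%" chunk[2] "%", ENVIRON[chunk[2]], text)
        }
        return text
    }
    { print envsub($0) }
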
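For awks without the three-argument match(), the same idea can be sketched with only RSTART and RLENGTH. This is a variation under stated assumptions, not the function above: the shell wrapper name envsub_portable is made up, the output is built left to right so that inserted values are never rescanned, values are copied verbatim (a "&" in a value stays a "&"), and placeholders whose variable is not set are left untouched, which is a design choice of this sketch.

```shell
# Hypothetical wrapper name; the awk program uses POSIX features only.
envsub_portable() {
    awk '
    # Expand every %NAME% placeholder from the environment in one pass.
    function envsub(text,    out, name, value) {
        out = ""
        while (match(text, /%[_A-Z]+%/)) {
            name = substr(text, RSTART + 1, RLENGTH - 2)
            if (name in ENVIRON)
                value = ENVIRON[name]
            else
                value = "%" name "%"   # assumption: keep unknown placeholders
            # Copy the text before the placeholder, then the value verbatim.
            out  = out substr(text, 1, RSTART - 1) value
            text = substr(text, RSTART + RLENGTH)
        }
        return out text
    }
    { print envsub($0) }'
}
```

It is used like the other commands in this thread: pipe the template through envsub_portable with the variables exported in the environment.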
Wolfgang
On Fri, 3 Apr 2020 at 18:35, Vincent Férotin <address@hidden> wrote:
> Hi gawk maintainers!
>
> New to awk/gawk/mawk, I'd like to describe what could be a bug, or at
> least a limitation, that I encountered in these tools in basic usage.
> Perhaps what follows is not a bug but a misunderstanding on my part as a
> newcomer?
> Anyway, thanks in advance for reading this...
>
> V.F.
>
>
> TL;DR
> =====
>
> Using [gm]awk as a templating/macro engine, the following shell commands
> do not output what one might expect:
>
> $ echo "repository=%MY_URL% # %COMMENT%" | \
>     COMMENT="Set repository URL" MY_URL="http://www.example.com" \
>     awk '{print gensub(/%([_A-Z]+)%/, ENVIRON["\\1"], "g")}'
> repository= #
>
> or roughly equivalent:
>
> $ echo "repository=MY_URL # COMMENT" | \
>     COMMENT="Set repository URL" MY_URL="http://www.example.com" \
>     awk '{gsub(/[_A-Z]+/, ENVIRON["&"]); print $0}'
> repository= #
>
> It seems that "\\1" in gensub() (or "&" in gsub()) is not replaced by the
> text that the regexp captured, at least when it is used as an index into
> ENVIRON. The expected output, IMHO and as far as I understand, is:
>
> repository=http://www.example.com # Set repository URL
>
>
> Versions tested
> ===============
>
> * gawk:
> - 4.1.4 (Ubuntu 18.04 Bionic)
> - 4.2.1 (Ubuntu 19.10 Eoan)
> - 5.0.1 (Ubuntu 20.04 Focal)
> * mawk:
> - 3.3 (Ubuntu 18.04 Bionic & 19.10 Eoan)
> - 3.4.20200120 (Ubuntu 20.04 Focal)
>
>
> Usage
> =====
>
> In order to provision some virtual machines with Bash scripts, I used 'sed'
> to replace some paths (strings) or configuration file contents, but it fails
> in some cases, where the replacement string contains characters that 'sed'
> could interpret as metacharacters (such as "/").
>
> I then tried using 'm4', where the effective values to replace placeholders
> are available as environment variables.
> But the Debian/Ubuntu packaging seems to have some limitations, notably the
> disabled '-W, --word-regexp=REGEXP' option (expected to allow setting the
> placeholder regexp, e.g. "%([_A-Z]+)%").
> Using m4 as is, with the configuration chosen by the packaging maintainers,
> is feasible:
>
> $ echo "changecom\nrepository=MY_URL # COMMENT" | \
>     m4 -DMY_URL="$MY_URL" -DCOMMENT="$COMMENT"
>
> repository=http://www.example.com # Set repository URL
>
> but I would like to be able to choose more robust placeholder delimiters
> (I started here by prefixing and suffixing them with "%", but I could also
> have chosen another format, such as the more common "${var}").
>
> It seems that this need also exists beyond my own naïve usage; see for
> example:
> - https://stackoverflow.com/questions/415677/how-to-replace-placeholders-in-a-text-file
> - https://stackoverflow.com/questions/2914220/bash-templating-how-to-build-configuration-files-from-templates-with-bash
>
> Note that, apart from a single answer (out of a total of 40 (16+24) at the
> time of this writing):
> - https://stackoverflow.com/questions/2914220/bash-templating-how-to-build-configuration-files-from-templates-with-bash#answer-9590655
> no valid answer uses awk or one of its derivatives!
> (NB: This specific answer could probably suffice for my needs...)
>
>
> Evidence that `gensub(..., ENVIRON["\\1"])` should work
> =======================================================
>
> Using "\\1" in the replacement string of gensub() works as expected:
>
> $ echo "repository=%MY_URL% # %COMMENT%" | \
>     awk '{print gensub(/%([_A-Z]+)%/, "( \\1 )", "g")}'
> repository=( MY_URL ) # ( COMMENT )
>
> Passing the desired variable name directly to ENVIRON also works:
>
> $ echo "repository=%MY_URL%" | MY_URL="http://www.example.com" \
>     awk '{print gensub(/%MY_URL%/, ENVIRON["MY_URL"], "g")}'
> repository=http://www.example.com
>
>
> `ENVIRON` seems to not accept other expressions as index
> ========================================================
>
> Note also that trying to rewrite the awk script from the StackOverflow
> answer mentioned above,
> https://stackoverflow.com/questions/2914220/bash-templating-how-to-build-configuration-files-from-templates-with-bash#answer-9590655
> that is:
>
> 'match($0, "[$]{.*}") {var = substr($0, (RSTART + 2), (RLENGTH - 3)); gsub("[$]{"var"}", ENVIRON[var])}1'
>
> into a form more condensed and adapted to my use case:
>
> '{gensub(/%([_A-Z]+)%/, ENVIRON[substr("\\1", 1, (length("\\1") - 2))])}' # gawk
> '{gsub(/%[_A-Z]+%/, ENVIRON[substr("&", 1, (length("&") - 1))]); print $0}' # mawk
>
> does not work either.
>
>
> Search for previous existing occurrences of `gensub(..., ENVIRON["\\1"])`
> ========================================================================
>
> No occurrences of ``ENVIRON[`` with an index other than a plain string or
> a variable were found in:
>
> * `sed and awk Pocket Reference` by Arnold Robbins (O'Reilly, 2002, 2nd ed.)
> http://shop.oreilly.com/product/9780596003524.do
> * `sed & awk` by Dale Dougherty & Arnold Robbins (O'Reilly, 1997, 2nd ed.)
> http://shop.oreilly.com/product/9781565922259.do
> * `Effective awk Programming` by Arnold Robbins (O'Reilly, 2015, 4th ed.)
> http://shop.oreilly.com/product/0636920033820.do
> * `GNU awk - awesome one-liners` by Sundeep Agarwal (version 0.7)
> https://learnbyexample.github.io/books/
> (pointed recently in HackerNews:
> https://news.ycombinator.com/item?id=22758217 )
> * `bug-gawk` archives
> https://lists.gnu.org/archive/html/bug-gawk/
>
>