Re: Misunderstood, bug or limitation of indexing ENVIRON with "\\1" in g


From: Vincent Férotin
Subject: Re: Misunderstood, bug or limitation of indexing ENVIRON with "\\1" in gensub() ?
Date: Mon, 6 Apr 2020 17:59:46 +0200

Hi Arnold,

Thank you all for the explanations and arguments
about what gawk could or could not become. :-)
It's much clearer in my newbie's mind now.

Thank you again for taking the time to answer my rather long message so thoroughly!

V.F.

On Mon, 6 Apr 2020 at 15:20, <address@hidden> wrote:
>
> Gawk goes well beyond POSIX in many ways, but there's no intent to
> go any further.
>
> So your answer is correct, there won't be an eval facility or
> similar added.  Anyone who really needs that should use the shell,
> or perl, or something else that provides it.
>
> Thanks,
>
> Arnold
>
> Wolfgang Laun <address@hidden> wrote:
>
> > Well, I'm not a developer, but I understand that gawk is not intended to go
> > far beyond what POSIX has standardized.
> >
> > String interpolation is merely syntactic sugar, as you can (given variables
> > "is" and "interpolated") even now write
> >    "This " is " an " interpolated " string."
> >
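
The concatenation point above can be checked with a minimal sketch. The
function name interpolate_demo is hypothetical and used only for
illustration; the two awk variables are the ones named in the example.

```shell
# Hypothetical helper: builds the sentence by plain concatenation.
interpolate_demo() {
    awk 'BEGIN {
        is = "is"
        interpolated = "interpolated"
        # Adjacent expressions are concatenated; no special syntax needed.
        print "This " is " an " interpolated " string."
    }'
}
interpolate_demo
```

This should work in any POSIX awk, since string concatenation by
juxtaposition is part of the base language.
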
> > Implementation of a workalike to the Shell's or Perl's eval() function is
> > an entirely different thing. Incremental compilation requires that the
> > runtime is capable of arbitrarily switching between parsing and executing.
> > Nobody will want to open this can of worms.
> >
> > Wolfgang
> >
> >
> > On Mon, 6 Apr 2020 at 14:22, Vincent Férotin <address@hidden>
> > wrote:
> >
> > > Hey Wolfgang, thank you very much for the detailed answer!
> > >
> > > You understood my needs perfectly, and I greatly appreciate your
> > > proposed solutions. :-)
> > >
> > > Beyond my small and rather anecdotal needs, and while I understand that
> > > awk in its current state does not work as I previously expected, one
> > > minor intent of my previous message to the bug-gawk mailing list was to
> > > ask you, the developers, whether such a feature is (or is not) desirable
> > > for a future version of awk. That is, should a future awk support
> > > string interpolation and in-place evaluation, and interpret all "\\1"
> > > occurrences in a g(en)sub context rather than only in the replacement
> > > string?
> > >
> > > Anyway, thanks again!
> > >
> > > V.F.
> > >
> > >
> > >
> > > On Sat, 4 Apr 2020 at 06:41, Wolfgang Laun <address@hidden> wrote:
> > > >
> > > >
> > > > If I understand everything correctly, you are trying to replace some
> > > >    %abc%
> > > > in an input line by the value of the environment variable abc.
> > > >
> > > > This cannot be done using a single gsub, because the backref \\1 only
> > > > works within a string literal that is to be the complete replacement
> > > > text. What you need is an additional evaluation, of the expression
> > > > "ENVIRON[" "\\1" "]", to be inserted in place of the %-% placeholder.
> > > > If awk had eval, you could write:
> > > >
> > > >      print gensub(/%([_A-Z]+)%/, eval("ENVIRON[\"\\1\"]"), "g")   # this is not awk
> > > >
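
The reason a single call cannot work is that awk evaluates the arguments
before the function runs, so the ENVIRON lookup happens once, with the
literal index string, before any match exists. A minimal sketch of this,
using the gsub()/"&" variant so that it runs in any POSIX awk; the function
name eager_lookup_demo is hypothetical, and the sketch assumes no
environment variable is literally named "&":

```shell
# Hypothetical demo: the replacement is computed before gsub() is called.
eager_lookup_demo() {
    echo 'repository=MY_URL' |
        MY_URL='http://www.example.com' awk '{
            repl = ENVIRON["&"]      # evaluated first: an empty string
            gsub(/[_A-Z]+/, repl)    # every match is replaced by ""
            print
        }'
}
eager_lookup_demo
```

The output is "repository=", the same empty substitution seen in the
original report, even though MY_URL is set in the environment.
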
> > > > You might use Perl, where substitution (s///) has a flag 'e', requesting
> > > > the replacement to be evaluated as an expression to become the text to be
> > > > inserted, i.e., an implied eval.
> > > >
> > > > echo 'repository=%MY_URL%  # %COMMENT%'  | \
> > > >    COMMENT="Set repository URL" MY_URL="http://www.example.com" \
> > > >    perl -e 'while( <> ){s/%([_A-Z]+)%/$ENV{$1}/ge; print;}'
> > > >
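> > > > An awk version requires a user-defined function (the three-argument
> > > > match() used here is a gawk extension):
> > > > function envsub(text,  chunk){
> > > >     # chunk[2] receives the name captured between the % delimiters
> > > >     while( match(text, /([^%]*)%([_A-Z]+)%(.*)/, chunk) != 0 ){
> > > >        # substitute the variable's value for this placeholder
> > > >        sub( "%"chunk[2]"%", ENVIRON[chunk[2]], text )
> > > >     }
> > > >     return text;
> > > > }
> > > > { print envsub($0) }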
> > > >
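
For awks without the three-argument match(), the same idea can be sketched
with only POSIX features. This is an illustrative variant, not the code
posted above: the wrapper name envsub_portable is hypothetical, and it
builds the result with substr() instead of sub(), so that "&" or
backslashes in a value are copied verbatim and values are never
re-scanned for placeholders. Unset variables expand to an empty string.

```shell
# Hypothetical wrapper: reads a template on stdin, writes the result to stdout.
envsub_portable() {
    awk '
    function envsub(text,    out, name) {
        out = ""
        # Find the leftmost %NAME% placeholder, copy what precedes it,
        # append the variable value verbatim, then continue after it.
        while (match(text, /%[_A-Z]+%/)) {
            name = substr(text, RSTART + 1, RLENGTH - 2)
            out = out substr(text, 1, RSTART - 1) ENVIRON[name]
            text = substr(text, RSTART + RLENGTH)
        }
        return out text
    }
    { print envsub($0) }'
}

# Usage: export the values in a subshell, then pipe the template through.
( export MY_URL='http://www.example.com' COMMENT='Set repository URL'
  echo 'repository=%MY_URL%  # %COMMENT%' | envsub_portable )
```
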
> > > > Wolfgang
> > > >
> > > >
> > > > On Fri, 3 Apr 2020 at 18:35, Vincent Férotin <address@hidden>
> > > wrote:
> > > >>
> > > >> Hi gawk maintainers!
> > > >>
> > > > >> New to awk/gawk/mawk, I'd like to describe here what could possibly be
> > > > >> a bug, or at least a limitation, that I encountered in these tools for
> > > > >> my basic usage. Perhaps what follows is not a bug but a misunderstanding
> > > > >> on my part as a newbie?
> > > > >> Anyway, thanks in advance for reading this...
> > > >>
> > > >> V.F.
> > > >>
> > > >>
> > > >> TL;DR
> > > >> =====
> > > >>
> > > > >> Using [gm]awk as a templating/macro engine, the following shell commands
> > > > >> do not output what one might expect:
> > > > >>
> > > > >>     $ echo "repository=%MY_URL%  # %COMMENT%" | \
> > > > >>         COMMENT="Set repository URL" MY_URL="http://www.example.com" \
> > > > >>         awk '{print gensub(/%([_A-Z]+)%/, ENVIRON["\\1"], "g")}'
> > > >>     repository=  #
> > > >>
> > > >> or roughly equivalent:
> > > >>
> > > > >>     $ echo "repository=MY_URL  # COMMENT" | \
> > > > >>         COMMENT="Set repository URL" MY_URL="http://www.example.com" \
> > > > >>         awk '{gsub(/[_A-Z]+/, ENVIRON["&"]); print $0}'
> > > >>     repository=  #
> > > >>
> > > > >> It seems that "\\1" in gensub() (or "&" in gsub()) is not replaced by
> > > > >> the text captured by the regexp, at least in the context of indexing
> > > > >> ENVIRON. The expected output should be, IMHO and as far as I understand:
> > > >>
> > > >>     repository=http://www.example.com  # Set repository URL
> > > >>
> > > >>
> > > >> Versions tested
> > > >> ===============
> > > >>
> > > >> * gawk:
> > > >>   - 4.1.4 (Ubuntu 18.04 Bionic)
> > > >>   - 4.2.1 (Ubuntu 19.10 Eoan)
> > > >>   - 5.0.1 (Ubuntu 20.04 Focal)
> > > >> * mawk:
> > > >>   - 3.3 (Ubuntu 18.04 Bionic & 19.10 Eoan)
> > > >>   - 3.4.20200120 (Ubuntu 20.04 Focal)
> > > >>
> > > >>
> > > >> Usage
> > > >> =====
> > > >>
> > > > >> In order to provision some virtual machines with Bash scripts,
> > > > >> I used 'sed' to replace some paths (strings) or
> > > > >> configuration file contents, but it fails in some cases, where the
> > > > >> replacement string contains characters that 'sed' could interpret as
> > > > >> metacharacters (such as "/").
> > > >>
> > > > >> I then tried using 'm4', where the effective values to replace the
> > > > >> placeholders are available as environment variables.
> > > > >> But the Debian/Ubuntu packaging seems to have some limitations, notably
> > > > >> disabling the '-W, --word-regexp=REGEXP' option (expected to allow
> > > > >> setting the placeholder regexp, e.g. "%([_A-Z]+)%").
> > > > >> Using m4 as is, with the configuration chosen by the packaging
> > > > >> maintainers, is feasible:
> > > > >>
> > > > >>     $ echo "changecom\nrepository=MY_URL  # COMMENT" | \
> > > > >>         m4 -DMY_URL="$MY_URL" -DCOMMENT="$COMMENT"
> > > >>
> > > >>     repository=http://www.example.com  # Set repository URL
> > > >>
> > > > >> but I miss being able to choose more robust placeholder delimiters
> > > > >> (I started here by prefixing and suffixing them with "%",
> > > > >> but I could also have chosen another format, such as the more common
> > > > >> "${var}").
> > > >>
> > > > >> It seems that this need also exists beyond my own naïve usage,
> > > > >> see for example:
> > > > >> - https://stackoverflow.com/questions/415677/how-to-replace-placeholders-in-a-text-file
> > > > >> - https://stackoverflow.com/questions/2914220/bash-templating-how-to-build-configuration-files-from-templates-with-bash
> > > > >>
> > > > >> Note that, apart from a single answer (out of a total of 40 (16+24 at
> > > > >> the time of this writing)):
> > > > >> - https://stackoverflow.com/questions/2914220/bash-templating-how-to-build-configuration-files-from-templates-with-bash#answer-9590655
> > > > >> no valid answer uses awk or one of its derivatives!
> > > > >> (NB: This specific answer could probably suffice for my needs...)
> > > >>
> > > >>
> > > > >> Evidence that `gensub(..., ENVIRON["\\1"])` should work
> > > > >> =======================================================
> > > >>
> > > > >> A "\\1" in the gensub() replacement string is substituted correctly:
> > > > >>
> > > > >>     $ echo "repository=%MY_URL%  # %COMMENT%" | \
> > > > >>         awk '{print gensub(/%([_A-Z]+)%/, "( \\1 )", "g")}'
> > > >>     repository=( MY_URL )  # ( COMMENT )
> > > >>
> > > > >> Passing the desired variable name directly to ENVIRON also works:
> > > > >>
> > > > >>     $ echo "repository=%MY_URL%" | MY_URL="http://www.example.com" \
> > > > >>         awk '{print gensub(/%MY_URL%/, ENVIRON["MY_URL"], "g")}'
> > > >>     repository=http://www.example.com
> > > >>
> > > >>
> > > >> `ENVIRON` seems to not accept other expressions as index
> > > >> ========================================================
> > > >>
> > > > >> Note also that trying to rewrite the awk script provided by the above
> > > > >> StackOverflow answer, described in
> > > > >> https://stackoverflow.com/questions/2914220/bash-templating-how-to-build-configuration-files-from-templates-with-bash#answer-9590655
> > > > >> that is:
> > > > >>
> > > > >>     'match($0, "[$]{.*}") {var = substr($0, (RSTART + 2), (RLENGTH - 3)); gsub("[$]{"var"}", ENVIRON[var])}1'
> > > > >>
> > > > >> into a more condensed form adapted to my use case:
> > > > >>
> > > > >>     '{gensub(/%([_A-Z]+)%/, ENVIRON[substr("\\1", 1, (length("\\1") - 2))])}'  # gawk
> > > > >>     '{gsub(/%[_A-Z]+%/, ENVIRON[substr("&", 1, (length("&") - 1))]); print $0}'  # mawk
> > > > >>
> > > > >> does not work either.
> > > >>
> > > >>
> > > > >> Search for previous existing occurrences of `gensub(..., ENVIRON["\\1"])`
> > > > >> =========================================================================
> > > >>
> > > > >> No occurrence of ``ENVIRON[`` with an index other than a plain
> > > > >> string or variable was found in:
> > > > >>
> > > > >> * `sed and awk Pocket Reference` by Arnold Robbins (O'Reilly, 2002, 2nd ed.)
> > > > >>     http://shop.oreilly.com/product/9780596003524.do
> > > > >> * `sed & awk` by Dale Dougherty & Arnold Robbins (O'Reilly, 1997, 2nd ed.)
> > > > >>     http://shop.oreilly.com/product/9781565922259.do
> > > > >> * `Effective awk Programming` by Arnold Robbins (O'Reilly, 2015, 4th ed.)
> > > > >>     http://shop.oreilly.com/product/0636920033820.do
> > > > >> * `GNU awk - awesome one-liners` by Sundeep Agarwal (version 0.7)
> > > > >>     https://learnbyexample.github.io/books/
> > > > >>     (mentioned recently on HackerNews:
> > > > >>     https://news.ycombinator.com/item?id=22758217 )
> > > >> * `bug-gawk` archives
> > > >>       https://lists.gnu.org/archive/html/bug-gawk/
> > > >>
> > >