Re: [bug-gawk] funcsub func needed
From: Aharon Robbins
Subject: Re: [bug-gawk] funcsub func needed
Date: Sat, 13 Dec 2014 19:41:38 +0200
User-agent: Heirloom mailx 12.5 6/20/10
Hi.
I think you're missing my point. If you use a general regexp
and capture it, like so:
/^data/ {
    $0 = gensub(/\<(.+)\s([[:digit:]])\s([[:alpha:]])/,"/address@hidden@address@hidden@\\3","g")
    print
    next
}
Then you will probably save a lot of time and, I hope, accomplish your goal.
HTH,
Arnold
> From: "Kjetil Flovild-Midtlie" <address@hidden>
> To: Aharon Robbins <address@hidden>, <address@hidden>
> Date: Fri, 12 Dec 2014 13:25:41 +0100
> Subject: Re: [bug-gawk] funcsub func needed
>
> Thx.
> But the a b c d e f cannot be replaced with a dot... They are actually 2300
> different combinations of sentences in a huge book text.
>
> Sorry for the oversimplified and unfinished sample.
>
>
> K
>
> On 12. desember 2014 11:53:07 Aharon Robbins <address@hidden> wrote:
>
> > Hello.
> >
> > > # sample awk file, 1pass [auto-generated from another awk file + tsv
> > > datafile]
> > > #
> > > /^data/ { $0 = gensub(/\<a\s([[:digit:]])\s([[:alpha:]])/,"/address@hidden@address@hidden@\\2","g") }
> > > /^data/ { $0 = gensub(/\<b\s([[:digit:]])\s([[:alpha:]])/,"/address@hidden@address@hidden@\\2","g") }
> > > /^data/ { $0 = gensub(/\<c\s([[:digit:]])\s([[:alpha:]])/,"/address@hidden@address@hidden@\\2","g") }
> > > /^data/ { $0 = gensub(/\<d\s([[:digit:]])\s([[:alpha:]])/,"/address@hidden@address@hidden@\\2","g") }
> > > /^data/ { $0 = gensub(/\<e\s([[:digit:]])\s([[:alpha:]])/,"/address@hidden@address@hidden@\\2","g") }
> > >
> > > # ..(2295 more) !!!
> >
> > It looks like rewriting this to a single rule:
> >
> > /^data/ { $0 = gensub(/\<(.)\s([[:digit:]])\s([[:alpha:]])/,"/address@hidden@address@hidden@\\3","g") }
> >
> > would save you considerable time; it would not surprise me if most of
> > the time was going into testing for a match and failing to find it.
> >
> > In addition, if you have lots of other tests, and you know you're done
> > once you've made the substitution, doing something like
> >
> > /^data/ {
> >     $0 = gensub(/\<(.)\s([[:digit:]])\s([[:alpha:]])/,"/address@hidden@address@hidden@\\3","g")
> >     print
> >     next
> > }
> >
> > would save even more time.
> >
> > Hope this helps,
> >
> > Arnold