Re: [bug-gawk] funcsub func needed


From: Aharon Robbins
Subject: Re: [bug-gawk] funcsub func needed
Date: Sat, 13 Dec 2014 19:41:38 +0200
User-agent: Heirloom mailx 12.5 6/20/10

Hi.

I think you're missing my point.  If you use a general regexp
and capture it, like so:

/^data/ {
  $0 = gensub(/\<(.+)\s([[:digit:]])\s([[:alpha:]])/,"/address@hidden@address@hidden@\\3","g")
  print
  next
}

Then you will probably save a lot of time and, I hope, accomplish your goal.

HTH,

Arnold

> From: "Kjetil Flovild-Midtlie" <address@hidden>
> To: Aharon Robbins <address@hidden>, <address@hidden>
> Date: Fri, 12 Dec 2014 13:25:41 +0100
> Subject: Re: [bug-gawk] funcsub func needed
>
> Thanks.
> But the a, b, c, d, e, f cannot be replaced with a dot.  They are actually
> 2300 different combinations of sentences in a huge book text.
>
> Sorry for the oversimplified and unfinished sample.
>
>
> K
>
>
> On 12. desember 2014 11:53:07 Aharon Robbins <address@hidden> wrote:
>
> > Hello.
> >
> > > # sample awk file, 1pass [auto-generated from another awk file + tsv datafile]
> > > #
> > > /^data/ { $0 = gensub(/\<a\s([[:digit:]])\s([[:alpha:]])/,"/address@hidden@address@hidden@\\2","g") }
> > > /^data/ { $0 = gensub(/\<b\s([[:digit:]])\s([[:alpha:]])/,"/address@hidden@address@hidden@\\2","g") }
> > > /^data/ { $0 = gensub(/\<c\s([[:digit:]])\s([[:alpha:]])/,"/address@hidden@address@hidden@\\2","g") }
> > > /^data/ { $0 = gensub(/\<d\s([[:digit:]])\s([[:alpha:]])/,"/address@hidden@address@hidden@\\2","g") }
> > > /^data/ { $0 = gensub(/\<e\s([[:digit:]])\s([[:alpha:]])/,"/address@hidden@address@hidden@\\2","g") }
> > >
> > > # ..(2295 more) !!!
> >
> > It looks like rewriting this to a single rule:
> >
> > /^data/ { $0 = gensub(/\<(.)\s([[:digit:]])\s([[:alpha:]])/,"/address@hidden@address@hidden@\\3","g") }
> >
> > would save you considerable time; it would not surprise me if most of
> > the time was going into testing for a match and failing to find it.
> >
> > In addition, if you have lots of other tests and you know you're done
> > once you've made the substitution, doing something like
> >
> > /^data/ {
> >     $0 = gensub(/\<(.)\s([[:digit:]])\s([[:alpha:]])/,"/address@hidden@address@hidden@\\3","g")
> >     print
> >     next
> > }
> >
> > would save even more time.
> >
> > Hope this helps,
> >
> > Arnold


