bug-gawk

Re: gsub() is very slow in gawk 5.1.0


From: Wolfgang Laun
Subject: Re: gsub() is very slow in gawk 5.1.0
Date: Thu, 15 Jul 2021 09:02:52 +0200

I've just posted this on stackoverflow:

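# Build a string of n "x" characters by recursive halving.
# The global array a caches each result, so every length is built only once.
# Note: switch is a gawk extension.
function rec(n){
    if( n in a ) return a[n];
    switch( n ){
    case 0:
       return a[0] = "";
    case 1:
       return a[1] = "x";
    default:
       # Concatenate the two halves; they differ in length by at most one.
       return a[n] = rec( int(n/2) ) rec( n - int(n/2));
    }
}
BEGIN {
    rec( 100000000 );
}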

Timings (measured with time) for this function and for similar functions
that use a loop and gsub:
   loop        0m5,265s
   gsub        0m7,565s
   recursive   0m0,095s
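
The loop and gsub variants are not shown in this message, so the following is only a sketch of what they might look like, not the code that was actually timed. It assumes an awk that accepts %*s in sprintf (as t2.awk in the quoted message does). The names build_check, by_loop, by_gsub, by_rec and memo are made up for illustration, and by_rec replaces the gawk-only switch with plain if statements. It builds the same string three ways at a small n and checks that the results agree.

```shell
#!/bin/sh
# Hypothetical helper: build a string of n "x" characters three ways
# and report the three lengths plus whether the strings are identical.
build_check() {
    awk -v n="${1:-1000}" '
    # Variant 1: append one character per iteration.
    function by_loop(n,    s, i) {
        s = ""
        for (i = 0; i < n; i++) s = s "x"
        return s
    }
    # Variant 2: pad with blanks via sprintf, then replace each blank.
    function by_gsub(n,    s) {
        s = sprintf("%*s", n, "")
        gsub(/ /, "x", s)
        return s
    }
    # Variant 3: recursive halving, cached in the global array memo.
    function by_rec(n,    h) {
        if (n in memo) return memo[n]
        if (n == 0) return memo[0] = ""
        if (n == 1) return memo[1] = "x"
        h = int(n / 2)
        return memo[n] = by_rec(h) by_rec(n - h)
    }
    BEGIN {
        a = by_loop(n); b = by_gsub(n); c = by_rec(n)
        print length(a), length(b), length(c), ((a == b && b == c) ? "same" : "differ")
    }'
}

# Small n so this finishes instantly; the timings above used n = 100000000.
build_check 1000
```

To reproduce timings of the kind listed above, each variant would have to run alone in its own BEGIN block under time, with a much larger n.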




On Thu, 15 Jul 2021 at 08:42, <arnold@skeeve.com> wrote:

> Hi Ed.
>
> Ed Morton <mortoneccc@comcast.net> wrote:
>
> > I just tried the same script on my Mac using BSD awk 20200816 and it
> > only took 1.4 seconds to run. Unfortunately I can't install gawk or any
> > other awk on that machine to test with but I 100% believe the 2 other
> > people who posted at https://stackoverflow.com/a/68371463/1745001 saying
> > gawk 5.1.0 on their Macs took 23.5 secs and almost 30 secs respectively.
>
> Once again, you have to compare apples to apples. Part of it is
> definitely related to how much RAM you have. I bet that Mac of
> yours has 32 Gig or more on it.
>
> On my personal 8 Gig system, I had to kill all other awks.  My work laptop
> (Ubuntu 18.04) has 16 Gig. Here's the data:
>
> $ cat t2.awk
> BEGIN {
>         s=sprintf("%*s",1000000000,""); gsub(/ /,"x",s)
> }
>
> $ ./nawk --version
> awk version 20210215
>
> $ time ./nawk -f t2.awk
>
> real    2m2.270s
> user    0m12.061s
> sys     1m50.162s
>
> $ time ./gawk -f t2.awk
>
> real    3m8.238s
> user    3m6.167s
> sys     0m1.856s
>
> Gawk is 50% slower than nawk, but not 10 or 15 times slower.
> The gawk regex routines are much more heavy-weight than what's
> in nawk.  And no, I can't substitute in some other regex library.
>
> Interestingly:
>
> $ (export LC_ALL=C ; time ./gawk -f t2.awk)
>
> real    2m30.100s
> user    2m28.561s
> sys     0m1.484s
>
> So we see that gawk is comparable to nawk when told to not
> worry about multibyte locales.
>
> I think we can put this to rest now.
>
> Thanks,
>
> Arnold
>
>

-- 
Wolfgang Laun

