bug-gawk


From: Ed Morton
Subject: Re: [EXTERNAL] Re: Performance issues using GAWK 3.1.6 ->from Win 2008 to Win 2016
Date: Wed, 16 Jun 2021 07:33:50 -0500
User-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101 Thunderbird/78.11.0

Given:
Yes Andy, the original command looks up the parentchild (195K) records in
Emp_attr (5000) and creates MAP_attr.csv with 195K records,
whereas the command below, without the pipe, looks up EMP_attr.csv (5000) against
Parentchild (195K) and creates MAP_Attr.csv with 5000 records.

It sounds to me like they ran the command with the input files in the wrong order. The posted awk script outputs the same number of lines as are present in the input file passed in the args list, so it's impossible for it to output any number of lines other than the number present in ParentChild.csv unless it aborts mid-processing. But for it then to output exactly the same number of lines as are present in Emp_Attr.csv in that scenario seems.... unlikely!
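
To illustrate the point about file order, here is a minimal sketch run from a POSIX shell. It is not the posted map_attr.awk: the script lookup.awk, the files small.csv and big.csv, and their contents are all invented stand-ins. It only assumes a script of the same general shape, one that loads the f2 file into an array and then prints one line per record of the file named in the args list.

```shell
# Hypothetical stand-ins for the real files; names and contents are invented.
tmp=$(mktemp -d)
printf 'e1,alice\ne2,bob\n' > "$tmp/small.csv"        # lookup table, 2 lines
printf 'e1,x\ne2,y\ne1,z\ne3,w\n' > "$tmp/big.csv"   # main input, 4 lines

# Illustrative lookup script (an assumption, not the poster's map_attr.awk):
# load f2 into an array in BEGIN, then print one line per main-input record.
cat > "$tmp/lookup.awk" <<'EOF'
BEGIN {
    FS = OFS = ","
    while ((getline line < f2) > 0) {
        split(line, parts, FS)
        attr[parts[1]] = parts[2]
    }
    close(f2)
}
{ print $0, (($1 in attr) ? attr[$1] : "") }
EOF

# Intended order: big file in the args list, small file as the f2 lookup.
right=$(awk -v f2="$tmp/small.csv" -f "$tmp/lookup.awk" "$tmp/big.csv" | wc -l)
# Swapped order: the line count now follows the small file instead.
swapped=$(awk -v f2="$tmp/big.csv" -f "$tmp/lookup.awk" "$tmp/small.csv" | wc -l)

echo "right order: $right lines, swapped order: $swapped lines"
rm -rf "$tmp"
```

With a script of this shape the output line count always tracks the file in the args list, which is why getting exactly as many lines as Emp_Attr.csv points at swapped inputs rather than at the pipe.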

    Ed.

On 6/16/2021 7:19 AM, Andrew J. Schorr wrote:
Hi,

This makes no sense to me. The pure gawk version is simpler and cleaner without
the pipe. Are you sure that you copied the commands properly? Do any Windoze
folks have an idea of what could be going wrong here?

Regards,
Andy

On Wed, Jun 16, 2021 at 11:27:53AM +0000, Koleti, Haritha wrote:
Yes Andy, the original command looks up the parentchild (195K) records in Emp_attr
(5000) and creates MAP_attr.csv with 195K records,
whereas the command below, without the pipe, looks up EMP_attr.csv (5000) against
Parentchild (195K) and creates MAP_Attr.csv with 5000 records.

thank you!!
Haritha


-----Original Message-----
From: Andrew J. Schorr <aschorr@telemetry-investments.com>
Sent: Tuesday, June 15, 2021 2:14 PM
To: Koleti, Haritha <Haritha.Koleti@pseg.com>
Cc: Eli Zaretskii <eliz@gnu.org>; mortoneccc@comcast.net; arnold@skeeve.com;
wolfgang.laun@gmail.com; bug-gawk@gnu.org; Pereira, Ricardo
<Ricardo_D.Pereira@pseg.com>; Pirane, Marco <Marco.Pirane@pseg.com>
Subject: Re: [EXTERNAL] Re: Performance issues using GAWK 3.1.6 ->from Win 2008
to Win 2016


Hi,

I'm not sure that I understand your message. Are you saying that you are
getting different results from:

TYPE  ParentChild.csv|gawk -f Emp_Attr.awk>Emp_Attr.csv
TYPE  ParentChild.csv|gawk -v f2=Emp_Attr.csv -f map_attr.awk>Map_Attr.csv

versus:

gawk -f Emp_Attr.awk ParentChild.csv>Emp_Attr.csv
gawk -v f2=Emp_Attr.csv -f map_attr.awk ParentChild.csv>Map_Attr.csv

???

Is the difference in Emp_Attr.csv or Map_Attr.csv or both?
Or am I confused about what you are indicating? These commands should be
equivalent, and the latter versions should be faster, I would think. If you
additionally use Ed's modified version of map_attr.awk, you should get top
speed.
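
One way to check the equivalence claim on a small sample is sketched below for a POSIX shell, where cat plays the role of TYPE. The file in.csv and the script prog.awk are invented for the example and are not the scripts from this thread.

```shell
# Stand-in data and script; the names here are invented for the example.
tmp=$(mktemp -d)
printf 'a,1\nb,2\nc,3\n' > "$tmp/in.csv"
printf '{ print NR ": " $0 }\n' > "$tmp/prog.awk"

# Piped form: awk reads the data from standard input.
cat "$tmp/in.csv" | awk -f "$tmp/prog.awk" > "$tmp/piped.out"
# Direct form: awk opens the file named in the args list itself.
awk -f "$tmp/prog.awk" "$tmp/in.csv" > "$tmp/direct.out"

# cmp -s is silent and only reports through its exit status.
if cmp -s "$tmp/piped.out" "$tmp/direct.out"; then same=yes; else same=no; fi
echo "outputs identical: $same"
rm -rf "$tmp"
```

The two forms can differ for a script that inspects FILENAME, since no file name is available when the data arrives on standard input, so this check only covers scripts that do not depend on it.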

Regards,
Andy

On Tue, Jun 15, 2021 at 04:58:53PM +0000, Koleti, Haritha via Bug reports and
all discussion about gawk. wrote:
It runs faster, but the final file is not as expected: it is 192KB, whereas
the original file should have been 16230KB.
We are not getting the output that we require.



-----Original Message-----
From: Eli Zaretskii <eliz@gnu.org>
Sent: Tuesday, June 15, 2021 11:33 AM
To: Koleti, Haritha <Haritha.Koleti@pseg.com>
Cc: mortoneccc@comcast.net; arnold@skeeve.com;
wolfgang.laun@gmail.com; bug-gawk@gnu.org; Pereira, Ricardo
<Ricardo_D.Pereira@pseg.com>; Pirane, Marco <Marco.Pirane@pseg.com>
Subject: Re: [EXTERNAL] Re: Performance issues using GAWK 3.1.6 ->from
Win 2008 to Win 2016

From: "Koleti, Haritha" <Haritha.Koleti@pseg.com>
CC: "wolfgang.laun@gmail.com" <wolfgang.laun@gmail.com>,
         "bug-gawk@gnu.org" <bug-gawk@gnu.org>,
         "Pereira, Ricardo" <Ricardo_D.Pereira@pseg.com>,
         "Pirane, Marco" <Marco.Pirane@pseg.com>
Date: Tue, 15 Jun 2021 15:13:14 +0000

This worked like a charm: <1 minute. But we have 100s of scripts. It
would really help if we can find a root
cause for why this takes 10 minutes versus 90 minutes.

Try what Andrew suggested: eliminate the TYPE command and the pipe from the
batch file.  Does that speed up the time, and if so, by how much?
The information contained in this e-mail, including any attachment(s), is
intended solely for use by the named addressee(s). If you are not the intended
recipient, or a person designated as responsible for delivering such messages
to the intended recipient, you are not authorized to disclose, copy, distribute
or retain this message, in whole or in part, without written authorization from
PSEG. This e-mail may contain proprietary, confidential or privileged
information. If you have received this message in error, please notify the
sender immediately. This notice is included in all e-mail messages leaving
PSEG. Thank you for your cooperation.


