Re: [EXTERNAL] Re: Performance issues using GAWK 3.1.6 ->from Win 2008 t
From: Andrew J. Schorr
Subject: Re: [EXTERNAL] Re: Performance issues using GAWK 3.1.6 ->from Win 2008 to Win 2016
Date: Wed, 16 Jun 2021 09:07:28 -0400
User-agent: Mutt/1.5.21 (2010-09-15)

Hi,

Please send the results from these commands then:

wc -l ParentChild.csv
gawk -f Emp_Attr.awk ParentChild.csv>Emp_Attr.csv
wc -l ParentChild.csv
gawk -v f2=Emp_Attr.csv -f map_attr.awk ParentChild.csv>Map_Attr.csv
wc -l ParentChild.csv Map_Attr.csv
TYPE map_attr.awk

I'm assuming that your environment has "wc" available in addition to gawk;
maybe that's a flawed assumption. If wc is not available, then you can
use gawk instead, depending on the level of quoting insanity in your shell,
like so:

gawk 'END {print FILENAME, FNR}' ParentChild.csv
gawk 'END {print FILENAME, FNR}' Map_Attr.csv
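
If the quoting does get in the way, one way around it is to keep the awk program in a file and run it with -f, so the shell never has to parse the program text. What follows is a minimal sketch under stated assumptions, not something posted in this thread: the file name count_lines.awk and the two sample CSV files are made up for illustration. It avoids ENDFILE, which gawk 3.1.6 predates. The here-document is POSIX shell; under cmd.exe the same script file would have to be created with an editor instead.

```shell
# count_lines.awk is a hypothetical helper file, not part of the original thread.
cat > count_lines.awk <<'EOF'
# Print "name count" for each input file, using only FILENAME, FNR and NR.
FNR == 1 && NR > 1 { print prev, n }    # a new file began: report the previous one
{ prev = FILENAME; n = FNR }            # remember the current file and its line count
END { if (NR > 0) print prev, n }       # report the last file
EOF

# Tiny stand-in files so the sketch runs anywhere; substitute the real CSVs.
printf 'a,1\nb,2\nc,3\n' > sample_parent.csv
printf 'a,x\nb,y\n' > sample_map.csv

awk -f count_lines.awk sample_parent.csv sample_map.csv
```

With the real data the call would be: gawk -f count_lines.awk ParentChild.csv Map_Attr.csv. A file with no records is skipped by this sketch, because no record is ever read while FILENAME names it.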
Regards,
Andy
On Wed, Jun 16, 2021 at 12:54:52PM +0000, Koleti, Haritha wrote:
> Sent too fast; same result.
>
> From: Koleti, Haritha
> Sent: Wednesday, June 16, 2021 8:47 AM
> To: 'Andrew J. Schorr' <aschorr@telemetry-investments.com>; Ed Morton
> <mortoneccc@comcast.net>
> Cc: Pirane, Marco <Marco.Pirane@pseg.com>; bug-gawk@gnu.org; Pereira, Ricardo
> <Ricardo_D.Pereira@pseg.com>
> Subject: RE: [EXTERNAL] Re: Performance issues using GAWK 3.1.6 ->from Win 2008 to Win 2016
>
> -----Original Message-----
> From: Andrew J. Schorr <aschorr@telemetry-investments.com>
> Sent: Wednesday, June 16, 2021 8:39 AM
> To: Ed Morton <mortoneccc@comcast.net>
> Cc: Koleti, Haritha <Haritha.Koleti@pseg.com>; Pirane, Marco
> <Marco.Pirane@pseg.com>; bug-gawk@gnu.org; Pereira, Ricardo
> <Ricardo_D.Pereira@pseg.com>
> Subject: Re: [EXTERNAL] Re: Performance issues using GAWK 3.1.6 ->from Win 2008 to Win 2016
>
> ***CAUTION******CAUTION******CAUTION***This e-mail is from an EXTERNAL
> address. The actual sender is (aschorr@telemetry-investments.com) which may
> be different from the display address in the From: field. Be cautious of
> clicking on links or opening attachments. Suspicious? Report it via the Report
> Phishing button. On mobile phones, forward message to Cyber Security.
>
> Hi Ed,
>
> That sounds right to me. As you point out, map_attr.awk produces precisely one
> line of output for each line of input. So the command:
>
> gawk -v f2=Emp_Attr.csv -f map_attr.awk ParentChild.csv>Map_Attr.csv
>
> should produce a Map_Attr.csv file that has exactly the same number of records
> as the ParentChild.csv file. There must have been a cut & paste copy error.
>
> Haritha -- can you please try again, taking care to make sure that the command
> is copied exactly as written above?
>
> Regards,
> Andy
>
> On Wed, Jun 16, 2021 at 07:33:50AM -0500, Ed Morton wrote:
> > Given:
> >
> > yes Andy, original command is looking parentchild(195K) records in
> > Emp_attr(5000) and creating MAP_attr.csv(195K) records.
> > versus below command with out pipe is looking for EMP_attr.csv(5000)
> > against Parentchild(195K) and creating MAP_Attr.csv with 5000 records.
> >
> > Sounds to me like they ran the command with the input files in
> > the wrong order, as the posted awk script will output the same number
> > of lines as are present in the input file passed in the args list. So
> > it's impossible for the posted awk script to output some number of
> > lines other than are present in ParentChild.csv unless it aborts
> > mid-processing, but then for it to output exactly the same number of
> > lines as are present in Emp_Attr.csv in that scenario seems.... unlikely!
> >
> > Ed.
> >
> > On 6/16/2021 7:19 AM, Andrew J. Schorr wrote:
> >
> > Hi,
> >
> > This makes no sense to me. The pure gawk version is simpler and cleaner without
> > the pipe. Are you sure that you copied the commands properly? Do any Windoze
> > folks have an idea of what could be going wrong here?
> >
> > Regards,
> > Andy
> >
> > On Wed, Jun 16, 2021 at 11:27:53AM +0000, Koleti, Haritha wrote:
> >
> > yes Andy, original command is looking parentchild(195K) records in Emp_attr
> > (5000) and creating MAP_attr.csv(195K) records.
> > versus below command with out pipe is looking for EMP_attr.csv(5000) against
> > Parentchild(195K) and creating MAP_Attr.csv with 5000 records.
> >
> > thank you!!
> > Haritha
> >
> > -----Original Message-----
> > From: Andrew J. Schorr <aschorr@telemetry-investments.com>
> > Sent: Tuesday, June 15, 2021 2:14 PM
> > To: Koleti, Haritha <Haritha.Koleti@pseg.com>
> > Cc: Eli Zaretskii <eliz@gnu.org>; mortoneccc@comcast.net; arnold@skeeve.com;
> > wolfgang.laun@gmail.com; bug-gawk@gnu.org; Pereira, Ricardo
> > <Ricardo_D.Pereira@pseg.com>; Pirane, Marco <Marco.Pirane@pseg.com>
> > Subject: Re: [EXTERNAL] Re: Performance issues using GAWK 3.1.6 ->from Win 2008
> > to Win 2016
> >
> > Hi,
> >
> > I'm not sure that I understand your message. Are you saying that you are
> > getting different results from:
> >
> > TYPE ParentChild.csv|gawk -f Emp_Attr.awk>Emp_Attr.csv
> > TYPE ParentChild.csv|gawk -v f2=Emp_Attr.csv -f map_attr.awk>Map_Attr.csv
> >
> > versus:
> >
> > gawk -f Emp_Attr.awk ParentChild.csv>Emp_Attr.csv
> > gawk -v f2=Emp_Attr.csv -f map_attr.awk ParentChild.csv>Map_Attr.csv
> >
> > ???
> >
> > Is the difference in Emp_Attr.csv or Map_Attr.csv or both?
> > Or am I confused about what you are indicating? These commands should be
> > equivalent, and the latter versions should be faster, I would think. If you
> > additionally use Ed's modified version of map_attr.awk, you should get top
> > speed.
> >
> > Regards,
> > Andy
> >
> > On Tue, Jun 15, 2021 at 04:58:53PM +0000, Koleti, Haritha via Bug reports and
> > all discussion about gawk. wrote:
> >
> > it runs faster but the final file is not as expected it is 192KB where
> > original file should have been 16230KB.
> >
> > we are not getting right output that we require.
> >
> > -----Original Message-----
> > From: Eli Zaretskii <eliz@gnu.org>
> > Sent: Tuesday, June 15, 2021 11:33 AM
> > To: Koleti, Haritha <Haritha.Koleti@pseg.com>
> > Cc: mortoneccc@comcast.net; arnold@skeeve.com;
> > wolfgang.laun@gmail.com; bug-gawk@gnu.org; Pereira, Ricardo
> > <Ricardo_D.Pereira@pseg.com>; Pirane, Marco <Marco.Pirane@pseg.com>
> > Subject: Re: [EXTERNAL] Re: Performance issues using GAWK 3.1.6 ->from
> > Win 2008 to Win 2016
> >
> > From: "Koleti, Haritha" <Haritha.Koleti@pseg.com>
> > CC: "wolfgang.laun@gmail.com" <wolfgang.laun@gmail.com>,
> > "bug-gawk@gnu.org" <bug-gawk@gnu.org>,
> > "Pereira, Ricardo" <Ricardo_D.Pereira@pseg.com>,
> > "Pirane, Marco" <Marco.Pirane@pseg.com>
> > Date: Tue, 15 Jun 2021 15:13:14 +0000
> >
> > This worked like a charm <1 minute. But we have 100s of scripts. It
> > would really help if we can find a root cause why this 10 minutes versus
> > 90 minutes.
> >
> > Try what Andrew suggested: eliminate the TYPE command and the pipe from the
> > batch file. Does that speed up the time, and if so, by how much?
> >
> > The information contained in this e-mail, including any attachment(s), is
> > intended solely for use by the named addressee(s). If you are not the intended
> > recipient, or a person designated as responsible for delivering such messages
> > to the intended recipient, you are not authorized to disclose, copy, distribute
> > or retain this message, in whole or in part, without written authorization from
> > PSEG. This e-mail may contain proprietary, confidential or privileged
> > information. If you have received this message in error, please notify the
> > sender immediately. This notice is included in all e-mail messages leaving
> > PSEG. Thank you for your cooperation.
>
> --
> Andrew Schorr e-mail: aschorr@telemetry-investments.com
> Telemetry Investments, L.L.C. phone: 917-305-1748
> 152 W 36th St, #402 fax: 212-425-5550
> New York, NY 10018-8765
>
--
Andrew Schorr e-mail: aschorr@telemetry-investments.com
Telemetry Investments, L.L.C. phone: 917-305-1748
152 W 36th St, #402 fax: 212-425-5550
New York, NY 10018-8765