RE: Possible incorrect behaviour of AWK

bug-gawk


From:	Phil J FIsher
Subject:	RE: Possible incorrect behaviour of AWK
Date:	Tue, 17 Mar 2020 15:23:05 -0000

Ah-ha, stupid of me to not recall that :-(.
And yes, the '$' would be a newline character here ...

Many thanks and apologies for the disturbance.

Phil
(I did get it to work with my usual trick: a sed pipeline first replaces
the desired record separator with two consecutive newlines, a combination
which we know works.)
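
For readers of the archive, the sed workaround described above might look roughly like the sketch below. The actual command was not posted, so the sed expression, the sample data and the use of paragraph mode (RS = "") are all assumptions made for illustration.

```shell
# Hypothetical sample input: two records, each ended by a line containing only "---".
sample() { printf 'a\nb\n---\nc\nd\n---\n'; }

# sed works line by line, so here ^ and $ really do anchor to each line:
# every line that is exactly "---" is turned into an empty line.
# RS = "" then puts awk into paragraph mode, where blank lines separate records.
paragraph_count=$(sample | sed 's/^---$//' | awk 'BEGIN { RS = "" } END { print NR }')

echo "paragraph mode: $paragraph_count record(s)"
```

Note that in paragraph mode a newline always acts as a field separator as well, which may or may not matter for the fields you need.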

-----Original Message-----
From: Andrew J. Schorr <address@hidden> 
Sent: 17 March 2020 15:13
To: Phil J FIsher <address@hidden>
Cc: address@hidden
Subject: Re: Possible incorrect behaviour of AWK

Hi,

On Tue, Mar 17, 2020 at 02:18:06PM -0000, Phil J FIsher wrote:
> $ LC_ALL=C awk --posix 'BEGIN {RS="^---$";ORS="\n\n"}  {print $0,RS,NR}'

I think the problem here may be the anchor metacharacters. From the docs:

https://www.gnu.org/software/gawk/manual/html_node/gawk-split-records.html

     NOTE: Remember that in `awk', the `^' and `$' anchor
     metacharacters match the beginning and end of a _string_, and not
     the beginning and end of a _line_.  As a result, something like
     `RS = "^[[:upper:]]"' can only match at the beginning of a file.
     This is because `gawk' views the input file as one long string
     that happens to contain newline characters.  It is thus best to
     avoid anchor metacharacters in the value of `RS'.

When you say "$", do you actually mean "\n", for example?

Regards,
Andy
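
A small sketch of the difference, assuming an awk that accepts a regular expression as RS (gawk does; some other implementations use only the first character). The sample data and variable names are made up for illustration and are not from the original report.

```shell
# Hypothetical sample input: two records, each ended by a line containing only "---".
sample() { printf 'a\nb\n---\nc\nd\n---\n'; }

# Anchored form, as in the original report: ^ and $ refer to the start and
# end of the whole input string, not to individual lines.
anchored_count=$(sample | awk 'BEGIN { RS = "^---$" } END { print NR }')

# Spelling out the newlines instead of using anchors.
newline_count=$(sample | awk 'BEGIN { RS = "\n---\n" } END { print NR }')

echo "anchored RS: $anchored_count record(s)"
echo "newline RS:  $newline_count record(s)"
```

With gawk, the newline form should report two records here, while the anchored form is not expected to split at the "---" lines at all. One caveat: a "---" on the very first line of the file has no newline in front of it, so RS = "\n---\n" would not match it.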
