RE: Possible incorrect behaviour of AWK
From: Phil J FIsher
Subject: RE: Possible incorrect behaviour of AWK
Date: Tue, 17 Mar 2020 15:23:05 -0000
Ah-ha, stupid of me to not recall that :-(.
And yes, the '$' would be a newline character here ...
Many thanks and apologies for the disturbance.
Phil
(I did get it to work with my usual trick of running a sed pipeline first
to replace the desired record separator with a newline,newline combination,
which we know works.)
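
For the archive, a minimal sketch of that workaround. It assumes the
separator is a line consisting only of `---`; the function name and the
sample input are made up for illustration. sed works line by line, so its
`^` and `$` really do anchor to a line, and awk's paragraph mode (RS = "")
then splits on the resulting blank line.

```shell
# Hypothetical helper: turn each '---' line into an empty line, then let
# awk's paragraph mode (RS = "") split the input on those blank lines.
split_on_dashes() {
  sed 's/^---$//' |
    awk 'BEGIN { RS = "" }
         { gsub(/\n/, " ")      # join the lines of each record for display
           print NR ": " $0 }'
}

# Sample input: three records separated by '---' lines.
printf 'a\nb\n---\nc\nd\n---\ne\n' | split_on_dashes
```

One caveat: any blank lines already present in the input would also end a
record, so this only suits data without empty lines inside a record.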
-----Original Message-----
From: Andrew J. Schorr <address@hidden>
Sent: 17 March 2020 15:13
To: Phil J FIsher <address@hidden>
Cc: address@hidden
Subject: Re: Possible incorrect behaviour of AWK
Hi,
On Tue, Mar 17, 2020 at 02:18:06PM -0000, Phil J FIsher wrote:
> $ LC_ALL=C awk --posix 'BEGIN {RS="^---$";ORS="\n\n"} {print $0,RS,NR}'
I think the problem here may be the anchor metacharacters. From the docs:
https://www.gnu.org/software/gawk/manual/html_node/gawk-split-records.html
NOTE: Remember that in `awk', the `^' and `$' anchor
metacharacters match the beginning and end of a _string_, and not
the beginning and end of a _line_. As a result, something like
`RS = "^[[:upper:]]"' can only match at the beginning of a file.
This is because `gawk' views the input file as one long string
that happens to contain newline characters. It is thus best to
avoid anchor metacharacters in the value of `RS'.
When you say "$", do you actually mean "\n", for example?
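
If so, something along these lines might be closer to what you want. This
is only a sketch: it assumes gawk (or another awk that treats a
multi-character RS as a regular expression, which POSIX does not
guarantee), and the function name and sample input are invented for
illustration.

```shell
# Hypothetical helper: split on a '---' line by matching the surrounding
# newlines explicitly instead of using the ^ and $ anchors.
split_on_separator() {
  awk 'BEGIN { RS = "\n---\n" }
       { sub(/\n$/, "")         # drop the newline left on the final record
         gsub(/\n/, " ")        # join the lines of each record for display
         print NR ": " $0 }'
}

# Sample input: three records separated by '---' lines.
printf 'a\nb\n---\nc\nd\n---\ne\n' | split_on_separator
```

Note that a `---` on the very first line of the file has no newline before
it, so this RS would not match there.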
Regards,
Andy