bug-gawk

RS='.^' apparently ignores the RS setting


From: Ed Morton
Subject: RS='.^' apparently ignores the RS setting
Date: Mon, 12 Jul 2021 17:19:18 -0500
User-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101 Thunderbird/78.11.0

We were just having a conversation on comp.unix.shell about various RS settings, and someone pointed out that you should be able to use `RS='$^'` to read a whole file as a single record, because a start-of-string anchor can never appear after an end-of-string anchor. (I mentioned `RS='^$'` as, IMHO, a better existing alternative.)

At first I thought `$^` just meant that pair of literal characters, but apparently its meaning differs between BREs and EREs:

    $ echo 'a$^b' | grep '$^'
    a$^b

    $ echo 'a$^b' | grep -E '$^'
    $

    $ echo 'a$^b' | sed 's/$^/X/'
    aXb

    $ echo 'a$^b' | sed -E 's/$^/X/'
    a$^b

So I read https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html#tag_09_04_09, which I thought was saying that you could put any character before the `^` and the regexp would never match. This test seemed to support that:

    $ printf 'ax^b\nax^b\n' | awk 'BEGIN{RS="x^"}{print NR, $0}'
    1 ax^b
    ax^b

But then I can't explain this, where gawk apparently ignores the RS setting completely:

    $ printf 'a.^b\na.^b\n' | awk 'BEGIN{RS=".^"}{print NR, $0}'
    1 a.^b
    2 a.^b

Is that a bug?

    $ gawk --version
    GNU Awk 5.1.0, API: 3.0 (GNU MPFR 4.1.0, GNU MP 6.2.1)

Regards,

    Ed.

