emacs-bug-tracker
bug#49925: closed (cat -E interprets sentinel newline at the end of buff


From: GNU bug Tracking System
Subject: bug#49925: closed (cat -E interprets sentinel newline at the end of buffer as an actual newline after a \r)
Date: Sat, 07 Aug 2021 18:30:01 +0000

Your message dated Sat, 7 Aug 2021 19:29:06 +0100
with message-id <14378fe4-0b51-fa6c-b060-2ad5bc5d719f@draigBrady.com>
and subject line Re: bug#49925: cat -E interprets sentinel newline at the end 
of buffer as an actual newline after a \r
has caused the debbugs.gnu.org bug report #49925,
regarding cat -E interprets sentinel newline at the end of buffer as an actual 
newline after a \r
to be marked as done.

(If you believe you have received this mail in error, please contact
help-debbugs@gnu.org.)


-- 
49925: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=49925
GNU Bug Tracking System
Contact help-debbugs@gnu.org with problems
--- Begin Message ---
Subject: cat -E interprets sentinel newline at the end of buffer as an actual newline after a \r
Date: Sat, 7 Aug 2021 15:07:32 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:92.0) Gecko/20100101 Thunderbird/92.0a1

Hi,

after https://lists.gnu.org/archive/html/coreutils/2021-02/msg00003.html (unreleased), the behavior of cat -E was changed so that it prints "^M$" for "\r\n" line endings.

Whenever it sees a \r, "cat -E" checks whether the next byte is a \n; however, that \n might be the sentinel value that is inserted at the end of a buffer.

This is a problem in two cases:

- When a \r is at the end of the input. `printf "\r" | cat -E` will print "^M", even though there is no "\n" after the "\r". FWIW, tests/misc/cat-E.sh expects a "^M" for a trailing "\r", but I think that's wrong.

- When the file is too big to fit into one buffer. If you try to "cat -E" a big file (multiple megabytes) that consists of only "\r", cat will print a few "^M" whenever it hits the end of a buffer in the middle of the file and at the end.
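
The two cases above can be reproduced with a rough model. This is a Python sketch, not the C code in cat; the function name `show_ends_buggy`, the `bufsize` parameter, and the way the input is chunked are all assumptions made for illustration. It only mimics the check described above: a "\n" sentinel is appended to each buffer, and a "\r" followed by "\n" is shown as "^M" without checking whether that "\n" is the sentinel.

```python
def show_ends_buggy(data: bytes, bufsize: int) -> bytes:
    """Simplified model of the reported behavior (not the real cat.c)."""
    out = bytearray()
    for start in range(0, len(data), bufsize):
        chunk = data[start:start + bufsize]
        buf = chunk + b"\n"   # sentinel newline; NOT part of the input
        end = len(chunk)      # index of the sentinel within buf
        for i in range(end):
            b = buf[i]
            if b == 0x0D and buf[i + 1] == 0x0A:
                # Bug being modeled: buf[i + 1] may be the sentinel,
                # so a "\r" at the end of a buffer is shown as "^M".
                out += b"^M"
            elif b == 0x0A:
                out += b"$\n"
            else:
                out.append(b)
    return bytes(out)


# A lone trailing "\r" is reported as "^M" although no "\n" follows it.
print(show_ends_buggy(b"\r", 4))
# With a tiny buffer, "^M" shows up at every buffer boundary.
print(show_ends_buggy(b"\r" * 5, 2))
```

With a real file the buffer is of course much larger, which is why the stray "^M" only appears every so often in a multi-megabyte input.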

Michael




--- End Message ---
--- Begin Message ---
Subject: Re: bug#49925: cat -E interprets sentinel newline at the end of buffer as an actual newline after a \r
Date: Sat, 7 Aug 2021 19:29:06 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:84.0) Gecko/20100101 Thunderbird/84.0

On 07/08/2021 14:07, Michael Debertol wrote:
> Hi,
>
> after https://lists.gnu.org/archive/html/coreutils/2021-02/msg00003.html
> (unreleased), the behavior of cat -E was changed so that it prints "^M$"
> for "\r\n" line endings.
>
> Whenever it sees a \r, "cat -E" checks whether the next byte is a \n;
> however, that \n might be the sentinel value that is inserted at the end
> of a buffer.
>
> This is a problem in two cases:
>
> - When a \r is at the end of the input. `printf "\r" | cat -E` will
> print "^M", even though there is no "\n" after the "\r". FWIW,
> tests/misc/cat-E.sh expects a "^M" for a trailing "\r", but I think
> that's wrong.

This was intentional (as per the test): I was thinking we could provide
more info in the edge case where \r is the last char of a file.
However, it's incorrect, as you suggest, since cat can't treat files
independently.

> - When the file is too big to fit into one buffer. If you try to "cat
> -E" a big file (multiple megabytes) that consists of only "\r", cat will
> print a few "^M" whenever it hits the end of a buffer in the middle of
> the file and at the end.

That indeed is a bug.

So we need to track handling of \r across buffer and file boundaries.
The attached does that, and I'll apply later.
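
As an illustration of the idea only (this is not the attached patch, and it is written in Python rather than C), tracking a pending \r across boundaries could look roughly like the sketch below. The function name `show_ends`, the `pending_cr` flag, and the choice to pass a lone \r through unchanged are assumptions made for this example.

```python
from typing import Iterable


def show_ends(chunks: Iterable[bytes]) -> bytes:
    """Sketch: mark line ends, remembering a trailing CR between chunks.

    `chunks` stands for successive buffers, possibly coming from several
    input files in a row; no sentinel byte is ever inspected.
    """
    out = bytearray()
    pending_cr = False  # True if the previous byte was "\r", not yet emitted
    for chunk in chunks:
        for b in chunk:
            if pending_cr:
                pending_cr = False
                if b == 0x0A:
                    out += b"^M$\n"   # a real "\r\n" pair, even across chunks
                    continue
                out.append(0x0D)      # lone "\r": pass it through unchanged
            if b == 0x0D:
                pending_cr = True     # decide once the next byte is known
            elif b == 0x0A:
                out += b"$\n"
            else:
                out.append(b)
    if pending_cr:
        out.append(0x0D)              # "\r" at the very end of all input
    return bytes(out)


# "\r" and "\n" split across two buffers are still recognized as one pair.
print(show_ends([b"a\r", b"\nb\n"]))
# Buffers of only "\r" produce no "^M" at the boundaries.
print(show_ends([b"\r\r", b"\r\r", b"\r"]))
```

Because the flag lives outside the per-chunk loop, it does not matter whether a boundary falls between two buffers of one file or between two files.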

marking this as done,

thanks!
Pádraig

Attachment: cat-E-trailing-CR.patch
Description: Text Data


--- End Message ---
