From: GNU bug Tracking System
Subject: bug#49925: closed (cat -E interprets sentinel newline at the end of buffer as an actual newline after a \r)
Date: Sat, 07 Aug 2021 18:30:01 +0000

Your message dated Sat, 7 Aug 2021 19:29:06 +0100
with message-id <14378fe4-0b51-fa6c-b060-2ad5bc5d719f@draigBrady.com>
and subject line Re: bug#49925: cat -E interprets sentinel newline at the end of buffer as an actual newline after a \r
has caused the debbugs.gnu.org bug report #49925,
regarding cat -E interprets sentinel newline at the end of buffer as an actual newline after a \r
to be marked as done.

(If you believe you have received this mail in error, please contact help-debbugs@gnu.org.)

--
49925: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=49925
GNU Bug Tracking System
Contact help-debbugs@gnu.org with problems

--- Begin Message ---
Subject: cat -E interprets sentinel newline at the end of buffer as an actual newline after a \r
Date: Sat, 7 Aug 2021 15:07:32 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:92.0) Gecko/20100101 Thunderbird/92.0a1

Hi,

after https://lists.gnu.org/archive/html/coreutils/2021-02/msg00003.html (unreleased), the behavior of cat -E was changed so that it prints "^M$" for "\r\n" line endings.

Whenever it sees a \r, "cat -E" checks whether the next byte is a \n; however, that \n might be the sentinel value that is inserted at the end of a buffer.

This is a problem in two cases:

- When a \r is at the end of the input. `printf "\r" | cat -E` will print "^M", even though there is no "\n" after the "\r". FWIW, tests/misc/cat-E.sh expects a "^M" for a trailing "\r", but I think that's wrong.

- When the file is too big to fit into one buffer. If you try to "cat -E" a big file (multiple megabytes) that consists of only "\r", cat will print a few "^M" whenever it hits the end of a buffer in the middle of the file and at the end.

Michael
--- End Message ---
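
Editorial note: the behavior described in the report above can be modelled in a few lines. This is only a simplified sketch under stated assumptions, not the coreutils source: the function name `show_ends_buggy`, the `SENTINEL` constant, and the representation of input as a list of byte chunks (one per buffer fill) are all hypothetical. It shows how looking one byte past a \r can land on the end-of-buffer sentinel instead of on real input.

```python
SENTINEL = b"\n"  # assumed: a newline placed after the last byte of each buffer


def show_ends_buggy(chunks):
    """Model of the reported behavior (hypothetical, not the real cat code).

    Each chunk stands for one buffer fill. A CR is printed as ^M whenever
    the following byte is a newline, whether or not that newline is real.
    """
    out = bytearray()
    for chunk in chunks:
        buf = chunk + SENTINEL  # sentinel sits right after the real data
        for i in range(len(chunk)):
            b = buf[i]
            if b == 0x0A:
                out += b"$\n"  # real newline: mark the line end
            elif b == 0x0D and buf[i + 1] == 0x0A:
                # Problem: buf[i + 1] may be the sentinel, not input.
                out += b"^M"
            else:
                out.append(b)
    return bytes(out)


if __name__ == "__main__":
    print(show_ends_buggy([b"\r"]))              # lone trailing CR
    print(show_ends_buggy([b"\r\r", b"\r\r"]))   # CR-only input split over two buffers
```

In this model a lone "\r" comes out as "^M", and a CR-only input split over two buffers gets a "^M" at each buffer end, which matches the two cases listed in the report.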
--- Begin Message ---
Subject: Re: bug#49925: cat -E interprets sentinel newline at the end of buffer as an actual newline after a \r
Date: Sat, 7 Aug 2021 19:29:06 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:84.0) Gecko/20100101 Thunderbird/84.0

On 07/08/2021 14:07, Michael Debertol wrote:
> Hi,
>
> after https://lists.gnu.org/archive/html/coreutils/2021-02/msg00003.html (unreleased), the behavior of cat -E was changed so that it prints "^M$" for "\r\n" line endings.
>
> Whenever it sees a \r, "cat -E" checks whether the next byte is a \n; however, that \n might be the sentinel value that is inserted at the end of a buffer.
>
> This is a problem in two cases:
>
> - When a \r is at the end of the input. `printf "\r" | cat -E` will print "^M", even though there is no "\n" after the "\r". FWIW, tests/misc/cat-E.sh expects a "^M" for a trailing "\r", but I think that's wrong.

This was intentional (as per the test), as I was thinking we could provide more info in the edge case where \r is the last char of a file. However, it's incorrect as you suggest, as cat can't treat files independently.

> - When the file is too big to fit into one buffer. If you try to "cat -E" a big file (multiple megabytes) that consists of only "\r", cat will print a few "^M" whenever it hits the end of a buffer in the middle of the file and at the end.

That indeed is a bug. So we need to track handling of \r across buffer and file boundaries. The attached does that, and I'll apply it later.

Marking this as done, thanks!

Pádraig

Attachment: cat-E-trailing-CR.patch
Description: Text Data
--- End Message ---
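
Editorial note: the idea of tracking \r across buffer and file boundaries can be illustrated with a small state machine. This is a minimal sketch and not the attached patch: the name `show_ends` and the `pending_cr` flag are hypothetical, and the input is assumed to arrive as a sequence of byte chunks that may come from different buffer fills or different files. A \r is held back until the next real byte is known, so no sentinel is ever consulted.

```python
def show_ends(chunks):
    """Sketch of -E style output with CR state carried across chunks.

    Hypothetical illustration only. A CR is shown as ^M only when the next
    real input byte is a newline, even if that byte is in a later chunk.
    """
    out = bytearray()
    pending_cr = False  # True when the previous byte was a CR not yet emitted
    for chunk in chunks:
        for b in chunk:
            if pending_cr:
                pending_cr = False
                if b == 0x0A:
                    out += b"^M$\n"  # CR LF pair, possibly split over chunks
                    continue
                out.append(0x0D)  # CR not followed by LF: pass through as-is
            if b == 0x0D:
                pending_cr = True  # decide once the next byte is seen
            elif b == 0x0A:
                out += b"$\n"
            else:
                out.append(b)
    if pending_cr:
        out.append(0x0D)  # trailing CR at end of all input stays a raw CR
    return bytes(out)


if __name__ == "__main__":
    print(show_ends([b"a\r", b"\nb"]))      # CR LF split across two chunks
    print(show_ends([b"\r\r", b"\r\r"]))    # CR-only input, no newline anywhere
```

Because the flag outlives each chunk, a "\r\n" pair split across two chunks is still shown as "^M$", while CR-only input is passed through unchanged however it is chunked.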