--- Begin Message ---
Subject: [PATCH]: echo,printf,stat: Allow only up to 8 bit octal input for backslash-escaped chars
Date: Mon, 06 Dec 2010 17:34:03 +0100
Hi,
as reported in RHBZ#660033
( https://bugzilla.redhat.com/show_bug.cgi?id=660033 ), echo, printf and
stat allow 3 octal digits without limiting the value to 8 bits.
The documentation (man pages, info) refers to "byte with octal value" or
"8-bit octal value". Therefore 9-bit octal values should not be allowed.
Especially in echo, only unsigned char is used for storing this octal
number, so 9-bit values overflow.
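The overflow described above can be sketched as follows. This is only an illustration, not the echo source: octal_to_uchar is a hypothetical helper, and it assumes an 8-bit unsigned char.

```c
#include <stddef.h>

/* Hypothetical helper (not taken from echo.c): accumulate up to three
   octal digits into an unsigned char.  Assumes CHAR_BIT == 8, so any
   value above 0377 is reduced modulo 256 when it is stored.  */
unsigned char
octal_to_uchar (const char *digits)
{
  unsigned char c = 0;
  for (size_t i = 0; i < 3 && digits[i] >= '0' && digits[i] <= '7'; i++)
    c = (unsigned char) (c * 8 + (digits[i] - '0'));  /* ninth bit is lost here */
  return c;
}
```

Under these assumptions, the digits 400 yield 0 and the digits 610 yield 136 (392 modulo 256), while 377 still yields 255.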
I see two ways of fixing this:
a) change the documentation (mention only 1-3 octal digits of input,
dropping the words "8-bit" and "byte")
b) accept only octal values that fit in 8 bits
Because of the unsigned char overflow, I prefer the 8-bit limit, and that
is what the attached patch implements. As I don't expect anyone to notice
this (most users probably already limit these octals to 8 bits on their
own), I didn't add a NEWS entry. A test checking the output of
printf '\0610' is added. Previously it was interpreted as 392 and this was
passed to putchar(); after the patch it is interpreted as '\061' + '0'
=> 10.
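One way to picture the 8-bit limit is a parser that stops before any digit that would push the value above 0377. The sketch below is not the code from the attached patch; struct octal_escape and parse_octal_8bit are hypothetical names, and the input is assumed to be the octal digits that follow the escape prefix.

```c
#include <stddef.h>

/* Hypothetical result type: the decoded byte and how many digits were used.  */
struct octal_escape
{
  unsigned char byte;
  size_t length;
};

/* Hypothetical helper (not the attached patch): consume up to three
   octal digits from S, but refuse a digit that would make the value
   exceed 255.  Unconsumed digits are left for the caller to print
   literally.  */
struct octal_escape
parse_octal_8bit (const char *s)
{
  unsigned int value = 0;
  size_t n = 0;
  while (n < 3 && s[n] >= '0' && s[n] <= '7')
    {
      unsigned int next = value * 8 + (unsigned int) (s[n] - '0');
      if (next > 255)
        break;                  /* a third digit would need a ninth bit */
      value = next;
      n++;
    }
  struct octal_escape result = { (unsigned char) value, n };
  return result;
}
```

For the digits 610 this consumes only 61 (the character '1') and leaves the trailing 0 to be output as is, giving the "10" mentioned above; for 377 all three digits are consumed.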
I have also added the missing \NNN GNU extension to the --help output of echo.
Greetings,
Ondrej Vasik
escaped-ninebit-octal-char.patch
Description: Text Data
--- End Message ---
--- Begin Message ---
Subject: Re: bug#7574: [PATCH]: echo, printf, stat: Allow only up to 8 bit octal input for backslash-escaped chars
Date: Fri, 14 Jan 2011 15:47:13 +0100
Ondrej Vasik wrote:
> As the same request for a change in bash's builtin echo and printf
> (http://lists.gnu.org/archive/html/bug-bash/2010-12/msg00030.html and
> https://www.opengroup.org/sophocles/show_mail.tpl?CALLER=show_archive.tpl&source=L&listname=austin-group-l&id=15087
> ) was rejected, I think we should do the same here to keep echo and
> printf implementations as close as possible.
Keeping echo implementations in sync (between coreutils and shells)
is desirable, but as you imply below, keeping tools consistent, e.g.,
in how they handle an argument specified like \400 is important, too.
Note that GNU tr interprets \400 like this:
$ tr '\400' x
tr: warning: the ambiguous octal escape \400 is being
interpreted as the 2-byte sequence \040, 0
That does not match how printf interprets '\400':
$ printf '\400' | od -An -c
\0
> Anyway, it would be better to be consistent in all utilities, e.g.
> tr.c:502 now behaves the way proposed in the patch in this bugzilla -
> and at least document that the ninth bit is ignored in the address@hidden
> section of info documentation.
Good idea.
Patch below.
I'm marking this issue as "done", but feel free to open-and-retitle or
simply to file a new bug if you want to pursue this.
From 63fdb7c671cd0c05ba1b1710d0922016ce687362 Mon Sep 17 00:00:00 2001
From: Jim Meyering <address@hidden>
Date: Fri, 14 Jan 2011 15:45:58 +0100
Subject: [PATCH] doc: document how printf treats e.g., \400
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
* doc/coreutils.texi (printf invocation): Document that any
ninth bit in \OOO is ignored. Suggested by Ondřej Vašík in
http://debbugs.gnu.org/7574
---
doc/coreutils.texi | 5 ++++-
1 files changed, 4 insertions(+), 1 deletions(-)
diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index 85d5201..a51af26 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -11195,9 +11195,12 @@ printf invocation
@kindex address@hidden
@kindex address@hidden
@command{printf} interprets @address@hidden in @var{format} as an octal number
-(if @var{ooo} is 1 to 3 octal digits) specifying a character to print,
+(if @var{ooo} is 1 to 3 octal digits) specifying a byte to print,
and @address@hidden as a hexadecimal number (if @var{hh} is 1 to 2 hex
digits) specifying a character to print.
+Note however that when @address@hidden specifies a number larger than 255,
+the ninth bit is ignored. For example, @samp{printf '\400'} is equivalent
+to @samp{printf '\0'}.
@kindex \uhhhh
@kindex \Uhhhhhhhh
--
1.7.3.5
--- End Message ---