
Re: \c-handling in $'-strings


From: Chet Ramey
Subject: Re: \c-handling in $'-strings
Date: Mon, 31 Aug 2015 10:17:17 -0400
User-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:38.0) Gecko/20100101 Thunderbird/38.2.0

On 8/28/15 7:28 PM, Helmut Karlowski wrote:
> Hello
> 
> The bash-manual says:
> 
> Words of the form $'string' are treated specially.  The word expands to
> string, with backslash-escaped  characters replaced as specified by the
> ANSI C standard.  Backslash escape sequences, if present, are decoded as
> follows:
> 
> ...
> 
>  \cx    a control-x character
> 
> Now when I run this:
> 
> {
>   echo $LINENO $'\h\ca\ek'
>   echo $LINENO $'\h\cA\ek'
>   echo $LINENO $'\h\cd\ek'
>   echo $LINENO $'\h\c\d\ek'
>   echo $LINENO $'\h\c|d\ek'
>   echo $LINENO $'\h\c<d\ek'
>   echo $LINENO $'\h\c d\ek'
>   echo $LINENO $'\h\\c d\ek'
> } | tee /dev/stderr | od -ax
> 
> I get (output pasted from my editor):
> 
> 2 \h^A^[k
> 3 \h^A^[k
> 4 \h^D^[k
> 5 \h^\d^[k
> 6 \h^\d^[k
> 7 \h^\d^[k
> 8 \h
> 9 \h\c d^[k
> 0000000   2  sp   \   h soh esc   k  nl   3  sp   \   h soh esc   k  nl
>            2032    685c    1b01    0a6b    2033    685c    1b01    0a6b
> 0000020   4  sp   \   h eot esc   k  nl   5  sp   \   h  fs   d esc   k
>            2034    685c    1b04    0a6b    2035    685c    641c    6b1b
> 0000040  nl   6  sp   \   h  fs   d esc   k  nl   7  sp   \   h  fs   d
>            360a    5c20    1c68    1b64    0a6b    2037    685c    641c
> 0000060 esc   k  nl   8  sp   \   h  nl   9  sp   \   h   \   c  sp   d
>            6b1b    380a    5c20    0a68    2039    685c    635c    6420
> 0000100 esc   k  nl
>            6b1b    000a
> 0000103
> 
> I wonder about the lines 6, 7, 8: 6,7: all non-alnum-characters (here | and
> <) are printed as 0x1c?

Conversion to a control character is done by ANDing the character's code
with 0x1f, since the valid control character range is 0-0x1f.  If the
character following \c is not one that maps to a valid control character
this way, you get undefined results.
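
As a rough illustration of the masking described above, the sketch below prints the result of ANDing a character's code with 0x1f. The helper name `ctrl_mask` is hypothetical and is not part of bash; it only reproduces the arithmetic, not bash's actual parser.

```shell
# Hypothetical helper (not a bash builtin): print the value obtained by
# ANDing a character's code with 0x1f.
ctrl_mask() {
  # printf '%d' "'X" yields the numeric code of the character X.
  code=$(printf '%d' "'$1")
  printf '0x%02x\n' $(( code & 0x1f ))
}

ctrl_mask 'a'   # 0x61 & 0x1f -> 0x01 (soh)
ctrl_mask 'A'   # 0x41 & 0x1f -> 0x01 (soh)
ctrl_mask 'd'   # 0x64 & 0x1f -> 0x04 (eot)
ctrl_mask '\'   # 0x5c & 0x1f -> 0x1c (fs)
ctrl_mask '|'   # 0x7c & 0x1f -> 0x1c (fs)
ctrl_mask '<'   # 0x3c & 0x1f -> 0x1c (fs)
```

This matches the od output quoted above: `\`, `|` and `<` all share the same low five bits, so each one shows up as fs (0x1c).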

There is a table in

http://pubs.opengroup.org/onlinepubs/9699919799/utilities/stty.html#tag_20_123

that has the list of valid characters.

> And line 8: Why is the output truncated after '\c '?

Space is outside the range of a control character, and, as it happens,
<space>&0x1f == 0.  The NUL causes the string to be truncated.
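
The arithmetic for the space case can be checked the same way. This is only a sketch: the variable names are made up for illustration, and the final od line is there to inspect what a given bash actually produces, which may differ between versions.

```shell
# Space is 0x20 (decimal 32); none of its bits fall inside the 0x1f mask.
space_code=$(printf '%d' "' ")
masked=$(( space_code & 0x1f ))
echo "space=$space_code masked=$masked"   # space=32 masked=0

# To see how your own shell handles the sequence, dump the bytes it emits.
# The exact output depends on the shell and its version, so none is shown.
printf '%s\n' $'\h\c d\ek' | od -c
```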

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
                 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, ITS, CWRU    chet@case.edu    http://cnswww.cns.cwru.edu/~chet/


