--- Begin Message ---
Subject: 23.1.50; \u and \x in string
Date: Mon, 02 Nov 2009 00:31:17 -0500
"\ue1" gives the error "Non-hex digit used for Unicode escape".
Why doesn't it work to give the Unicode character á?
Note that \xe1 does not work for this any more.
It gives a different character, which displays as \341 and
is described as follows by C-x =:
Char: \341 (4194273, #o17777741, #x3fffe1, raw-byte) point=442 of 2980 (15%) column=0
That too is confusing, and certainly not documented clearly where \x
is explained. Is there any way to specify unicode e1 with \x?
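For reference, the behavior described above can be sketched as follows. This is only an illustration, assuming an Emacs in which \u must be followed by exactly four hex digits; the `my-' names are hypothetical and the exact error signaled for the short form may differ between versions.

```elisp
;; Hypothetical names (prefix `my-') used only for illustration.

;; \u takes exactly four hex digits, so U+00E1 is written \u00e1.
(defvar my-unicode-string "\u00e1")

;; \xe1 on its own reads as a unibyte string holding the raw byte #xe1.
(defvar my-raw-string "\xe1")

(defun my-describe-first-char (string)
  "Return (CODE . MULTIBYTE-P) for the first character of STRING."
  (cons (aref string 0) (multibyte-string-p string)))

(my-describe-first-char my-unicode-string)    ; (225 . t)
(my-describe-first-char my-raw-string)        ; (225 . nil)

;; Converted to multibyte, the raw byte becomes the raw-byte character
;; #x3fffe1, the same code that C-x = reports above.
(defvar my-raw-as-multibyte
  (aref (string-to-multibyte my-raw-string) 0)) ; 4194273

;; The two-digit form "\ue1" is rejected by the reader; capture the
;; error symbol instead of letting the error propagate.
(defvar my-short-escape-error
  (condition-case err
      (read "\"\\ue1\"")
    (error (car err))))
```

The first two results show that both strings hold the value 225, but only the \u form yields a multibyte string containing the character á.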
In GNU Emacs 23.1.50.4 (mipsel-unknown-linux-gnu, GTK+ Version 2.12.12)
of 2009-08-11 on theobromine2
configured using `configure 'CFLAGS=-O0 -g -Wno-pointer-sign'
'mipsel-unknown-linux-gnu' 'build_alias=mipsel-unknown-linux-gnu'
'host_alias=mipsel-unknown-linux-gnu' 'target_alias=mipsel-unknown-linux-gnu''
Important settings:
value of $LC_ALL: nil
value of $LC_COLLATE: nil
value of $LC_CTYPE: nil
value of $LC_MESSAGES: nil
value of $LC_MONETARY: nil
value of $LC_NUMERIC: nil
value of $LC_TIME: nil
value of $LANG: en_US.UTF-8
value of $XMODIFIERS: nil
locale-coding-system: utf-8-unix
default-enable-multibyte-characters: t
Major mode: RMAIL Edit
Minor modes in effect:
shell-dirtrack-mode: t
diff-auto-refine-mode: t
gpm-mouse-mode: t
display-battery-mode: t
tooltip-mode: t
tool-bar-mode: t
menu-bar-mode: t
file-name-shadow-mode: t
global-font-lock-mode: t
font-lock-mode: t
global-auto-composition-mode: t
auto-composition-mode: t
auto-encryption-mode: t
auto-compression-mode: t
line-number-mode: t
transient-mark-mode: t
abbrev-mode: t
Recent input:
b R TAB RET ESC < C-u C-n C-u C-u C-n C-u C-n C-n C-n
C-n C-f 4 b o u t C-_ C-x b o u t - 2 2 RET C-a C-p
C-x 4 b R TAB RET C-u ESC x c o m p a r e RET C-x o
C-x o C-x b RET C-b C-b C-b C-b | ESC C-x C-x C-s C-x
b RET C-x o C-b C-b C-x ESC ESC ESC p ESC p RET C-x
o C-x o C-x o C-x C-g C-x 4 b RET C-a ESC f C-f C-@
ESC C-f ESC w ESC : C-y RET C-x o ESC : ( l o o k i
n g - a t SPC C-y ) RET C-x o C-e ESC b ESC d 2 4 0
ESC C-x C-x o ESC : ESC p RET C-x = C-x o o C-_ C-x
o ESC : ESC p C-e ESC DEL ESC DEL ESC DEL " \ 2 4 0
DEL DEL DEL x a 0 " ) RET C-u C-x = C-\ a ' C-g e C-x
= C-f a ' C-b C-x = ESC : ESC p C-e C-b C-b ESC DEL
DEL C-\ a ' C-e RET C-x = ESC : ESC p C-e C-b C-b DEL
\ 3 4 1 RET C-x = ESC : ESC p C-e C-b C-b DEL DEL DEL
x e 1 RET C-x = ESC : ESC p C-e C-b C-b C-b C-b DEL
u C-e RET ESC : ESC p C-e C-b C-b C-b C-b ESC u C-e
RET ESC : ESC p C-e C-b C-b C-b C-b 0 0 C-e RET ESC
x r e p o r t SPC e m a c s SPC b u g RET
Recent messages:
Char: =e1 (225, #o341, #xe1) point=1382 of 28873 (5%) column=57
t
Char: =e1 (225, #o341, #xe1) point=1382 of 28873 (5%) column=57
nil
Char: =e1 (225, #o341, #xe1) point=1382 of 28873 (5%) column=57
nil
Char: =e1 (225, #o341, #xe1) point=1382 of 28873 (5%) column=57
let: Non-hex digit used for Unicode escape [2 times]
t
Source file `/home/rms/emacs-cvs/lisp/mail/emacsbug.el' newer than
byte-compiled file
Load-path shadows:
None found.
--- End Message ---
--- Begin Message ---
Subject: Re: bug#4848: 23.1.50; \u and \x in string
Date: Mon, 13 Jun 2016 22:45:33 -0400
"Non-ASCII In Strings" now (as of 24.5) says the following, which
explains why "\xN" produces unibyte characters:
You can also use hexadecimal escape sequences (‘\xN’) and octal
escape sequences (‘\N’) in string constants. *But beware:* If a string
constant contains hexadecimal or octal escape sequences, and these
escape sequences all specify unibyte characters (i.e., less than 256),
and there are no other literal non-ASCII characters or Unicode-style
escape sequences in the string, then Emacs automatically assumes that it
is a unibyte string. That is to say, it assumes that all non-ASCII
characters occurring in the string are 8-bit raw bytes.
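A small sketch of the rule quoted above, for illustration only. The helper `my-string-kind' is a hypothetical name and not part of Emacs; it just wraps `multibyte-string-p'.

```elisp
;; Hypothetical helper, not part of Emacs: classify a string constant.
(defun my-string-kind (string)
  "Return the symbol `multibyte' or `unibyte' depending on STRING."
  (if (multibyte-string-p string) 'multibyte 'unibyte))

;; Only hex or octal escapes below 256: the reader assumes unibyte.
(my-string-kind "\xe1")        ; unibyte
(my-string-kind "\341")        ; unibyte

;; A Unicode-style escape anywhere in the string makes it multibyte.
(my-string-kind "\u00e1")      ; multibyte
(my-string-kind "\341\u00e1")  ; multibyte
```

So writing the character as \u00e1 rather than \xe1 avoids the unibyte assumption.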
--- End Message ---