bug-gnu-emacs
From: Peter Dyballa
Subject: bug#12803: 24.3.50; accented Thai Unicode characters are turned into decomposed ones on Mac OS X by replace-regexp
Date: Sun, 4 Nov 2012 23:35:58 +0100

Hello!

I wanted to get the unique Thai characters from this email subject:

        FW:grcthai สร้างรายได้แบบไร้ขีดจำกัด กับการทำงานแบบไร้ขอบเขต..

So I marked the Thai text and invoked replace-regexp with "\(.\)" -> "\1 ",
intending to follow up with replace-string " " -> "C-q C-j" and then run
[g]sort -u on the result. Buffer *Shell Command Output* then contained
decomposed Thai Unicode characters…

But it is actually replace-regexp itself that produces the decomposed
characters (originally 41 characters; after replace-regexp not 82 but 89,
according to column-number-mode).
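
The counts above can be cross-checked outside Emacs. What follows is a minimal sketch, assuming Python 3 and its standard unicodedata module; the function names describe and unique_chars are hypothetical and used only for illustration. It counts code points and nonspacing marks (general category Mn) separately, tests whether the text is already in NFC, and lists the unique characters, which is the result the replace-regexp / sort -u steps were meant to produce. Thai tone marks and upper vowels are separate code points that take no column of their own, which may account for a column count that differs from the code point count; that reading is an assumption, not something verified against Emacs.

```python
import unicodedata


def describe(text):
    """Return (code point count, nonspacing mark count, already-NFC flag)."""
    # Category "Mn" covers Thai tone marks and upper/lower vowel signs.
    marks = sum(1 for ch in text if unicodedata.category(ch) == "Mn")
    # True when NFC normalization leaves the text unchanged.
    is_nfc = unicodedata.normalize("NFC", text) == text
    return len(text), marks, is_nfc


def unique_chars(text):
    """Return the distinct code points of text, sorted, ignoring SPACE."""
    return sorted(set(text) - {" "})


if __name__ == "__main__":
    subject = "สร้างรายได้แบบไร้ขีดจำกัด กับการทำงานแบบไร้ขอบเขต"
    print(describe(subject))
    for ch in unique_chars(subject):
        # Print each code point with its number and Unicode name.
        print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
```

Running it on the marked region gives a reference against which to compare what C-u C-x = reports in the buffer before and after replace-regexp.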

Mac OS X 10.6.8; the fonts used are FreeSerif for the Thai characters and
George Williams' Monospace Regular for SPACE. The result is the same when I
use GTK2, and it also makes no difference when I use a native 64-bit binary
(and libs).


In GNU Emacs 24.3.50.1 (i386-apple-darwin10.8.0, X toolkit, Xaw3d scroll bars)
 of 2012-11-04 on Sumac.local
Bzr revision: 110798 eggert@cs.ucla.edu-20121104172952-vvhdy8gmbtgj0c3w
Windowing system distributor `The X.Org Foundation', version 11.0.11300000
Configured using:
 `configure '--build=x86_64-apple-darwin10.8.0'
 '--host=i386-apple-darwin10.8.0' '--target=i386-apple-darwin10.8.0'
 '--without-pop' '--without-sound' '--without-gpm' '--without-dbus'
 '--without-selinux' '--with-x-toolkit=athena'
 '--disable-ns-self-contained' '--without-xpm' '--without-jpeg'
 '--without-tiff' '--without-gif' '--without-png'
 '--x-libraries=/usr/X11/lib' '--x-includes=/usr/X11/include'
 '--enable-locallisppath=/Library/Application
 Support/Emacs/calendar24:/Library/Application Support/Emacs'
 'CFLAGS=-g3 -H -pipe -fPIC -fno-common -Os -march=core2 -mtune=core2
 -m32 -fomit-frame-pointer -msse4.2' 'LDFLAGS=-m32
 -Wl,-dead_strip_dylibs -Wl,-bind_at_load -Wl,-t'
 'CPPFLAGS=-I/sw/include' 'CC=clang' 'CXX=clang++'
 
'PKG_CONFIG_PATH=/sw/lib/xft2/lib/pkgconfig:/sw/share/pkgconfig:/sw/lib/pkgconfig:/usr/X11/lib/pkgconfig:/usr/X11/share/pkgconfig:/usr/lib/pkgconfig'
 'build_alias=x86_64-apple-darwin10.8.0'
 'host_alias=i386-apple-darwin10.8.0'
 'target_alias=i386-apple-darwin10.8.0''

Important settings:
  value of $LC_CTYPE: de_DE.UTF-8
  value of $LANG: de_DE.UTF-8
  locale-coding-system: utf-8-unix
  default enable-multibyte-characters: t

Major mode: Lisp Interaction

Minor modes in effect:
  tooltip-mode: t
  mouse-wheel-mode: t
  tool-bar-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  blink-cursor-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  column-number-mode: t
  line-number-mode: t
  transient-mark-mode: t

Recent input:
<down-mouse-1> <mouse-1> <help-echo> <down-mouse-1> 
<mouse-1> <down-mouse-2> <mouse-2> <down-mouse-1> <mouse-1> 
<backspace> C-a <escape> x r e p l <tab> r e g <tab> 
<return> \ ( . \ ) <return> \ 1 SPC <return> M-x c 
o l <tab> <return> C-a C-u C-x = <right> C-u C-x = 
<right> C-u C-x = <right> <right> C-u C-x = <help-echo> 
<help-echo> <help-echo> <help-echo> <help-echo> <help-echo> 
<help-echo> <help-echo> <menu-bar> <help-menu> <send-emacs-bug-report>

Recent messages:
Replaced 48 occurrences
Column-Number mode enabled
Type C-x 1 to delete the help window, C-M-v to scroll help.
Char: ส (3626, #o7052, #xe2a, file ...) point=192 of 287 (67%) column=0

Char: SPC (32, #o40, #x20) point=193 of 287 (67%) column=1

Char: ร (3619, #o7043, #xe23, file ...) point=194 of 287 (67%) column=2

Char: ้ (3657, #o7111, #xe49, file ...) point=196 of 287 (68%) column=4

Load-path shadows:
None found.

Features:
(shadow sort gnus-util mail-extr emacsbug message format-spec rfc822 mml
mml-sec mm-decode mm-bodies mm-encode mail-parse rfc2231 mailabbrev
gmm-utils mailheader sendmail rfc2047 rfc2045 ietf-drums mm-util
mail-prsvr mail-utils pp wid-edit descr-text help-mode easymenu
cus-start cus-load thai-util thai-word mule-util time-date tooltip
ediff-hook vc-hooks lisp-float-type mwheel x-win x-dnd tool-bar dnd
fontset image regexp-opt fringe tabulated-list newcomment lisp-mode
register page menu-bar rfn-eshadow timer select scroll-bar mouse
jit-lock font-lock syntax facemenu font-core frame cham georgian
utf-8-lang misc-lang vietnamese tibetan thai tai-viet lao korean
japanese hebrew greek romanian slovak czech european ethiopic indian
cyrillic chinese case-table epa-hook jka-cmpr-hook help simple abbrev
minibuffer loaddefs button faces cus-face macroexp files text-properties
overlay sha1 md5 base64 format env code-pages mule custom widget
hashtable-print-readable backquote make-network-process dynamic-setting
system-font-setting font-render-setting x-toolkit x multi-tty emacs)


--
Greetings

  Pete

The problem with the French is that they don't have a word for « entrepreneur ».
                                – Georges W. Bush





