bug-gnu-emacs

From: Kenichi Handa
Subject: Re: [address@hidden: gtk2, iso14755, pasting non-ascii characters, and the x-windows clipboard]
Date: Thu, 18 Dec 2003 20:28:21 +0900 (JST)
User-agent: SEMI/1.14.3 (Ushinoya) FLIM/1.14.2 (Yagi-Nishiguchi) APEL/10.2 Emacs/21.3 (sparc-sun-solaris2.6) MULE/5.0 (SAKAKI)

In article <3FE17861.9090809@uni-bonn.de>, josh buhl <uzs33d@uni-bonn.de> 
writes:

> Kenichi Handa wrote:
>>> However, the gtk2 apps and the non-gtk2 apps aside from emacs, all
>>> seem to be able to paste this text in from each other properly. Only
>>> emacs has this problem.
>>  
>>  Perhaps that is because the other apps use the UTF8_STRING
>>  request on selection (which is an XFree86 extension) but Emacs 21.3
>>  uses only the COMPOUND_TEXT request (the X standard).  The latest
>>  CVS version of Emacs supports UTF8_STRING.

> That sounds plausible. If I tried to checkout and compile the latest cvs 
> of emacs to test this, would I have to somehow enable utf8_string, or 
> would it be automatically supported?

In CVS Emacs, we introduced this variable.

----------------------------------------------------------------------
x-select-request-type's value is nil

*Data type request for X selection.
The value is nil, one of the following data types, or a list of them:
  `COMPOUND_TEXT', `UTF8_STRING', `STRING', `TEXT'

If the value is nil, try `COMPOUND_TEXT' and `UTF8_STRING', and
use the more appropriate result.  If both fail, try `STRING', and
then `TEXT'.

If the value is one of the above symbols, try only the specified
type.

If the value is a list of them, try each of them in the specified
order until succeed.
----------------------------------------------------------------------
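
Going by the docstring above, the variable could be set in an init
file roughly as follows.  This is only a sketch: it assumes a CVS
build in which x-select-request-type exists, and it uses nothing
beyond the symbols the docstring lists.  The two forms are
alternatives; use one or the other.

```elisp
;; Alternative A: request only UTF8_STRING from the selection owner.
(setq x-select-request-type 'UTF8_STRING)

;; Alternative B: try each type in this order until one request succeeds.
(setq x-select-request-type '(UTF8_STRING COMPOUND_TEXT STRING TEXT))
```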

As the default is still nil, Emacs tries both COMPOUND_TEXT
and UTF8_STRING.  And to decide "the more appropriate
result", we currently do this:

;; Helper function for x-selection-value.  Select UTF8 or CTEXT
;; whichever is more appropriate.  Here, we use this heuristic.
;;
;;   (1) If their lengths are different, select the longer one.  This
;;   is because an X client may just cut off unsupported characters.
;;
;;   (2) Otherwise, if the Nth character of CTEXT is an ASCII
;;   character that is different from the Nth character of UTF8,
;;   select UTF8.  This is because an X client may replace unsupported
;;   characters with some ASCII character (typically ` ' or `?') in
;;   CTEXT.
;;
;;   (3) Otherwise, select CTEXT.  This is because legacy charsets are
;;   better for the current Emacs, especially when the selection owner
;;   is also Emacs.
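
The three rules above can be sketched as a small function.  This is
an illustration written from the comment alone, not the code in CVS;
the name my-select-utf8-or-ctext is hypothetical, and both arguments
are assumed to be already-decoded strings.

```elisp
;; Hypothetical name; a sketch of rules (1)-(3), not the CVS code.
(defun my-select-utf8-or-ctext (utf8 ctext)
  "Return UTF8 or CTEXT, whichever the heuristic prefers."
  (let ((len-utf8 (length utf8))
        (len-ctext (length ctext)))
    (if (/= len-utf8 len-ctext)
        ;; (1) Different lengths: take the longer string.
        (if (> len-utf8 len-ctext) utf8 ctext)
      (let ((selected ctext)            ; (3) default to CTEXT
            (i 0))
        (while (< i len-ctext)
          (let ((c (aref ctext i)))
            ;; (2) An ASCII character in CTEXT that differs from UTF8
            ;; suggests a substituted character, so prefer UTF8.
            (if (and (< c 128) (/= c (aref utf8 i)))
                (setq selected utf8
                      i len-ctext)      ; stop scanning
              (setq i (1+ i)))))
        selected))))
```

For instance, (my-select-utf8-or-ctext "stra\u00dfe" "stra?e") picks
the UTF8 string, because the `?' in CTEXT is an ASCII character that
differs from the character at the same position in UTF8.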

But considering the described behaviour of gtk2, it seems
that we should test (2) first.
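
To see why the order matters: if gtk2 really sends \x{00DF} in
COMPOUND_TEXT, then CTEXT has 8 characters while UTF8 has 1, so rule
(1) selects the garbled CTEXT before rule (2) is ever consulted.  A
sketch of one possible reordering follows.  The function name is
hypothetical, and comparing only the common prefix of the two strings
is an assumption about how (2) would apply when the lengths differ.

```elisp
;; Hypothetical name; one possible reading of "test (2) first".
(defun my-select-utf8-or-ctext-2-first (utf8 ctext)
  "Choose between UTF8 and CTEXT, applying rule (2) before rule (1)."
  (let* ((len-utf8 (length utf8))
         (len-ctext (length ctext))
         (limit (min len-utf8 len-ctext))
         (i 0)
         (mismatch nil))
    ;; (2) Scan the common prefix for an ASCII character in CTEXT
    ;; that differs from the character at the same position in UTF8.
    (while (and (not mismatch) (< i limit))
      (let ((c (aref ctext i)))
        (when (and (< c 128) (/= c (aref utf8 i)))
          (setq mismatch t)))
      (setq i (1+ i)))
    (cond (mismatch utf8)
          ;; (1) UTF8 is longer: take it.
          ((> len-utf8 len-ctext) utf8)
          ;; (1) with CTEXT longer, or (3) equal lengths: take CTEXT.
          (t ctext))))
```

With this order, (my-select-utf8-or-ctext-2-first "\u00df" "\\x{00DF}")
selects the UTF8 string, since the leading backslash in CTEXT is an
ASCII character that differs from the first character of UTF8.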

>>  ???  Then, in what locale were you running gtk2 apps when
>>  pasting didn't work?

> The system default, which is no default language (as recommended during 
> the Debian locales configuration script for multi-language systems), so 
> just POSIX:

I see.  I suspect that gtk2's COMPOUND_TEXT encoder produces
\x{...} because Latin-1 accented letters are not supported
in that locale.

[...]

> But like I said, I can open a terminal, set LC_ALL=en_US.utf8, start 
> emacs, and the pasting does not work (but only for emacs, it still works 
> with other apps). *HOWEVER*, if I log out, select any of the available 
> locales for the session language in the gdm login, e.g. de_DE.ISO-8859-1 
> or en_US.UTF-8, and then login, then all the pasting works properly.

> I suppose that the session locale setting might also alter the way the X 
> selection buffer deals with the marked text.

Perhaps.  As the selection owner has no way to know in which
locale a selection requester is running, it is likely that
gtk2 assumes that the requester is in the session locale.

>>> The garbled text corresponds exactly to the unicode hex encodings for
>>> the characters. For example, the unicode hex encoding of ß is 00DF and
>>> emacs displays the pasted-in ß as \x{00DF}. This certainly isn't a 
>>> coincidence.
>>  
>>  
>>  Emacs never generates such \x{.....} notation automatically.
>>  So the text must have been generated on the sender side.

> This corroborates the suggestion that the session locale setting is also 
> affecting the text in the x selection buffer. But there's still the 
> question (except for your utf8-string explanation) of why other apps can 
> insert this, but emacs can't.

As I wrote, I think they request UTF8_STRING first, and a
UTF-8 encoder always encodes all characters correctly
regardless of the current locale.

---
Ken'ichi HANDA
handa@m17n.org



