emacs-devel

Re: Unicode support for the MS Windows clipboard


From: Jason Rumney
Subject: Re: Unicode support for the MS Windows clipboard
Date: Thu, 27 May 2004 16:43:29 +0100
User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.7) Gecko/20040514

Benjamin Riefenstahl wrote:

> Jason Rumney <address@hidden> writes:
>
>> If that is the case, it might be better to get rid of
>> w32-clipboard-type as a user variable, and determine the type
>> automatically from selection-coding-system instead. cp<900 should
>> map to OEM, utf16 to unicode, and others to ANSI.
>
> Can we just assume this?  Does "cp<900" really guarantee OEM?

Possibly Thai is an exception, and maybe Vietnamese. We can make exceptions where necessary, but basically all the OEM codepages that are not also used as ANSI codepages are in the sub-900 range. The DBCS codepages in the 900-1000 range are used as both ANSI and OEM codepages, so either CF_TEXT or CF_OEMTEXT would be valid for them, though CF_TEXT is probably more widely recognized.
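
To make the proposed mapping concrete, here is a rough sketch in Python rather than in the actual Emacs code. The function name and the exception set are made up for illustration, and treating cp874 (Thai) as an ANSI codepage is only the guess mentioned above. The CF_* values are the standard Win32 clipboard format numbers.

```python
# Win32 clipboard format identifiers (standard values from winuser.h).
CF_TEXT = 1
CF_OEMTEXT = 7
CF_UNICODETEXT = 13

# Hypothetical exception list: sub-900 codepages that are also used as
# ANSI codepages. 874 (Thai) is an assumption taken from the remark above.
ANSI_EXCEPTIONS = {874}


def clipboard_format_for(coding_system: str) -> int:
    """Guess a clipboard format from a name such as 'cp437-dos'."""
    name = coding_system.lower()
    # Drop an end-of-line suffix if one is present.
    for suffix in ("-dos", "-unix", "-mac"):
        if name.endswith(suffix):
            name = name[: -len(suffix)]
            break
    # Any UTF-16 variant maps to the Unicode clipboard format.
    if name.startswith(("utf-16", "utf16")):
        return CF_UNICODETEXT
    # cpNNN below 900 is treated as OEM unless listed as an exception.
    if name.startswith("cp") and name[2:].isdigit():
        codepage = int(name[2:])
        if codepage < 900 and codepage not in ANSI_EXCEPTIONS:
            return CF_OEMTEXT
    # Everything else, including the DBCS codepages, falls back to ANSI.
    return CF_TEXT
```

With this rule, cp437 and cp850 would select CF_OEMTEXT, while cp932 and cp1252 would select CF_TEXT.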

> How do we know that some exotic private trick coding system isn't useful for
> CF_UNICODETEXT

The encoding of CF_UNICODETEXT does not vary, so utf-16-le (or maybe -be) is the only coding-system that is appropriate. As mentioned, we could map other utf coding systems automatically onto the right one, to avoid the user having to know too many details.
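
As an illustration of why the encoding is fixed, a CF_UNICODETEXT payload can be sketched like this in Python (not the Emacs implementation). The function name is hypothetical, and little-endian byte order is assumed. The CR LF line ends and the terminating null character follow the documented clipboard format.

```python
def to_cf_unicodetext(text: str) -> bytes:
    """Encode text as a CF_UNICODETEXT-style payload (sketch)."""
    # Normalise line ends to CR LF, without doubling existing CR LF pairs.
    normalized = text.replace("\r\n", "\n").replace("\n", "\r\n")
    # UTF-16 little-endian without a BOM, plus a 16-bit null terminator.
    return normalized.encode("utf-16-le") + b"\x00\x00"
```

Whatever utf coding system the user has selected, the bytes placed on the clipboard would be the same, which is why mapping all of them onto one coding system seems safe.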

> or for CF_OEMTEXT

CF_OEMTEXT is defined as the default console codepage for that version of Windows. Although it is theoretically possible for the user to have customized it beyond the limited set that comes out of the box with different localised versions of Windows, it really isn't that interesting to us because other applications probably wouldn't support those non-default encodings either. I doubt there are many (if any) applications that support CF_OEMTEXT but not CF_TEXT, so it is probably better to just ignore it until someone comes up with a reason why we should support it.

>> Also, we should set (and read) CF_LOCALE when we are using CF_TEXT,
>> to indicate the coding we have used.
>
> I'll have to look that up; I'm not familiar with CF_LOCALE.

I think the problem I had with that was finding a locale given an ANSI codepage. In the case where CF_LOCALE is the default system locale, we don't need to set it, and in other cases we would be better off using CF_UNICODETEXT, so maybe this is not worth pursuing.

>> When reading from the clipboard, if CF_UNICODETEXT is present, it might
>> be better to use that (ignoring selection-coding-system).
>
> Could we get into trouble with the MULE problem here?  Or does
> unify-8859-on-{en,de}coding solve this for all cases?
>
>> On the other hand, some Chinese characters are still not covered by
>> Emacs' unicode support (even with utf-translate-cjk-mode), [...]
>> Big5 is definitely not entirely covered).
>
> If those characters are not supported by Unicode, how does Windows
> support them, which is based on Unicode after all?  Does it support
> them at all?  Or does it use the private characters for this?

Maybe I am imagining a problem that is not there. Having checked again, it seems the problem I saw with a character not being displayed, which I thought was due to an unsupported character, was actually due to a character (in Chinese Traditional text) being decoded as japanese-jisx0212, which I don't have a font for. I can still see this being a major problem for Chinese users though.


  character: [] (0254137, 88159, 0x1585f, U+4F60)
    charset: japanese-jisx0212 (JISX0212 Japanese supplement: ISO-IR-159.)
 code point: 48 95
     syntax: w  which means: word
   category: C:Chinese (Han) characters of 2-byte character sets j:Japanese
             |:While filling, we can break a line at this character.
buffer code: 0x94 0xB0 0xDF
  file code: not encodable by coding system mule-utf-8-dos
    display: no font available


> PS: The default for selection-coding-system should be cpXXXX-dos, not
> just cpXXXX.  Otherwise I get <LF> as line ends instead of <CR><LF>
> when I copy non-ASCII text.  Which then doesn't work well with
> Notepad, of course.

Thanks. The code to make sure selection-coding-system used the dos end-of-line variant seems to have been removed in my previous changes.
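
The check that went missing amounts to something like the following, shown here as a Python sketch rather than the actual Emacs code. The function name is hypothetical; the -dos, -unix and -mac suffixes are the usual end-of-line variants of an Emacs coding system name.

```python
EOL_SUFFIXES = ("-dos", "-unix", "-mac")


def force_dos_eol(coding_system: str) -> str:
    """Return the CR LF (dos) variant of a coding system name (sketch)."""
    for suffix in EOL_SUFFIXES:
        if coding_system.endswith(suffix):
            # Replace an explicit end-of-line variant with the dos one.
            return coding_system[: -len(suffix)] + "-dos"
    # No variant given: append the dos suffix.
    return coding_system + "-dos"
```

So cp1252 and cp1252-unix would both become cp1252-dos before being used for the clipboard.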



