guile-devel

Re: Unicode and Guile


From: Tom Lord
Subject: Re: Unicode and Guile
Date: Sat, 25 Oct 2003 17:03:18 -0700 (PDT)


    > From: Stephen Compall <address@hidden>


    > UTF-16 seems to offer the worst of
    > both worlds.
    [being both wide compared to 8-bit characters and involving
     variable-length Unicode character encodings, I presume.]


It's culturally discriminatory to regard UTF-16 as worse than UTF-8
in those regards.

Or, put differently, for many potential users, UTF-16 is the best of
both worlds: it optimizes the size of the most common characters (for
some users), and it can also handle any Unicode character.
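To make the size point concrete, here is a minimal sketch in Python
(chosen only because its codecs are easy to check; nothing here is
Guile API). The helper name `encoded_sizes` and the sample strings are
made up for illustration. It compares the encoded length of the same
text in UTF-8 and in UTF-16 without a byte-order mark.

```python
def encoded_sizes(text):
    """Return (utf8_bytes, utf16_bytes) for text, with no byte-order mark."""
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))

# Illustrative samples (assumptions, not taken from the thread).
latin = "hello"                                      # 5 ASCII characters
cjk = "\u65e5\u672c\u8a9e\u306e\u30c6\u30ad\u30b9\u30c8"  # 8 BMP characters
astral = "\U0001D11E"                                # 1 character outside the BMP

sizes_latin = encoded_sizes(latin)    # (5, 10): UTF-8 is half the size
sizes_cjk = encoded_sizes(cjk)        # (24, 16): UTF-16 is two thirds the size
sizes_astral = encoded_sizes(astral)  # (4, 4): UTF-16 needs a surrogate pair
```

Which encoding is "compact" therefore depends on whose text is being
stored, and both are variable-length once characters outside the Basic
Multilingual Plane appear.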



    > As for the semantics, I submit the way Emacs does it: node (elisp)Text
    > Representations, or
    > http://www.gnu.org/manual/elisp-manual-21-2.8/html_node/elisp_542.html

What do the index arguments to STRING-REF and STRING-SET! refer to?
Byte positions or character positions?

Personally, I think they refer to byte positions and that new errors
can result from them (if the index isn't at a character boundary).
(Too bad that (1+ index) no longer means (next-character string
index)).  There's a need for a new type, `text', which acts like the
text contents of an emacs buffer and has (yes I agree) pretty much the
Emacs interface.  It should all be designed so that, internally,
people can write new ways to represent text objects and multiple text
object representations can coexist in the same application (just like
emacs).  There's no good reason not to throw in attributes, overlays,
and markers for text objects too (just like emacs).  ("There's nothing
new under the sun.")   And, eventually, people should mostly stop
using the STRING? type altogether except internally to implementations
of TEXT? and as a way to represent non-textual strings of bytes.
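As a rough sketch of what byte-position indexing implies, assuming a
UTF-8 representation: the Python below (again not Guile code) shows an
index that can land in the middle of a character, and a hypothetical
`next_character` helper that replaces the old `(1+ index)` idiom. Both
function names are invented for illustration.

```python
def is_char_boundary(data, i):
    """True if byte offset i starts a UTF-8 character or is the end of data."""
    if i == len(data):
        return True
    if not 0 <= i < len(data):
        return False
    # UTF-8 continuation bytes have the bit pattern 10xxxxxx.
    return (data[i] & 0xC0) != 0x80

def next_character(data, i):
    """Return the byte offset of the character following the one at offset i."""
    if i == len(data) or not is_char_boundary(data, i):
        raise IndexError("byte index %d is not at a character boundary" % i)
    i += 1
    # Skip the continuation bytes of the current character.
    while i < len(data) and (data[i] & 0xC0) == 0x80:
        i += 1
    return i

# "a" is 1 byte, U+00E9 is 2 bytes, U+65E5 is 3 bytes: 6 bytes in total.
data = "a\u00e9\u65e5".encode("utf-8")

offsets = [0]
while offsets[-1] < len(data):
    offsets.append(next_character(data, offsets[-1]))
# offsets is now [0, 1, 3, 6]; offsets 2, 4 and 5 are mid-character.
```

Asking for the character at offset 2 is the new kind of error: the
offset is in range but points into the middle of a two-byte character.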

"Everything is UTF-32" isn't going to be practical for a long time and
then, after it is, the first roughly humanoid space-aliens to show up
with news of a life-filled galaxy will mean UTF-32 won't be practical
all over again :-)

-t



