emacs-devel

Re: Emacs rewrite in a maintainable language


From: David Kastrup
Subject: Re: Emacs rewrite in a maintainable language
Date: Sat, 17 Oct 2015 17:38:22 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/25.0.50 (gnu/linux)

Ludovic Courtès writes:

> Guile strings are fine, thank you.  I’ve used a bunch of
> language/environments and honestly, I’m definitely not ashamed of what
> Guile provides, contrary to what David and you seem to imply.

It's quite irrelevant whether somebody is proud or ashamed, rightfully
or not, of what is there.  It does not fit the bill for Emacs and can't be
coaxed into fitting the bill without substantial changes.  Its character
range is hard-limited at the end of Unicode.  It has no representation
for non-utf8 code bytes.  It has no additional character codes available
for those (let alone the "overlong" 2-byte sequences that Emacs employs
for representing raw bytes 128-255 transparently).  It cannot faithfully
represent a string where only parts are proper utf-8 and the rest is to
be reproduced exactly even when doing string operations on the utf-8
parts.  Its input and output encodings are not independent of the
locale.
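
To make the raw-byte requirement concrete, here is a minimal Python
sketch of the idea.  The function names are hypothetical, and the exact
byte layout (lead byte 0xC0 or 0xC1 followed by one continuation byte) is
an assumption about how the "overlong" 2-byte sequences are laid out; it
is not spelled out in this message.  Valid utf-8 passes through
untouched, every other byte becomes an overlong pair, and the original
bytes come back exactly even after editing the utf-8 parts.

```python
def to_internal(data: bytes) -> bytes:
    """Keep valid UTF-8 as-is; store every other byte as an overlong pair."""
    out = bytearray()
    i = 0
    while i < len(data):
        for n in (1, 2, 3, 4):
            chunk = data[i:i + n]
            try:
                chunk.decode("utf-8")  # strict decoder rejects overlong forms
            except UnicodeDecodeError:
                continue
            out += chunk               # one complete, valid UTF-8 character
            i += len(chunk)
            break
        else:
            b = data[i]                # raw byte, necessarily in 0x80..0xFF
            out += bytes([0xC0 | ((b >> 6) & 1), 0x80 | (b & 0x3F)])
            i += 1
    return bytes(out)


def from_internal(buf: bytes) -> bytes:
    """Turn overlong pairs back into the raw bytes they stand for."""
    out = bytearray()
    i = 0
    while i < len(buf):
        b0 = buf[i]
        if b0 in (0xC0, 0xC1) and i + 1 < len(buf):
            b1 = buf[i + 1]
            out.append(0x80 | ((b0 & 1) << 6) | (b1 & 0x3F))
            i += 2
        else:
            out.append(b0)
            i += 1
    return bytes(out)


# Mixed input: a proper utf-8 character, two stray bytes, then ASCII.
data = "é".encode("utf-8") + b"\xff\x80" + b"abc"
internal = to_internal(data)
# A string operation on the text part leaves the raw bytes intact.
edited = internal.replace(b"abc", b"xyz")
restored = from_internal(edited)
```

Because the lead bytes 0xC0 and 0xC1 never occur in valid utf-8, the
pairs cannot be confused with real characters, which is what makes the
round trip lossless in this sketch.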

The underlying string handling libraries are not even _prepared_ to deal
with such requirements.  Nor are they prepared for passing such strings
transparently through and working with them.  In addition, Guile does
not actually have utf-8 strings but represents strings as either 8-bit
or 32-bit fixed-width entities.  Which makes indexing efficient but
incurs conversions even if you are just working with utf-8 like Emacs
does.
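
The trade-off can be sketched in Python, used here purely as a stand-in
and not as a description of Guile's actual code.  The helper names are
hypothetical.  A 32-bit fixed-width copy makes indexing a simple offset
calculation, but data that arrives as utf-8 has to be converted on the
way in and again on the way out.

```python
def to_wide(utf8: bytes) -> bytes:
    """Convert utf-8 input to 32-bit fixed-width storage (4 bytes per char)."""
    return utf8.decode("utf-8").encode("utf-32-le")


def from_wide(wide: bytes) -> bytes:
    """Convert fixed-width storage back to utf-8 for output."""
    return wide.decode("utf-32-le").encode("utf-8")


def char_at(wide: bytes, index: int) -> str:
    """Constant-time indexing: character N starts at byte offset 4 * N."""
    start = 4 * index
    return wide[start:start + 4].decode("utf-32-le")


utf8 = "aλ€".encode("utf-8")   # 1 + 2 + 3 bytes in utf-8
wide = to_wide(utf8)           # 3 characters * 4 bytes each
third = char_at(wide, 2)
```

Finding the third character in the utf-8 form would require scanning
from the start; in the wide form it is one slice, at the price of the
two conversions.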

In spite of only storing its strings as either 8-bit Latin-1 or 32-bit
UTF-32, many of the primitives of Guile have a preference for utf-8: you
cannot even pass UTF-32 into Guile or get it out easily in spite of it
being its internal format.

But when GUILE is used as an extension language rather than as a sole
implementation language, you'll need to pass strings in and out of it
constantly.  And the only format where strings will get passed without
conversion is Latin-1.  But when telling GUILE that everything is
Latin-1, you have only a limited amount of reasonably working string
operations at your disposal and you won't get to see character codes
larger than 255.
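
The Latin-1 point can be illustrated with a short Python sketch (Python
is only a stand-in here; the helper name is hypothetical).  Latin-1 maps
byte value N to character code N, so any byte string passes through
unchanged, but utf-8 text viewed that way falls apart into one
"character" per byte and no code above 255 ever shows up.

```python
def latin1_view(data: bytes) -> str:
    """Interpret bytes as Latin-1: byte value N becomes character code N."""
    return data.decode("latin-1")


# Every possible byte survives the round trip without conversion loss.
all_bytes = bytes(range(256))
assert latin1_view(all_bytes).encode("latin-1") == all_bytes

# utf-8 text seen through the Latin-1 view: one "character" per byte.
utf8 = "λ".encode("utf-8")            # two bytes, 0xCE 0xBB
view = latin1_view(utf8)
codes = [ord(c) for c in view]        # both codes are below 256

# The real character cannot be stored in Latin-1 at all.
try:
    "λ".encode("latin-1")
    representable = True
except UnicodeEncodeError:
    representable = False
```

String operations that count or classify characters then work on the
individual bytes rather than on the characters the bytes were meant to
encode.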

All this most definitely is nothing that could not be fixed.  But doing
that is comparatively harder when all attempts to do so or to bring the
problems to attention are met with defiance.

-- 
David Kastrup
