Re: Emacs Lisp's future
From: David Kastrup
Subject: Re: Emacs Lisp's future
Date: Sat, 27 Sep 2014 10:49:37 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.4.50 (gnu/linux)

"Stephen J. Turnbull" <address@hidden> writes:
> Eli Zaretskii writes:
> > > Date: Fri, 26 Sep 2014 18:45:54 +0400
> > > From: Dmitry Antipov <address@hidden>
> > > Cc: address@hidden
> > >
> > > Why not just use ICU?
> >
> > Emacs needs to be able to extend the Unicode code-point space for raw
> > 8-bit bytes and for a couple of character sets that are not unified.
>
> No, you don't. There's plenty of private space for those purposes
> (unless you know of private character sets that use more than two
> whole planes?) Emacs would simply use an indirect representation for
> private space. (That is, code points in private space are not
> necessarily identical to the input code points, but rather are indexes
> into an auxiliary table which implements the disjoint sum of the
> private code spaces in use.)
>
> Since this is private space, you need to build a table of attributes
> for these characters (I/O representation, UCD properties, glyphs, etc)
> anyway. For Unicode input using private space, you just record that
> as the I/O representation.
>
> > Can ICU support that?
>
> Maybe it would be unhappy if you used a lone surrogate representation
> (or other representation using integers outside of the Unicode
> character space) for those "extended code points", but as proposed
> above you can efficiently use private space in practice.

Except that Emacs, as an editor, needs to support whatever private-use
code points its users might want to use. Hijacking the surrogates is a
reasonable compromise. Another option would have been to hijack the code
space beyond U+10FFFF (decimal 1114111) that the 4-byte form can still
encode: it lies outside of UTF-8 proper but inside the coding scheme's
logic, and thus works equally well for string manipulations. However,
that would cause unencodable bytes to take up more space. I think
LuaTeX may use that strategy.

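
To make the space argument concrete, here is a rough Python sketch that
encodes a raw byte both ways using the UTF-8 bit layout. The two mapping
functions and the base 0x110000 are assumptions chosen for illustration,
not the scheme Emacs or LuaTeX actually uses; the surrogate range
U+DC80..U+DCFF is the one Python's own `surrogateescape` error handler
uses for undecodable bytes.

```python
# Sketch only: compares the storage cost of two ways to carry an
# undecodable raw byte through a UTF-8-like internal representation.

def encode_extended(cp: int) -> bytes:
    """Encode cp with the UTF-8 bit layout, allowing lone surrogates and
    the full 21-bit range of the 4-byte form (up to 0x1FFFFF)."""
    if cp < 0:
        raise ValueError("negative code point")
    if cp < 0x80:
        return bytes([cp])
    if cp < 0x800:
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp < 0x10000:
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    if cp <= 0x1FFFFF:
        return bytes([0xF0 | (cp >> 18),
                      0x80 | ((cp >> 12) & 0x3F),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    raise ValueError("does not fit the 4-byte form")

# Hypothetical mappings for a raw byte b in 0x80..0xFF.
def raw_byte_as_surrogate(b: int) -> int:
    return 0xDC00 + b      # lone low surrogate, U+DC80..U+DCFF

def raw_byte_beyond_unicode(b: int) -> int:
    return 0x110000 + b    # just past U+10FFFF (decimal 1114111)

raw = 0xFF
surrogate_form = encode_extended(raw_byte_as_surrogate(raw))
beyond_form = encode_extended(raw_byte_beyond_unicode(raw))
print(len(surrogate_form), len(beyond_form))
```

Under these assumptions the surrogate form costs three bytes per raw
byte and the beyond-Unicode form costs four, which is the extra space
referred to above.
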
Being an editor, Emacs has to be more circumspect than most other
encoding-sensitive applications about what it may work with since
everything that is "private" may well be within the range that a user
wants to be able to put into string literals.
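
A minimal Python sketch of the indirect private-space table proposed in
the quoted text may help show where the friction comes from. The class
name, its methods, and the source labels are all hypothetical, and it
allocates only from the BMP Private Use Area (U+E000..U+F8FF) to keep
the example short.

```python
# Sketch only: an indirect private-use allocator in the spirit of the
# quoted proposal. All names here are hypothetical.

PUA_START, PUA_END = 0xE000, 0xF8FF  # BMP Private Use Area

class PrivateSpace:
    def __init__(self):
        self._next = PUA_START
        self._to_internal = {}  # (source, external value) -> internal code point
        self._to_external = {}  # internal code point -> (source, external value)

    def intern(self, source: str, value: int) -> int:
        """Return the internal code point for an external value,
        allocating a fresh private-use slot on first sight."""
        key = (source, value)
        if key not in self._to_internal:
            if self._next > PUA_END:
                raise OverflowError("private space exhausted")
            self._to_internal[key] = self._next
            self._to_external[self._next] = key
            self._next += 1
        return self._to_internal[key]

    def external(self, cp: int):
        """Recover the (source, external value) pair recorded for cp."""
        return self._to_external[cp]

space = PrivateSpace()
raw_ff = space.intern("raw-byte", 0xFF)          # the editor's own use
user_e000 = space.intern("unicode-pua", 0xE000)  # user typed U+E000 in a literal
print(hex(raw_ff), hex(user_e000), space.external(user_e000))
```

In this sketch the user's U+E000 is stored under a different internal
code point, so anything that inspects internal values, string literals
included, has to go through the table to see what the user wrote.
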
--
David Kastrup