liberty-eiffel

Re: Libunicode


From: Eric Bezault
Subject: Re: Libunicode
Date: Fri, 7 Jan 2022 01:10:48 +0100

On 07/01/2022 0:05, Paolo Redaelli wrote:
> I also searched eiffel.org and gobo eiffel for some hints. Aren't they the "professionals"?

I think that neither is "professional". Only ISE and eiffel.com are.

Sure, eiffel.org is managed by people working at ISE, but it is supposed
to be a "community" platform.

As for Gobo Eiffel, it's no more professional than Liberty Eiffel.
I'm like you guys: I'm working on it in my spare time and it's free.
There are two main differences from Liberty Eiffel. First, Gobo Eiffel
uses the MIT license instead of the GNU license. Second, after having
tried for more than a decade to be usable with all available Eiffel
compilers, Gobo Eiffel now aims to be fully compatible with ISE Eiffel.
I mean, one should be able to take a project developed with ISE Eiffel
and compile it with Gobo Eiffel without having to change a single line
of code. See https://www.youtube.com/watch?v=faF8p5Qnbeo&t=210s

That being said, here is some info about Unicode support in ISE Eiffel
and Gobo Eiffel. Gobo Eiffel was a pioneer in supporting Unicode in
Eiffel because it was needed to implement the XML library. This was
the early days of Gobo Eiffel, and at that time we had to support 4
different Eiffel compilers, with STRING as the only string support.
We wanted the XML library to be usable both with these STRING classes
and with some Unicode variants. So we had to hack a lot to come up
with a class UC_STRING conforming to STRING (all 4 versions of
STRING coming from the 4 Eiffel vendors). And UC_UTF8_STRING was
one implementation of UC_STRING. The goal was to later implement
UC_UTF16_STRING and UC_UTF32_STRING but this was never done. And there
are some helper classes such as UC_UTF8_ROUTINES and UC_UTF16_ROUTINES.
The implementation is in pure Eiffel (so that it could be used by
all Eiffel compilers supported at that time) and uses a small
tool to generate some of the classes from Unicode data files for
different Unicode versions. The code is here:
https://github.com/gobo-eiffel/gobo/tree/master/library/kernel/src/unicode
I don't think that it is used apart from the Gobo XML library.
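
To illustrate what a UTF-8-backed string class has to deal with, here is
a minimal sketch in Python (not Eiffel, and not the Gobo code). The names
utf8_char_width and code_point_at are hypothetical, and the real
UC_UTF8_STRING may well work differently. The point is only that when
characters are stored as variable-width UTF-8 bytes, asking for the
character at a given position means walking the bytes from the start:

```python
def utf8_char_width(lead_byte: int) -> int:
    """Return how many bytes the UTF-8 sequence starting with lead_byte uses."""
    if lead_byte < 0x80:
        return 1  # 0xxxxxxx: plain ASCII
    if lead_byte >> 5 == 0b110:
        return 2  # 110xxxxx
    if lead_byte >> 4 == 0b1110:
        return 3  # 1110xxxx
    if lead_byte >> 3 == 0b11110:
        return 4  # 11110xxx
    raise ValueError("not a UTF-8 lead byte")


def code_point_at(data: bytes, index: int) -> int:
    """Return the code point of the character at 0-based character position index.

    Assumes data is valid UTF-8. The cost is linear in index because each
    preceding character may occupy between 1 and 4 bytes.
    """
    pos = 0
    for _ in range(index):
        pos += utf8_char_width(data[pos])  # skip one whole character
    width = utf8_char_width(data[pos])
    return ord(data[pos:pos + width].decode("utf-8"))


data = "a∃b".encode("utf-8")   # 3 characters stored in 5 bytes
second = code_point_at(data, 1)  # the code point of "∃"
```

A class that has to conform to a byte-oriented STRING while exposing
character positions needs some bookkeeping of this kind, which is one
reason fixed-width representations are simpler to index.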

Later, ISE Eiffel wanted to support Unicode for their EiffelVision
library. Not having to care about portability across different
Eiffel compilers, they decided to redesign the STRING class in the
kernel library. So now in EiffelBase they have the deferred class
STRING_GENERAL, with two descendants STRING_8 (aka STRING) and
STRING_32. (There are also other variants for immutable strings.)
STRING_32 is what they use to support Unicode, in particular in
EiffelVision2. There is no encoding: it's merely a sequence of
32-bit code points. Then they have some helper classes such as
UTF_CONVERTER when they need to interact with the "outside world",
such as reading or writing to a file, interfacing with the
Windows API, etc. The classes are here:
https://github.com/EiffelSoftware/libraries/tree/master/Src/library/base/elks/kernel/string
and: https://github.com/EiffelSoftware/libraries/blob/master/Src/library/base/elks/kernel/utf_converter.e
As in Gobo Eiffel, I think that everything is implemented in
pure Eiffel, which means that if you don't use Unicode, the Eiffel
compiler will not include the code in the executable.
Now, I don't remember exactly how ISE Eiffel specifies manifest
strings, but I think that if "foo" only contains 8-bit characters
then its type will be STRING_8, and if it contains some Unicode
characters such as "∃" then its type will be STRING_32. Also
class STRING_8 converts to class STRING_32, so if you have a
routine which expects a STRING_32, you can pass a STRING_8 and
it will be converted to a STRING_32. And finally, one can force
a manifest string to be of type STRING_32 with this notation:
{STRING_32} "foo".
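
The model described above (a plain sequence of code points inside the
program, with encoding happening only at the boundary) can be sketched
in a language-neutral way. The following is Python, not Eiffel, and the
function names are hypothetical; they are not the UTF_CONVERTER API. The
fits_in_8_bits check mirrors the manifest string rule as recalled above,
which is itself stated from memory and may not match the actual ISE
Eiffel rule:

```python
def code_points(text: str) -> list[int]:
    """In-memory form: one integer per character, with no encoding involved."""
    return [ord(ch) for ch in text]


def fits_in_8_bits(text: str) -> bool:
    """True if every character has a code point that fits in 8 bits.

    Assumption: this is the condition under which a manifest string
    would be typed as an 8-bit string rather than a 32-bit one.
    """
    return all(ord(ch) <= 0xFF for ch in text)


def to_utf8(points: list[int]) -> bytes:
    """Boundary conversion, e.g. before writing to a file."""
    return "".join(chr(p) for p in points).encode("utf-8")


def to_utf16_le(points: list[int]) -> bytes:
    """Boundary conversion, e.g. before calling a UTF-16 based API."""
    return "".join(chr(p) for p in points).encode("utf-16-le")


points = code_points("∃x")             # two code points, indexable in O(1)
as_file_bytes = to_utf8(points)         # variable-width bytes for storage
as_api_bytes = to_utf16_le(points)      # 16-bit units for a UTF-16 API
```

Keeping the internal form fixed-width makes indexing and counting
trivial, at the price of converting every time text crosses into or out
of the program.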

There are surely better ways to support Unicode in Eiffel, but those are
the two existing implementations that I'm aware of. And they are
probably not complete. I guess that they implement only the parts
that were needed at the time they were developed.

I hope that you find the above useful, even if it's not
"professional" documentation :-)

--
Eric Bezault
mailto:ericb@gobosoft.com
http://www.gobosoft.com


