Re: string-map arg order
From: Dirk Herrmann
Subject: Re: string-map arg order
Date: Wed, 5 Sep 2001 22:10:54 +0200 (MEST)
On 4 Sep 2001, Alex Shinn wrote:
> >>>>> "Dirk" == Dirk Herrmann <address@hidden> writes:
>
> Dirk> Further, you would not start by making everything utf-32.
> Dirk> Rather, you would start with a 1-byte width and only
> Dirk> increase width as necessary, which is at most 2 times:
> Dirk> 1->2->4. With a variable width encoding, you may have to
> Dirk> increase the size (n * (m-1)) times, n being the string
> Dirk> length, m being the maximum character width. Further,
>
> In the context of multi-threading, I'm not sure resizing is even an
> option. For whatever API we choose, ultimately external C library
> functions will be given a pointer to the characters (char or wchar) of
> a string. If we reallocate the string from another thread, that
> pointer will then be invalid.
Resizing is an option if you make sure that no memory region that is still
in use gets freed. That's the reason why I introduced the separation of a
memory-region object and the string objects that use it. See:
http://mail.gnu.org/pipermail/guile-devel/2000-November/000586.html
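To illustrate the separation described above, here is a minimal sketch. It is written in Python rather than Guile's C, with Python's reference counting standing in for the garbage collector. The names Region, Str and widen are made up for this illustration and are not taken from the linked proposal; the sketch also assumes the initial text fits in one byte per character.

```python
class Region:
    """A block of character storage with a fixed element width (hypothetical)."""

    def __init__(self, width, codepoints):
        self.width = width  # bytes per character: 1, 2 or 4
        self.data = b"".join(cp.to_bytes(width, "little") for cp in codepoints)

    def codepoints(self):
        w = self.width
        return [int.from_bytes(self.data[i:i + w], "little")
                for i in range(0, len(self.data), w)]


class Str:
    """A string object that refers to a Region instead of embedding its storage."""

    def __init__(self, text):
        # Assumption: every character of the initial text fits in one byte.
        self.region = Region(1, [ord(c) for c in text])

    def widen(self, new_width):
        # Build a fresh region and switch to it. The old region is neither
        # modified nor freed here; it lives until its last user drops it.
        self.region = Region(new_width, self.region.codepoints())


s = Str("abc")
borrowed = s.region   # stands in for a pointer handed to a C function
s.widen(2)            # "resize" while the borrower is still active
print(borrowed.data)  # the old storage is still intact
print(s.region.data)  # the string itself now uses the wider region
```

The point of the sketch is only that the string object switches to a new region while anyone still holding the old region keeps a valid view of the old contents.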
> An alternative implementation is to always allocate 4 bytes per
> character (with *either* fixed- or variable-byte), and expand in place
> as needed. Why work with single-byte strings in 4x the space? So
> that you don't have to convert when passing to C functions. What
> steered me away most from fixed-width encodings is coming up with a
> decent API. The rest of the world (other languages, GTK, FreeType,
> Linux itself) are moving to utf8 - if we choose another encoding,
> we'll have to convert data types back and forth constantly. And the
> possibility of different string types or wide strings means all
> current extensions would have to update to the new API right away -
> with utf8 they're safe so long as they stick to ASCII, and could
> upgrade at their leisure.
I don't understand this argument: as long as you stick to ASCII, the
fixed-width strings would also stay exactly as they are now, since they
start out at one byte per character and are only widened when needed.
However, if the rest of the world really is moving to utf8, then that is an
argument in favour of using it. Still, I would expect major performance
drawbacks.
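The width-promotion scheme quoted at the top of this message can be sketched as follows. This is only an illustration under stated assumptions, in Python rather than C: the names width_needed and AdaptiveString are hypothetical, and the 1-byte tier is assumed to cover code points below 0x100. It shows that an ASCII-only string never leaves the 1-byte representation, and that a string is widened at most twice (1->2->4) no matter how long it grows.

```python
def width_needed(cp):
    """Smallest fixed width in bytes (1, 2 or 4) that can hold code point cp."""
    if cp < 0x100:
        return 1
    if cp < 0x10000:
        return 2
    return 4


class AdaptiveString:
    """Fixed-width string that starts at 1 byte per character and widens on demand."""

    def __init__(self, text=""):
        self.width = 1
        self.cps = []
        self.promotions = 0  # how many times the whole string was widened
        for ch in text:
            self.append(ch)

    def append(self, ch):
        need = width_needed(ord(ch))
        if need > self.width:
            # A real implementation would reallocate and copy here.
            self.width = need
            self.promotions += 1
        self.cps.append(ord(ch))

    def to_bytes(self):
        return b"".join(cp.to_bytes(self.width, "little") for cp in self.cps)


ascii_only = AdaptiveString("hello")
print(ascii_only.width, ascii_only.promotions)

# 'a', e-acute, euro sign, and a code point outside the 16-bit range
mixed = AdaptiveString("a\u00e9\u20ac\U0001F600")
print(mixed.width, mixed.promotions)
```

For the ASCII-only string the stored bytes are identical to its utf8 encoding, which is why existing extensions that stick to ASCII would be unaffected under either representation.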
Best regards
Dirk Herrmann