Re: I'm looking for a method of converting a string's character encoding

From: Daniel Krueger
Subject: Re: I'm looking for a method of converting a string's character encoding
Date: Mon, 30 Apr 2012 12:18:59 +0200

On Sat, Apr 28, 2012 at 10:55 PM, Eli Zaretskii <address@hidden> wrote:
> One notable example is when the original encoding was determined
> incorrectly, and the application wants to "re-decode" the string, when
> its external origin is no longer available.

Okay, but then I would suggest one of two things. If you know you
probably don't have the right encoding yet but can determine it later,
store the input only as a bytevector and decode it once the encoding
is known. Or, if you already have the string, encode it back to a
bytevector with the wrongly guessed encoding (which should reproduce
the original input, I think) and then re-decode it with the right
encoding. Wouldn't that be the same solution as adding a primitive
which does the same thing but on some lower level?
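
A minimal sketch of that round trip, assuming a Guile that provides the
(ice-9 iconv) module with string->bytevector and bytevector->string.
The helper name re-decode is hypothetical, and the trick only works
when the wrongly guessed encoding maps every input byte to a distinct
character, as ISO-8859-1 does:

```scheme
;; Sketch only: assumes (ice-9 iconv) is available in this Guile.
(use-modules (ice-9 iconv))

;; Hypothetical helper: undo a wrong decoding, then decode again.
;; Encoding STR with WRONG-ENCODING recovers the original bytes only if
;; that encoding is lossless for the input (true for ISO-8859-1).
(define (re-decode str wrong-encoding right-encoding)
  (bytevector->string (string->bytevector str wrong-encoding)
                      right-encoding))

;; The UTF-8 bytes C3 A9 encode one character, U+00E9.
;; Misread as ISO-8859-1 they become two characters, U+00C3 and U+00A9.
(define misread (bytevector->string #vu8(#xC3 #xA9) "ISO-8859-1"))

;; Re-decoding gives back the single intended character.
(define repaired (re-decode misread "ISO-8859-1" "UTF-8"))
```

Here misread has length 2 and repaired has length 1, so no new
primitive is needed as long as the wrong decoding did not lose bytes.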

> Another example is an
> application that wants to convert an encoded string into base-64 (or
> similar) form -- you'll need to encode the string internally first.

Here I don't have enough experience, but wouldn't you then just again
transform the string into a bytevector and work with that from there?
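
As a rough illustration, assuming only the (rnrs bytevectors) module:
once the string is a bytevector, a base-64 encoder needs to know
nothing about text encodings. The procedure bytevector->base64 below
is a hypothetical hand-written helper using the RFC 4648 alphabet, not
something Guile is known to provide:

```scheme
(use-modules (rnrs bytevectors))

(define base64-alphabet
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/")

;; Hypothetical helper: base-64 encode the bytes of BV, with '=' padding.
(define (bytevector->base64 bv)
  (define len (bytevector-length bv))
  ;; Bytes past the end count as zero; padding hides them in the output.
  (define (byte i) (if (< i len) (bytevector-u8-ref bv i) 0))
  (define (digit n) (string-ref base64-alphabet (logand n 63)))
  (let loop ((i 0) (acc '()))
    (if (>= i len)
        (list->string (reverse acc))
        ;; Pack up to three bytes into 24 bits, then emit four 6-bit digits.
        (let ((n (logior (ash (byte i) 16)
                         (ash (byte (+ i 1)) 8)
                         (byte (+ i 2))))
              (remaining (- len i)))
          ;; ACC is built in reverse, so the last digit is consed first.
          (loop (+ i 3)
                (cons* (if (> remaining 2) (digit n) #\=)
                       (if (> remaining 1) (digit (ash n -6)) #\=)
                       (digit (ash n -12))
                       (digit (ash n -18))
                       acc))))))

;; Encode the string to bytes first (UTF-8 here), then to base-64.
(bytevector->base64 (string->utf8 "hello"))   ; => "aGVsbG8="
```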

> IOW, Guile needs a way to represent a string encoded in something
> other than UTF-8, and convert between UTF-8 and other encodings.

I think strings should be encoding `independent', so you don't have to
care about encodings unless you need to. If you're working with a
specific encoding, you're working on a representation of the `text' as
a sequence of characters encoded as numbers, so you use a bytevector.

The only thing I'm not sure about is whether Guile supports encoding a
string (into a bytevector) in some format other than UTF-8. If no such
procedures exist, I would suggest adding a string-to-bytevector
encoder which takes the string and the encoding (or just procedures
which convert the string directly into a bytevector in a specific
encoding).
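
For illustration, a short sketch of what such a call looks like,
assuming a Guile whose (ice-9 iconv) module provides a
string->bytevector that takes the string and an encoding name; whether
a given installation ships that module is an assumption here:

```scheme
;; Sketch only: assumes (ice-9 iconv) is available in this Guile.
(use-modules (ice-9 iconv) (rnrs bytevectors))

;; A one-character string holding U+00E9.
(define text (string (integer->char #xE9)))

;; The same string yields different bytes depending on the encoding.
(define as-utf8   (string->utf8 text))                    ; bytes C3 A9
(define as-latin1 (string->bytevector text "ISO-8859-1")) ; byte  E9
```

The string itself stays encoding independent; the encoding is chosen
only at the point where bytes are produced.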


reply via email to

[Prev in Thread] Current Thread [Next in Thread]