From: David Lamkins
Subject: Re: [Bug-apl] How do I convert a byte sequence to Unicode?
Date: Sun, 27 Apr 2014 22:24:40 -0700
To convert byte values to code points, you need to apply an encoding algorithm, and that's kind of messy. (I believe the rest of GNU APL kind of assumes that UTF-8 is the standard encoding used, which does make things simpler.)

I have a suggestion: make ⎕UCS support a dyadic form where the left-hand side specifies the encoding to use, i.e.:

      'UTF-8' ⎕UCS 99 100 101 102

Handling multiple encodings is easily done through the libiconv library. I worked on it when I made some improvements to its Common Lisp integration. It's quite simple to use.

Regards,
Elias

On 28 April 2014 12:49, David B. Lamkins <address@hidden> wrote:
That's close, but libfileio[8] returns a sequence of byte values, not
code points.
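
To make the byte/code-point distinction concrete, here is a minimal sketch in Python rather than APL. It is only an illustration of the decoding step, not GNU APL behaviour. The helper name decode_bytes and its encoding parameter are hypothetical; they loosely mirror the proposed left argument of a dyadic ⎕UCS. The byte values assume the file stores 'foo⍉bar' as UTF-8.

```python
def decode_bytes(byte_values, encoding="utf-8"):
    """Decode a sequence of byte values (0-255) into a list of code points.

    Hypothetical helper: 'encoding' plays the role of the proposed
    left-hand argument in the thread's dyadic form of the UCS function.
    """
    text = bytes(byte_values).decode(encoding)
    return [ord(ch) for ch in text]


# Bytes as they would be read from a UTF-8 file containing 'foo⍉bar'.
# The single character U+2349 (9033) occupies three bytes: 226 141 137.
raw = [102, 111, 111, 226, 141, 137, 98, 97, 114]

code_points = decode_bytes(raw)
print(code_points)
print("".join(chr(cp) for cp in code_points))
```

Nine byte values become seven code points, which is why passing the raw bytes straight to a function that expects code points gives the wrong text for anything outside ASCII.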
On Mon, 2014-04-28 at 12:19 +0800, Elias Mårtenson wrote:
> Use the quad function ⎕UCS:
>
>
> ⎕UCS 'foo⍉bar'
> 102 111 111 9033 98 97 114
> ⎕UCS 102 111 111 9033 98 97 114
> foo⍉bar
>
>
> Regards,
> Elias
>
>
> On 28 April 2014 12:17, David B. Lamkins <address@hidden> wrote:
> I can use lib_file_io to read a sequence of byte values from a
> file
> containing Unicode text.
>
> How do I convert that sequence back to a Unicode string in GNU
> APL?
>
>
>
>
>