Re: Unicode, ports and encoding

guile-devel

From:	Mike Gran
Subject:	Re: Unicode, ports and encoding
Date:	Tue, 17 Feb 2009 15:45:32 -0800 (PST)

> From: Ludovic Courtès <address@hidden>
> > Mike Gran writes:

> >     This implies that a source code file should have syntax to
> >     indicate its own encoding, if it is not ASCII.  Something akin to
> >     the <meta> line in HTML files.
> 
> One could imagine special treatment of, say, the first 10 lines of a
> file, with the ability to recognize Emacs file variables like
> "-*- coding: utf-8 -*-" and to change the current port transcoder
> accordingly, something like that.

Yeah.  Something like that.
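
To make the idea concrete, here is a minimal sketch in C of scanning the
first few lines of a buffer for an Emacs-style "coding: NAME" declaration.
The function name scan_coding_declaration and its signature are
hypothetical, not an existing Guile API, and the set of characters accepted
in an encoding name is an assumption.  It only extracts the name; switching
the port's transcoder is left out.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of Guile: search the first MAX_LINES
   lines of BUF (LEN bytes) for "coding:" followed by an encoding name.
   On success, copy the NUL-terminated name into OUT (truncated to fit
   OUT_SIZE) and return 1.  Return 0 if no declaration is found.  */
int
scan_coding_declaration (const char *buf, size_t len, int max_lines,
                         char *out, size_t out_size)
{
  static const char key[] = "coding:";
  const size_t key_len = sizeof key - 1;
  int line = 0;
  size_t i = 0;

  if (out_size == 0)
    return 0;

  while (i < len && line < max_lines)
    {
      if (buf[i] == '\n')
        {
          /* Count lines so the scan stops after MAX_LINES.  */
          line++;
          i++;
          continue;
        }

      if (len - i >= key_len && memcmp (buf + i, key, key_len) == 0)
        {
          size_t j = i + key_len;
          size_t n = 0;

          /* Skip blanks between "coding:" and the name.  */
          while (j < len && (buf[j] == ' ' || buf[j] == '\t'))
            j++;

          /* Assumed name syntax: letters, digits, '-', '_' and '.'.  */
          while (j < len && n + 1 < out_size
                 && (isalnum ((unsigned char) buf[j])
                     || buf[j] == '-' || buf[j] == '_' || buf[j] == '.'))
            out[n++] = buf[j++];

          out[n] = '\0';
          if (n > 0)
            return 1;
        }

      i++;
    }

  return 0;
}
```

A reader could call this on the first block of bytes it pulls from a file
port, with max_lines set to 10, and fall back to the default encoding when
it returns 0.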

> IIRC, the first step you suggested was the implementation of wide
> string/char types.  Did you also work on this?

Sort of.

I thought I could start there, but it isn't easy.  There is a lot that could
be broken by modifying string processing, so I tried writing some tests
first to check my work as I go along.  But the tests have to be
non-ASCII, so they need to be converted when they are read in.
It gets a little bit circular to use scm_from_locale_string to convert
the non-ASCII strings in the test source, and then have the tests check
the behavior of scm_from_locale_string.

So now I think a better route is to make some type of simplified
transcoding available to ports first, so that non-ASCII
tests are read in correctly.  From there, one can work up toward wide
strings and chars while checking the work along the way.
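
As a rough illustration of the smallest decoding step such a port would
need, here is a sketch in C that decodes one UTF-8 sequence into a code
point.  It assumes UTF-8 input only, and utf8_decode_one is a hypothetical
name, not a libguile function.  Because a test can pass the bytes in
explicitly, it does not depend on how the test source file itself was read.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of libguile: decode one UTF-8 sequence
   from S, of which N bytes are available.  Store the code point in *CP
   and return the number of bytes consumed, or 0 if the input is
   malformed or truncated.  */
size_t
utf8_decode_one (const unsigned char *s, size_t n, uint32_t *cp)
{
  size_t need, i;
  uint32_t c, min;

  if (n == 0)
    return 0;

  if (s[0] < 0x80)
    {
      /* Plain ASCII: one byte, used as is.  */
      *cp = s[0];
      return 1;
    }
  else if ((s[0] & 0xE0) == 0xC0)
    { need = 2; c = s[0] & 0x1F; min = 0x80; }
  else if ((s[0] & 0xF0) == 0xE0)
    { need = 3; c = s[0] & 0x0F; min = 0x800; }
  else if ((s[0] & 0xF8) == 0xF0)
    { need = 4; c = s[0] & 0x07; min = 0x10000; }
  else
    return 0;   /* Stray continuation byte or invalid lead byte.  */

  if (n < need)
    return 0;   /* Sequence is cut off.  */

  for (i = 1; i < need; i++)
    {
      /* Each continuation byte must look like 10xxxxxx.  */
      if ((s[i] & 0xC0) != 0x80)
        return 0;
      c = (c << 6) | (s[i] & 0x3F);
    }

  /* Reject overlong forms, surrogates and values past U+10FFFF.  */
  if (c < min || c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF))
    return 0;

  *cp = c;
  return need;
}
```

A port's read path could call this in a loop over its byte buffer and hand
the resulting code points to the reader, once wide chars exist to hold them.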

Thanks,

Mike Gran

Current Thread

Unicode, ports and encoding, Mike Gran, 2009/02/16
- Re: Unicode, ports and encoding, Ludovic Courtès, 2009/02/17
  - Re: Unicode, ports and encoding, Mike Gran <=
    - Re: Unicode, ports and encoding, Ludovic Courtès, 2009/02/18
