Re: Unicode, ports and encoding
From: Mike Gran
Subject: Re: Unicode, ports and encoding
Date: Tue, 17 Feb 2009 15:45:32 -0800 (PST)

> From: Ludovic Courtès <address@hidden>
>> Mike Gran writes:
> > This implies that a source code file should have syntax to
> > indicate its own encoding, if it is not ASCII. Something akin to
> > the line in HTML files.
>
> One could imagine special treatment of, say, the first 10 lines of a
> file, with the ability to recognize Emacs file variables like
> "-*- coding: utf-8 -*-" and to change the current port transcoder
> accordingly, something like that.

Yeah. Something like that.

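As a rough illustration of the idea quoted above, here is a C sketch that
scans at most the first 10 lines of a file for an Emacs-style
"-*- ... coding: NAME ... -*-" line and extracts NAME. The names
parse_coding_cookie, detect_source_encoding and CODING_SCAN_LINES are
hypothetical and are not part of Guile. The sketch only finds the encoding
name; it does not change any port transcoder.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Assumption taken from the suggestion above: only the first 10 lines count. */
#define CODING_SCAN_LINES 10

/* Hypothetical helper: look for "-*- ... coding: NAME ..." in one line.
   On success, copy NAME into out (NUL-terminated) and return 1.
   Return 0 if the line carries no usable coding declaration. */
int parse_coding_cookie(const char *line, char *out, size_t out_size)
{
    assert(line != NULL && out != NULL && out_size > 0);

    /* Require the opening "-*-" marker before looking for "coding:". */
    const char *p = strstr(line, "-*-");
    if (p == NULL)
        return 0;
    p = strstr(p + 3, "coding:");
    if (p == NULL)
        return 0;
    p += strlen("coding:");

    /* Skip blanks between "coding:" and the encoding name. */
    while (*p == ' ' || *p == '\t')
        p++;

    /* Copy name characters, stopping at the closing "-*-" marker,
       at any other character, or when out is full. */
    size_t n = 0;
    while (n + 1 < out_size && strncmp(p, "-*-", 3) != 0 &&
           (isalnum((unsigned char)*p) || *p == '-' || *p == '_' || *p == '.')) {
        out[n++] = *p++;
    }
    out[n] = '\0';
    return n > 0;
}

/* Hypothetical helper: scan the first CODING_SCAN_LINES lines of fp.
   Returns 1 and fills out if a declaration is found, else 0.
   A declaration split across a 511-byte chunk boundary is missed. */
int detect_source_encoding(FILE *fp, char *out, size_t out_size)
{
    char line[512];
    int lines_seen = 0;

    while (lines_seen < CODING_SCAN_LINES &&
           fgets(line, sizeof line, fp) != NULL) {
        if (parse_coding_cookie(line, out, out_size))
            return 1;
        /* Count a line only once its terminating newline has been read. */
        if (strchr(line, '\n') != NULL)
            lines_seen++;
    }
    return 0;
}
```

A reader could call detect_source_encoding on a freshly opened source file,
rewind it, and then pick a converter by name; a file with no declaration
would keep whatever default the reader already uses.
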
> IIRC, the first step you suggested was the implementation of wide
> string/char types. Did you also work on this?

Sort of.

I thought I could start there, but it isn't easy. A lot could be broken
by modifying string processing, so I tried writing some tests first so I
can check my work as I go along. But the tests have to be non-ASCII, so
they need to be converted when they are read in.

It gets a little bit circular to use scm_from_locale_string to convert
non-ASCII strings in the test source and then have the test check the
behavior of scm_from_locale_string.

So now I think a better route is to make some type of simplified
transcoding system available to ports, so that non-ASCII tests are read
in correctly. From there, one can work up toward wide strings and chars
while checking work along the way.

Thanks,
Mike Gran