[Gzz] PEG: utf8
Wed, 14 May 2003 11:12:21 +0300
PEG src_utf8--tjl: UTF8 as our global encoding in all sources
:Author: Tuomas J. Lukka
:Last-Modified: $Date: 2003/05/14 08:08:11 $
:Revision: $Revision: 1.1 $
We are having lots of issues with docutils because we use
Latin-1 encoding in our files to write e.g. "Jyväskylä".
We're forward-looking in most of the other stuff we do.
I suggest that we do the same in this matter: We should
do the right thing and never look back.
I propose that we agree that the days of Latin-1 are
past and move everything we do to UTF-8.
- Will this be a problem with email? E.g. posting PEGs...
RESOLVED: Maybe, but not an important one. It's
easy enough to read the few garbled symbols if there are any,
and the important thing is that in CVS, things will work.
This OTOH gives some incentive to start thinking about UTF-8.
- Isn't UTF8 difficult to edit?
RESOLVED: No, not any more. Both emacs and vim support it.
It's steadily gaining ground.
- Can we use UTF-8 with TeX? If not, what do we do?
RESOLVED: Doesn't seem to be possible, but we can use
the TeX escapes::
\"a, \"o ...
to handle this without breaking the high-bit rule.
Besides, our use of TeX directly is on the way out.
- Are you serious about using smiley faces or other special
unicode characters in identifiers?
RESOLVED: Yes, occasionally, if they can help. However,
much care is needed; never choose a character that looks
like some other one. For instance, U+2133 (SCRIPT CAPITAL M)
is useless here.
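As an illustration of the TeX escape workaround mentioned in the issues above, here is a minimal sketch that rewrites the four Finnish umlaut letters as TeX escapes so the TeX source stays pure ASCII. The names ``TEX_ESCAPES`` and ``to_tex_escapes`` are hypothetical, and the mapping covers only the letters this PEG mentions, not TeX accents in general.

```python
# Hypothetical helper: map the umlaut letters named in this PEG to TeX escapes.
TEX_ESCAPES = {
    "ä": r"\"a",
    "ö": r"\"o",
    "Ä": r"\"A",
    "Ö": r"\"O",
}


def to_tex_escapes(text: str) -> str:
    """Replace ä, ö, Ä, Ö with their TeX escapes; leave everything else alone."""
    return "".join(TEX_ESCAPES.get(ch, ch) for ch in text)
```

For example, ``to_tex_escapes("Jyväskylä")`` yields ``Jyv\"askyl\"a``, which contains no high-bit characters.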
In all ff subprojects, convert all files containing high-bit
characters (e.g. ä,ö) to UTF-8 encoding. (Including PEGs
like this one)
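The conversion step above could be sketched roughly as follows. This is not an agreed-upon tool, only a sketch under stated assumptions: the function names are hypothetical, and "already UTF-8" is decided by the heuristic that the bytes decode cleanly as UTF-8 (a Latin-1 file could in rare cases pass that test by accident).

```python
from pathlib import Path


def latin1_to_utf8(data: bytes) -> bytes:
    """Re-encode Latin-1 bytes as UTF-8.

    Assumption: data that already decodes as UTF-8 (pure ASCII included)
    is taken to be converted already and is returned unchanged.
    """
    try:
        data.decode("utf-8")
        return data
    except UnicodeDecodeError:
        # Every byte sequence is valid Latin-1, so this decode cannot fail.
        return data.decode("latin-1").encode("utf-8")


def convert_file(path: Path) -> bool:
    """Convert one file in place; return True if it was rewritten."""
    original = path.read_bytes()
    converted = latin1_to_utf8(original)
    if converted != original:
        path.write_bytes(converted)
        return True
    return False
```

Because files that are already valid UTF-8 are left alone, running the conversion twice over the same tree should be harmless.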
Explain this in README, along with instructions for the most
popular editors on what to do.
Start using smiley faces as characters in Java identifiers ;)
Create a grep script which sniffs out Latin-1 ä, ö, Ä, Ö from new files.
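A minimal sketch of such a sniffing script, written in Python rather than grep so that it can tell Latin-1 bytes apart from valid UTF-8. The names ``LATIN1_SUSPECTS`` and ``find_latin1_lines`` are hypothetical. The assumption is that a line is suspicious only if it fails to decode as UTF-8 and contains one of the Latin-1 byte values for ä, ö, Ä, Ö.

```python
# Latin-1 byte values of the four letters this PEG wants to catch.
LATIN1_SUSPECTS = {0xE4: "ä", 0xF6: "ö", 0xC4: "Ä", 0xD6: "Ö"}


def find_latin1_lines(data: bytes):
    """Return (line number, letters found) for lines that look like Latin-1.

    Lines that decode cleanly as UTF-8 are skipped, so correctly
    converted files produce no hits.
    """
    hits = []
    for lineno, line in enumerate(data.splitlines(), start=1):
        try:
            line.decode("utf-8")
            continue  # valid UTF-8 (or plain ASCII): nothing to report
        except UnicodeDecodeError:
            pass
        found = sorted({LATIN1_SUSPECTS[b] for b in line if b in LATIN1_SUSPECTS})
        if found:
            hits.append((lineno, found))
    return hits
```

It could be run over the bytes of each newly added file, for instance ``find_latin1_lines(open(name, "rb").read())``, with any non-empty result treated as a reason to reject the file.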