Re: [Monotone-devel] second go at i18n spec
From: graydon hoare
Subject: Re: [Monotone-devel] second go at i18n spec
Date: Tue, 09 Dec 2003 11:08:24 -0500
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.5) Gecko/20031115 Thunderbird/0.3
Christof Petig wrote:
Original idea:
----->8-------
If it would be not too much hassle the possibility to specify a
different encoding for filenames than for file content would save me
much trouble. [e.g. my toolkit uses UTF-8 for strings and my shell still
likes ISO or vice versa.*)] But then the encoding for a Makefile might
differ from a program source file => encoding per file? *2)
----->8-------
oh, sorry, I guess I wasn't clear enough: I meant to support exactly
what you suggest. under the proposal, filenames are subject to
conversion to a normal form, but the normal form only applies to data
*inside* monotone (when calculating SHA1 values, xdeltas, etc). file
names in the working copy will be written to the file system using the
"system encoding", and file data may be subject to *any* conversion in
and out of the database, not related to the "system encoding".
the only reason for normalization on filenames is that monotone reads
and interprets the manifest file, so it must know what the character set
is. so I convert all filename character codes to UTF-8 before letting
monotone read them.
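as a rough illustration of that normalization step, here is a minimal sketch for the single case where the file system uses ISO-8859-1. the name latin1_to_utf8 is hypothetical and not part of monotone; a real implementation would presumably call a general charset conversion library rather than hand-rolling one encoding.

```lua
-- hypothetical helper (not a monotone function): re-encode an
-- ISO-8859-1 filename as UTF-8. ASCII bytes pass through unchanged;
-- every byte in the range 0x80-0xFF becomes a two-byte UTF-8 sequence.
function latin1_to_utf8(name)
  return (string.gsub(name, "[\128-\255]", function(c)
    local b = string.byte(c)
    -- high 2 bits go in the lead byte, low 6 bits in the continuation byte
    return string.char(0xC0 + math.floor(b / 64), 0x80 + b % 64)
  end))
end

-- usage: "caf\233" is "café" in ISO-8859-1
print(latin1_to_utf8("caf\233.txt"))
```

this direction is lossless for ISO-8859-1, since every byte maps to a code point with the same number; other system encodings need a real conversion table.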
so, to elaborate: there are 2 hooks you write (with analogous hooks for
line-ending conversion).
-- this is used to map normalized (internal -- UTF-8) filenames to and
-- from your filesystem. the UTF-8 side of the conversion is fixed, by
-- monotone, but this only applies to path names.
function system_charset()
   return "ISO-8859-1"
end
-- this is a per-file transformation, probably left blank or
-- returning nil (meaning "leave the file contents alone") but
-- possibly used to massage character codes to and from your
-- preferred forms for editing and for storage
function charconv(filename)
   if (string.find(filename, "%.java$")) then
      -- store java files as UTF-8 in monotone, check out as ISO
      return {"UTF-8", "ISO-8859-1"}
   end
   -- both dots are escaped so they match a literal "."
   if (string.find(filename, "%.cbl%.jp$")) then
      -- keep japanese cobol stuff in EBCDIC, check out as UCS-2
      return {"EBCDIC-JP-KANA", "UCS-2"}
   end
   -- otherwise leave the file alone
   return nil
end
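
to make the meaning of the returned pair concrete, here is a small sketch of how a caller might interpret it. conversion_plan and sample_charconv are hypothetical names made up for illustration; the proposal only specifies the hook, not how monotone consumes it internally. the sketch assumes the first element is the stored (database) encoding and the second is the working-copy encoding, as the comments in the hook above suggest.

```lua
-- hypothetical helper (not a monotone function): given a charconv-style
-- hook and a filename, work out the conversion for each direction.
-- returns nil when the hook asks for the file to be left alone.
function conversion_plan(hook, filename)
  local pair = hook(filename)
  if pair == nil then
    return nil
  end
  local stored, working = pair[1], pair[2]
  return {
    checkin  = { from = working, to = stored },  -- working copy -> database
    checkout = { from = stored, to = working },  -- database -> working copy
  }
end

-- stand-in hook with the same shape as the charconv example
function sample_charconv(filename)
  if string.find(filename, "%.java$") then
    return {"UTF-8", "ISO-8859-1"}
  end
  return nil
end

local plan = conversion_plan(sample_charconv, "Foo.java")
print(plan.checkout.from, plan.checkout.to)
```

a file with no matching rule (say "Makefile") yields nil here, which corresponds to "leave the file contents alone".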