[Monotone-devel] Re: problems with i18n testsuite


From: Robert Bihlmeyer
Subject: [Monotone-devel] Re: problems with i18n testsuite
Date: Wed, 21 Apr 2004 21:20:25 +0200
User-agent: Gnus/5.1006 (Gnus v5.10.6) XEmacs/21.4 (Security Through Obscurity, linux)

graydon hoare <address@hidden> writes:

> Robert Bihlmeyer wrote:
>
>> I think the best solution is to assume UTF-8, and use LC_CTYPE's
>> charset in case the filename is not valid UTF-8.
>
[...]
>   - if I commit on an EUC-KR machine, the filename is not valid UTF-8;
>     but the filename is representable in UTF-8 if I do a conversion.

If your LC_CTYPE is something like ko_KR.EUC-KR, my algorithm will work
in this case. If your LC_CTYPE is C or something else entirely, no
automatic guessing will help.
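
A minimal sketch of the fallback described above, in Python for
illustration only. The names decode_filename and locale_charset are
hypothetical and are not part of monotone; the sketch assumes a
Unix-like system where the charset can be read from LC_CTYPE.

```python
import locale


def locale_charset() -> str:
    """Return the charset named by the user's LC_CTYPE (Unix only)."""
    # Adopt the environment's LC_CTYPE, then ask for its codeset.
    locale.setlocale(locale.LC_CTYPE, "")
    return locale.nl_langinfo(locale.CODESET)


def decode_filename(raw: bytes, fallback_charset: str) -> str:
    """Assume UTF-8; use the fallback charset only if that fails."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Not valid UTF-8, so try the charset taken from LC_CTYPE.
        # This raises again if the bytes do not fit that charset
        # either, e.g. when LC_CTYPE is C and the name is not ASCII.
        return raw.decode(fallback_charset)
```

A caller would pass locale_charset() as the second argument. With an
EUC-KR locale, both a UTF-8 name and an EUC-KR name decode to the same
string; with the C locale, the EUC-KR name is rejected instead of being
guessed at.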

>   - if I checkout from monotone (UTF-8) to a EUC-KR machine, the
>     UTF-8 filename is not valid EUC-KR, but it is representable in
>     EUC-KR if I do a conversion.

I wasn't thinking of checkout yet. I have a weak preference for
defaulting to UTF-8 filenames.

[snip]

Basically, I want to make the point: the LC_CTYPE of your shell need
not match the charset of all your filenames, or the charset of all
your files' contents. And there is no other way to infer a "local
charset".

I'm still unclear on what you do with file content. Do you convert
from whatever you assume as the local charset to UTF-8 for storage and
hash computation? Wouldn't that fail horribly for non-text content?
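
To illustrate the concern, here is a small Python sketch of what a
blind charset conversion does to binary data. The function
naive_to_utf8 is hypothetical and only stands in for "convert from the
assumed local charset to UTF-8"; it is not a description of what
monotone actually does.

```python
import hashlib


def naive_to_utf8(data: bytes, assumed_charset: str) -> bytes:
    """Treat data as text in assumed_charset and re-encode as UTF-8."""
    return data.decode(assumed_charset).encode("utf-8")


# The first eight bytes of any PNG file: binary data, not text.
png_magic = bytes([0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A])

# Latin-1 accepts every byte value, so the conversion "succeeds",
# but the byte 0x89 becomes the two-byte UTF-8 sequence 0xC2 0x89.
converted = naive_to_utf8(png_magic, "latin-1")

original_hash = hashlib.sha1(png_magic).hexdigest()
converted_hash = hashlib.sha1(converted).hexdigest()
```

Here converted is one byte longer than png_magic and the two hashes
differ, so the stored content is no longer the file that was committed.
With a stricter assumed charset the decode step may fail outright
instead of silently altering the data.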

I'd really like version control systems to get out of the text
conversion business. Either your editor handles that, or you hang
appropriate tools on pre-checkin and post-checkout hooks.

-- 
Robbe
