gnu-arch-users

Re: [Gnu-arch-users] Patch Logs vs. character sets


From: Tom Lord
Subject: Re: [Gnu-arch-users] Patch Logs vs. character sets
Date: Tue, 25 May 2004 15:13:37 -0700 (PDT)

    > From: Aaron Bentley <address@hidden>

    > Tom Lord wrote:
    > > Aaron mentioned his belief that patch logs will eventually be UTF-8.

    > > I don't think so -- I think that would be a mistake.

    > It'd be nice to hear more about why you think that would be a
    > mistake. 

I try not to imagine the "arch user community" as one big
interconnected group of people.

Instead, I think there can be communities that are isolated from the
rest of us, but want to use arch.  A good example of this might be a
company using arch for development of some not-publicly-distributed
software.

Let's suppose we have such an isolated group and, for whatever reason,
they all use computing systems whose character set is iso-8859-<N>,
for some <N> greater than 1.

I think it's completely reasonable to require that their on-disk
patch-log format is iso-8859-<N>.   They can `cat' a log file and get
a reasonable result.

At the same time, let's suppose that something changes and they join
the rest of us.   Now they'll be working with people for whom the
default character set is _not_ iso-8859-<N>.

The solution I proposed for patch logs is aimed at handling that kind
of situation, and many similar situations, gracefully.
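To make the scenario concrete, here is a minimal sketch in Python. The
log message is made up for illustration and iso-8859-5 stands in for
iso-8859-<N>; nothing here is tla's actual code. It shows the same
on-disk bytes reading correctly for the isolated group and turning into
nonsense for a reader who assumes iso-8859-1.

```python
# Hypothetical log text; not taken from any real arch archive.
log_text = "Исправлена ошибка"

# What the isolated group stores on disk: one byte per character.
on_disk = log_text.encode("iso-8859-5")

# A reader with the same default charset recovers the text exactly.
same_charset = on_disk.decode("iso-8859-5")

# A reader who assumes iso-8859-1 gets no error at all, just the
# wrong characters, because every byte is valid in both charsets.
other_charset = on_disk.decode("iso-8859-1")

print(same_charset)
print(other_charset)
```

The bytes alone do not say which charset they are in, so the second
reader has no way to notice the mistake without some extra information
about the log's encoding.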


    > I'd rather have One True Encoding for patchlogs so I don't need to 
    > support Windows-1252 to see my win32 colleague's curly quotes.

Too bad.   Welcome to string processing in the 21st century.  Get used
to it.

Let's please try to minimize the threads where we try to redesign
Unicode and its subsets and extensions.  Those threads have been
played out gazillions of times.

The sad truth is that, nowadays, strings are a lot harder than they were
30 years ago and, worse, there are all kinds of horrible and unsolvable
transcoding edge-cases.

Oh well, at least it gives us something to do.
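One such edge case, sketched in Python using the curly quotes from the
message quoted above (the sample bytes are invented for illustration):
Windows-1252 has curly quotes at 0x93 and 0x94, but iso-8859-1 has no
curly quotes at all, so transcoding either fails or loses information.

```python
# Hypothetical log fragment as a Windows-1252 user might write it.
cp1252_bytes = b"\x93fixed\x94"

# Decoded with the right charset, the quotes are U+201C and U+201D.
text = cp1252_bytes.decode("cp1252")

# Strict transcoding to iso-8859-1 cannot succeed: the target charset
# has no code points for these characters.
try:
    text.encode("iso-8859-1")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# Lossy transcoding succeeds but replaces each quote with "?".
lossy = text.encode("iso-8859-1", errors="replace")

# Misreading the raw bytes as iso-8859-1 yields C1 control characters
# instead of quotes, again without raising any error.
misread = cp1252_bytes.decode("iso-8859-1")

print(strict_ok, lossy, repr(misread))
```

None of the three outcomes preserves what the author typed, which is
the sense in which the problem has no clean solution.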


    > Your later comments suggest that tla should support Unicode transcoding 
    > directly, so why not transcode the patchlog to utf-8 as part of the 
    > cooking process at commit time?

So that the on-disk format is one that makes the archive owner happy.

-t




