
Re: [bug-gnulib] quote characters in stds


From: Paul Eggert
Subject: Re: [bug-gnulib] quote characters in stds
Date: Wed, 08 Jun 2005 13:55:30 -0700
User-agent: Gnus/5.1006 (Gnus v5.10.6) Emacs/21.4 (gnu/linux)

Bruno Haible <address@hidden> writes:

> Five years ago, people made up lists of programs that _do_ work with UTF-8
> encoded text files. Today, these programs are uncountable. Instead, people
> make up lists of programs that _don't_ work with Unicode:
>   http://www.freedesktop.org/wiki/Software_2fBadSoftware

Well, to be fair, lots of programs work OK with UTF-8 in common cases,
but mess up when they're thrown something "hard".  Certainly the list
you referred to is woefully incomplete.

I read your email containing accented letters with GNU Emacs 21.4 and
Gnus 5.10.6, a combination that supports UTF-8.  But I didn't see the
accented letters correctly on my screen: I saw "?" instead.  This is
because I was logged in via an ssh xterm window from Debian woody,
whose xterm doesn't support UTF-8.  Now that Debian sarge is stable I
will look into switching to a better xterm, but this will take some of
my time (I'm not looking forward to upgrading all my machines to
sarge....) and even then I'm not sure things will work (the last time
I tried uxterm it flaked out on me too often for my comfort).

It's unlikely we'll change RMS's opinion on UTF-8 right now, but I
think we could tone down the language a bit without too much trouble.
How about if we change "deployed even less widely than Latin1" (which
is true some places but not others, at least in my experience) to
"still not universally well-supported" (which is the point, after
all)?  E.g., change this:

  Unicode contains the unambiguous quote characters required, and its
  common encoding UTF-8 is upward compatible with ASCII.  But Unicode
  and UTF-8 are deployed even less widely than Latin1; it would be
  premature to require Unicode support for running essentially every GNU
  program.

to this:

  Unicode contains the unambiguous quote characters required, and its
  common encoding UTF-8 is upward compatible with ASCII.  But Unicode
  and UTF-8 are still not universally well-supported; it would be
  premature to require Unicode support for running essentially every GNU
  program.


One other comment.  These two paragraphs seem out of place:

  ASCII should also be preferred in source code comments, text
  documents, and other contexts, unless there is good reason to do
  something else because of the domain at hand.

  If you need to use non-ASCII characters, for example to represent
  names of contributors, you should normally stick with one encoding, as
  one cannot in general mix encodings reliably.

How about if we create a new section "Non-ASCII characters" and put it
before this new "Quote characters" section?  That might make the
organization clearer.
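
To illustrate the quoted point that one cannot in general mix encodings
reliably, here is a minimal Python sketch.  The contributor name and the
helper try_decode are made up for the illustration; nothing here comes
from the standards text itself.

```python
# Hypothetical contributor name, used only for illustration.
name = "José"

latin1_bytes = name.encode("latin-1")  # the e-acute becomes one byte, 0xE9
utf8_bytes = name.encode("utf-8")      # the e-acute becomes two bytes, 0xC3 0xA9

# A "file" whose two lines were written in two different encodings.
mixed = latin1_bytes + b"\n" + utf8_bytes + b"\n"


def try_decode(data: bytes, encoding: str) -> str:
    """Decode data, returning a marker string instead of raising on failure."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return "<undecodable>"


# Read as UTF-8, the Latin-1 line is an invalid byte sequence.
as_utf8 = try_decode(mixed, "utf-8")

# Read as Latin-1, nothing fails, but the UTF-8 line turns into mojibake.
as_latin1 = try_decode(mixed, "latin-1")

# Pure ASCII text, by contrast, decodes identically under either encoding.
ascii_only = "plain ASCII".encode("ascii")
```

Neither reading recovers both lines, which is why sticking with one
encoding per file is the safer default.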



