Re: [Nmh-workers] scan or show of UTF-encoded headers?

From: Valdis . Kletnieks
Subject: Re: [Nmh-workers] scan or show of UTF-encoded headers?
Date: Mon, 14 Feb 2005 15:34:34 -0500

On Mon, 14 Feb 2005 19:35:36 +0100, Harald Geyer said:

> Obviously any script which tries to do the above runs into the same
> problem that prevents nmh from doing it itself: The script would need
> to know which charsets the terminal can handle and how to tell it.
> Also changing the terminal might confuse other programs.
> I guess it would be much easier and less prone to error to just
> implement transcoding of messages through iconv instead of trying
> to adapt the display on a per message basis.

In general, you *can't* do a good job of using iconv to mash things between
the various iso8859-* charsets.  There *will* be lossage - after all, there
is a *reason* they're up to -15, namely that one isn't sufficient.  So whichever
one you're in, there *will* be lossage for the other 14.
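
To make the lossage concrete, here is a minimal sketch in Python, whose
built-in codecs stand in for iconv. The helper name transcode_8859 is
hypothetical and is not part of nmh; it only illustrates what happens when
text in one 8859-* charset is pushed into another.

```python
def transcode_8859(raw: bytes, src: str, dst: str) -> bytes:
    """Re-encode raw from charset src into charset dst.

    Characters that dst cannot represent are replaced with '?',
    which is exactly the lossage described above.
    """
    text = raw.decode(src)
    return text.encode(dst, errors="replace")


# A six-letter Russian word, written with escapes so this file stays ASCII.
word = "\u041f\u0440\u0438\u0432\u0435\u0442"
cyrillic = word.encode("iso8859-5")

# Every Cyrillic letter is missing from iso8859-1, so all six are lost.
mashed = transcode_8859(cyrillic, "iso8859-5", "iso8859-1")
print(mashed)  # b'??????'
```

With the default strict error handling the same conversion raises
UnicodeEncodeError instead of silently substituting, so either way the
original text cannot be recovered from the result.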

On the flip side, it's possible to do lossless conversion *from* any 8859-*
into the UTF-8 space.  So a better solution is to teach the code that
currently handles MM_CHARSET to use iconv to convert 8859-* to UTF-8
whenever the user is in a UTF-8 environment.
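
The lossless direction can be checked with a short sketch, again in Python
with its built-in codecs standing in for iconv. The names to_utf8 and
round_trips are hypothetical helpers, not nmh functions, and the check
assumes that bytes a charset leaves unassigned can simply be skipped.

```python
def to_utf8(raw: bytes, charset: str) -> bytes:
    """Convert raw bytes in the given 8859-* charset to UTF-8."""
    return raw.decode(charset).encode("utf-8")


def round_trips(charset: str) -> bool:
    """Return True if every assigned byte survives charset -> UTF-8 -> charset."""
    for value in range(256):
        raw = bytes([value])
        try:
            utf8 = to_utf8(raw, charset)
        except UnicodeDecodeError:
            # Some 8859-* charsets leave a few byte values unassigned.
            continue
        if utf8.decode("utf-8").encode(charset) != raw:
            return False
    return True


for n in (1, 2, 5, 8, 15):
    name = "iso8859-%d" % n
    print(name, round_trips(name))
```

Nothing is lost on the way into UTF-8, so the original bytes can always be
recovered, which is what makes conversion in this direction safe.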

And yes, it's possible that the user is in a UTF-8 environment, but doesn't
have actual font glyphs for all the planes (so, for instance, Hebrew or
Cyrillic characters don't display).  This is actually a non-issue, for 2
reasons:

1) If they don't have the Hebrew glyphs installed, there's nothing you could
have done anyhow.

2) On the other hand, it's fairly safe to assume that if they're in a UTF-8
locale, their software has at least enough smarts to put up an "unknown
character" box at that position.

> I remember the gnus people using big sets of tables to do a mixture
> of transcoding and unifying between character sets which led to
> messages being split into several parts of different character sets,
> when it didn't work correctly. I don't know what had been their reason
> to not use iconv.

At least in the MULE-ized versions of Emacs and XEmacs, the basic reason for
the big sets of tables is because they're using their own internal encoding
instead of UTF-mumble (which is also why they couldn't use iconv).  As a
result, the big tables are visible to you.  If it used iconv instead, the big
tables would still be there - just hidden off in /usr/lib/iconv where you
don't usually see them.
