Re: [Nmh-workers] base64 ... just looking for advice

From: Ken Hornstein
Subject: Re: [Nmh-workers] base64 ... just looking for advice
Date: Wed, 27 Jan 2016 10:52:51 -0500

Man, nmh is quiet for a while, so I think I can go on vacation ... and
look what happens!  Also, I get stuck at Disney World in a historic
snowstorm so I was delayed even longer.

I see you already got the answer to your question, and David Levine
already covered some of the finer points of mhfixmsg.  But, let me point
out some larger meta-issues.

>So the thing I love absolutely most about MH/NMH is that my email is
>just files on my disk that I can grep through.

I can understand why you say that, and while it IS true that a) nmh stores
each message in a separate file, and b) it is possible to use grep to search
those files, the conclusion you're making, c) grep will return useful search
results, is NOT necessarily true.
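
To make that concrete, here is a minimal Python sketch (not part of nmh; the
message, address and search word are made up for illustration). It builds a
small base64-encoded message and checks whether a byte-level search, which is
effectively what grep does, can find a word that is plainly in the body:

```python
import base64

# Hypothetical body text, invented for this illustration.
body_text = "Meeting moved to the cafe at noon\n"
encoded_body = base64.encodebytes(body_text.encode("utf-8"))

# A complete message as it might sit in an MH folder (Unix line endings).
raw_message = (
    b"From: someone@example.com\n"
    b"Subject: test\n"
    b"MIME-Version: 1.0\n"
    b"Content-Type: text/plain; charset=utf-8\n"
    b"Content-Transfer-Encoding: base64\n"
    b"\n" + encoded_body
)

# Byte-level search over the stored file: the word is not there.
found_raw = b"Meeting" in raw_message

# The same search after undoing the transfer encoding: the word is there.
decoded_body = base64.decodebytes(encoded_body).decode("utf-8")
found_decoded = "Meeting" in decoded_body

print(found_raw, found_decoded)
```

The file on disk only contains the base64 alphabet where the body should be,
so the search word never appears in it even though it is in the message.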

The problem is that you're thinking that an 'email file' consists of
ASCII, or just plain text in a single character set.  That is not true,
and hasn't really been true for a few decades.  As Wolfgang Denk points
out, when talking about the output of mhfixmsg:

>Agreed - but this leaves us with a problem; as we now have a single
>file with different parts in different character sets.

The thing is, _that's a completely valid email_.  It's not a problem!
It's only a problem if you persist in thinking that 'email' consists
of plain text in a single character set.  I know, it was that way for
a while and probably most of the email you get is still like that.
But here we have 30-year-old expectations colliding with modern reality.

>However, an awful lot of my email is coming to me base-64-encoded,
>for no particularly obvious reason... probably an accented character
>or (in one case) line-drawing characters.

When I ran into this, the answer was that basically some MUAs always
base64-encoded everything (the one that ran on Blackberry was one
example I encountered).  Unfortunately, pretty much every MUA can handle
this just fine, so nobody has a reason to change it and we have to deal
with it.

So mhfixmsg is kind of a Band-Aid on a larger problem.  It's not that I
have objections to Band-Aids (see replyfilter), but it's important to
understand the limitations here.

Here are the basic constraints:

1) We assume 'email files' are RFC 5322-format messages (well, okay, with
   the exception that they've been converted to Unix line ending format,
   a minor issue that we can mostly ignore).
2) RFC 5322-format messages contain bytes that can be encoded different
   ways, and can be in different character sets (or not even be text
   at all).  So in the general case it isn't valid to process them with
   regular Unix text processing tools (because RFC 5322 != text).
3) We can convert the RFC 5322 messages to something that's more friendly
   to use with regular text processing tools (that's what mhfixmsg does).
   But we can't convert them to 'text' completely, because some parts of
   RFC 5322 cannot be represented in an unencoded form (well, I suppose
   we could turn them into message/global messages, but that assumes
   that you want everything in UTF-8 and we don't actually handle those
   messages very well).
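
As a rough illustration of what a "smarter" search has to do with constraints
1) and 2), here is a minimal sketch using Python's standard email package. It
is not how nmh or mhfixmsg work internally; the function name search_message
and the sample message are assumptions made up for this example. The idea is
to parse the MIME structure, decode each text part according to its own
Content-Transfer-Encoding and charset, and only then search:

```python
import base64
import email
from email import policy


def search_message(raw_bytes, needle):
    """Return True if needle occurs in any decoded text part (hypothetical helper)."""
    msg = email.message_from_bytes(raw_bytes, policy=policy.default)
    for part in msg.walk():
        # Skip multipart containers and non-text parts such as images.
        if part.get_content_maintype() != "text":
            continue
        try:
            # Undoes base64/quoted-printable and decodes using the part's charset.
            text = part.get_content()
        except (LookupError, UnicodeDecodeError):
            # Unknown or broken charset: this sketch just skips the part.
            continue
        if needle in text:
            return True
    return False


# Made-up message with two text parts in different charsets and encodings.
utf8_part = base64.b64encode("caf\u00e9 at noon\n".encode("utf-8"))
sample = (
    b"From: someone@example.com\n"
    b"MIME-Version: 1.0\n"
    b'Content-Type: multipart/mixed; boundary="XX"\n'
    b"\n"
    b"--XX\n"
    b"Content-Type: text/plain; charset=iso-8859-1\n"
    b"Content-Transfer-Encoding: quoted-printable\n"
    b"\n"
    b"Gr=FC=DFe aus M=FCnchen\n"
    b"--XX\n"
    b"Content-Type: text/plain; charset=utf-8\n"
    b"Content-Transfer-Encoding: base64\n"
    b"\n" + utf8_part + b"\n"
    b"--XX--\n"
)

print(search_message(sample, "M\u00fcnchen"))  # found in the quoted-printable part
print(search_message(sample, "noon"))          # found in the base64 part
print(b"noon" in sample)                       # a raw byte search misses it
```

Note that the two parts use different character sets in one file, which is
exactly the "completely valid email" case discussed above. The search only
works because each part is decoded on its own terms before matching.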

So the short answer is yes, use mhfixmsg, but I think the only long-term
solution is to make nmh tools smarter.

