Re: [Pan-users] Character Encoding option for Header Pane


From: Duncan
Subject: Re: [Pan-users] Character Encoding option for Header Pane
Date: Sun, 12 Jun 2011 04:51:55 +0000 (UTC)
User-agent: Pan/0.134 (Wait for Me; GIT 717b0ac branch-testing)

kwsk posted on Sat, 11 Jun 2011 11:17:14 +0900 as excerpted:

> There were some articles which posted 2bytes Character Subject but, I
> did not find encoding option for Header pane.
> 
> any plan to add encoding option for header pane in the future?

My understanding is that the RFCs specify 7-bit ASCII for all overview-
mandatory headers and possibly for all headers, period.

Historically that applied to the entire message, body included.  The 
introduction of MIME headers changed that for the body, as the MIME 
headers (the content-type header's charset=value setting, in particular) 
allow specifying the encoding of the body.

But that doesn't work for the headers, or at least overview-mandatory 
headers such as subject and from, because the MIME headers aren't 
themselves overview-mandatory and thus are as a matter of practice and 
spec not necessarily available at the time the message list (based on 
data from the overviews) is customarily displayed, before full message 
download.

The adopted workaround has been to directly ASCII encode the charset data 
as escape sequences in the affected headers themselves.  As such, if you 
look at the raw message (pan's save-as-text option or grab the message-id 
and check pan's cache), headers should be in ASCII, with escape sequences 
such as =?ISO-xxxx followed by specific character escapes as necessary.  
(I've never looked at the specific encoding details.)

Here's an example I dug out of my cache (whether it's actually valid or 
not I can't say, but pan seems to display it correctly...)  The posts in 
question appeared in March of 2010 on the kde-linux list, as fetched in 
pan from gmane's list2news service at news.gmane.org , so you can 
download the set of messages yourself, if you like.

From: =?ISO-8859-1?Q?Denis_A=2E_Alto=E9_Falqueto?= 
<address@hidden>
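
As far as I can tell, the =?charset?Q?...?= form in that From line is 
the "encoded-word" syntax from RFC 2047.  To check what such a header 
should display as, here is a minimal Python sketch using the standard 
library's email.header module.  It only illustrates the decoding step; 
how pan itself does it internally is not something this shows, and the 
variable names are mine.

```python
from email.header import decode_header, make_header

# The From header from the example above, joined onto one line.
raw = "=?ISO-8859-1?Q?Denis_A=2E_Alto=E9_Falqueto?= <address@hidden>"

# decode_header() splits the value into (bytes, charset) pairs:
# encoded-words carry their declared charset, plain ASCII runs carry None.
parts = decode_header(raw)

# make_header() reassembles the pairs into a Header object, and str()
# renders it as ordinary Unicode text.
decoded = str(make_header(parts))

print(parts)
print(decoded)
```

In the Q encoding, "_" stands for a space and "=XX" is a hex byte in the 
named charset, so =2E is "." and =E9 is "é" in ISO-8859-1, giving the 
name Denis A. Altoé Falqueto.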

For such in-line ASCII encoded charset info, as mentioned, pan already 
seems to work, altho I imagine you'd need to be using a font with the 
desired glyphs at the named character values.  So I probably wouldn't see 
the correct chars for, say, Chinese here, since the font I have pan set 
to display headers with is unlikely to support it.

If people are actually posting raw un-ascii-armored subject/author lines 
in other charsets, AFAIK, they're violating the RFCs.  As such, no 
receiving client can rationally be expected to parse or display these 
messages correctly, since correct parsing and display depends on the 
specification laid down in the relevant RFCs.  While it's certainly 
possible for some strange client or other to have implemented whatever 
wild scheme they may have thought up, if it's not RFC compliant, they 
really can't expect proper interoperability.

All that said, have you tried pan's per-group default character encoding 
option, accessible thru group preferences for each group?  I believe that 
applies to posting, but don't know whether it applies to assumed subject/
author charset or not, and as you didn't say you had tried it...
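
To make the idea of an "assumed" charset concrete, here is a hedged 
Python sketch of what a client could do with raw, un-armored header 
bytes: try ASCII, then UTF-8, then fall back to a configured per-group 
charset.  This is purely an illustration.  The function name 
decode_raw_header and the euc_jp default are my assumptions, and none of 
this is taken from pan's code.

```python
def decode_raw_header(raw: bytes, fallback: str = "euc_jp") -> str:
    """Guess the text of a header that was not RFC 2047 encoded.

    Hypothetical helper: `fallback` plays the role of a per-group
    default charset chosen by the user.
    """
    # Compliant headers are pure ASCII, so this normally succeeds.
    try:
        return raw.decode("ascii")
    except UnicodeDecodeError:
        pass
    # UTF-8 is strict enough that most other 8-bit encodings fail here.
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Last resort: trust the configured charset and substitute
        # U+FFFD for anything that still does not decode.
        return raw.decode(fallback, errors="replace")


# A subject posted as raw EUC-JP bytes, with no ASCII armoring.
raw_subject = "日本語".encode("euc_jp")
print(decode_raw_header(raw_subject, fallback="euc_jp"))
```

This is guesswork by design: with the wrong fallback the result is 
garbage, which is the interoperability problem described above.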

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



