Re: [Pan-users] html

From: Duncan
Subject: Re: [Pan-users] html
Date: Tue, 11 Sep 2012 00:21:40 +0000 (UTC)
User-agent: Pan/0.139 (Sexual Chocolate; GIT 4162e82 /usr/src/portage/src/egit-src/pan2)

Joe Zeff posted on Mon, 10 Sep 2012 13:46:08 -0700 as excerpted:

> On 09/10/2012 01:23 PM, Thufir wrote:
>> did you guys decide to go forward with a rudimentary html parser, and,
>> if so, what's the timeline, please?
> Judging by what I've read here in the past, coding will start on the
> Twelfth of Never.

<typical Duncan =:^)>That's correct, but not entirely complete.</typical>

Quoting a famous line, "It depends what the meaning of 'is' is."  Or in 
this case, what the meaning of "rudimentary html parser" is.

It's absolutely true that there's close to zero (I take it Thufir's an 
exception, the only one /I/ know of) interest on the list for display of 
HTML in the "web browser display" sense.  There's an EXTREMELY strong 
sense of "If you want to display an HTML formatted page, use a browser; 
if you want to post HTML, use the web and post a link if you want/need 
to."  Pan's a "pimp-ass newsreader", not a web browser, and implementing 
HTML/XML display both properly and securely takes an IMMENSE amount of 
resources.  If the user trusts the HTML/XML enough, they can always save 
it and open in a dedicated web browser, which hopefully has enough 
development resources to implement HTML/XML display both properly and 
securely, since UNLIKE a news client such as pan, that's what a web 
client DOES.

However, there HAS been discussion of the possibility of implementing a 
simple "dumb tag stripper" mode (which will actually need to be 
reasonably smart if it's not to mistake "meta-observation" tags such as 
those in my first paragraph for HTML/XML tags and strip them), much as 
claws-mail for example does to /surprisingly/ good effect.

The idea is to "simply" strip out any HTML/XML tags, leaving the plain 
text.  But as I said, it's not that simple, really.  In addition to "meta 
observation" tags such as those I used above, anchor tags (used for 
links) arguably shouldn't be stripped entirely, simply stripped of their 
HTML, leaving the description of the link and the URL as plain text.
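
To make the idea concrete, here is a rough Python sketch of such a "reasonably smart" stripper. It is only an illustration under stated assumptions, not how claws-mail or pan actually does it: the names `KNOWN_TAGS` and `strip_html` are made up for this example, the whitelist is deliberately short, and the regex approach ignores comments, scripts, and malformed markup. Tags on the whitelist are removed, anchors become "description (URL)", and anything that merely looks like a tag is left alone.

```python
import re
from html import unescape

# Hypothetical, deliberately short whitelist; a real one would list every HTML element.
KNOWN_TAGS = {"html", "head", "body", "p", "br", "div", "span",
              "b", "i", "u", "em", "strong", "font", "img",
              "ul", "ol", "li", "table", "tr", "td"}

ANCHOR_RE = re.compile(r"<a\b([^<>]*)>(.*?)</a\s*>", re.IGNORECASE | re.DOTALL)
HREF_RE = re.compile(r"""href\s*=\s*["']?([^"'\s>]+)""", re.IGNORECASE)
TAG_RE = re.compile(r"</?([A-Za-z][A-Za-z0-9]*)[^<>]*>")


def _anchor_to_text(match):
    """Keep the link description and append the URL in parentheses."""
    href = HREF_RE.search(match.group(1))
    label = match.group(2)
    return f"{label} ({href.group(1)})" if href else label


def _drop_known_tag(match):
    """Remove whitelisted tags; leave 'meta-observation' tags untouched."""
    return "" if match.group(1).lower() in KNOWN_TAGS else match.group(0)


def strip_html(text):
    text = ANCHOR_RE.sub(_anchor_to_text, text)  # anchors first, so the URL survives
    text = TAG_RE.sub(_drop_known_tag, text)     # then every other known tag
    return unescape(text)                        # finally decode entities such as &amp;


sample = ('<typical Duncan =:^)>Hi</typical> <p>See '
          '<a href="http://example.org/pan">the <b>Pan</b> site</a>'
          '<img src="ad.png" alt="ad"> &amp; more.</p>')
print(strip_html(sample))
```

The whitelist is what keeps a pseudo-tag like the one in my first paragraph from being eaten: it is not a known HTML element, so it passes through as plain text.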

Image tags are another question.  Do you treat them like anchor tags and 
strip the HTML but convert the alt-text and the URL to plain text, or 
strip them entirely?  Claws strips them entirely, figuring enough of them 
are ads and the like, that it's better without them.  After all, one can 
always open the HTML message, presented as an attachment, in a web 
client, if it's considered trustworthy enough and worth the hassle.
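
The two image policies can be sketched side by side. Again this is only an assumption-laden illustration: `handle_images` and its `keep_description` flag are hypothetical names, the "[image: ...]" output format is invented for the example, and only quoted attribute values are recognized. The default mirrors the strip-entirely behavior described above; the alternative treats images like anchors.

```python
import re

IMG_RE = re.compile(r"<img\b([^<>]*)>", re.IGNORECASE)
# Only quoted attribute values are handled in this sketch.
ATTR_RE = re.compile(r"""([A-Za-z-]+)\s*=\s*(?:"([^"]*)"|'([^']*)')""")


def handle_images(text, keep_description=False):
    """Strip <img> tags, optionally leaving the alt-text and URL as plain text."""
    def replace(match):
        if not keep_description:
            return ""  # strip the image entirely
        attrs = {name.lower(): dq or sq
                 for name, dq, sq in ATTR_RE.findall(match.group(1))}
        alt, src = attrs.get("alt", ""), attrs.get("src", "")
        if alt and src:
            return f"[image: {alt} ({src})]"
        if alt or src:
            return f"[image: {alt or src}]"
        return ""
    return IMG_RE.sub(replace, text)


msg = 'Hello <img src="http://example.org/ad.png" alt="Buy now"> world'
print(handle_images(msg))
print(handle_images(msg, keep_description=True))
```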

Personally, I was rather negative on this whole idea, until I saw how 
effectively claws-mail implemented it (after having little choice but to 
switch to /something/ other than the kmail I'd been using for nearly a 
decade, when they akonadified and broke everything, tho I've now 
discovered claws to be a better fit for my usage anyway, so can sort of 
thank the kdepim folks for the push much as I can thank the MS folks for 
the push to Linux they gave me with eXPrivacy).

And claws-mail, like pan, is gtk-based.  I don't remember whether it's C++ 
based as pan now is, or C based, but regardless, it's likely their 
impressively effective implementation could at least provide some hints 
to anyone wishing to try to code up a similar solution for pan, even if 
the code isn't in practice either simply liftable, or better yet, 
reimplemented in a library both could share (possibly along with sylpheed, 
which claws forked from, and who knows what other apps could make use of 
it).

But I'm not a coder, and even if I was, while it'd be nice, unless the 
claws implementation could be dropped into pan nearly as-is, I strongly 
suspect I'd find more "itchy" itches to scratch.  And I know of no one 
else specifically taking up that project either, tho for all I know it's 
possible someone's going to announce their previously private project 
tomorrow, saying here's a beta, test it to pieces!

So rehashing, I don't believe anyone's seriously interested in pan having 
a proper HTML display mode.  That's a SERIOUS bit of CONTINUOUS work that 
even dedicated browser projects have trouble pulling off both properly 
and with continuous security.  There's NO WAY something like pan could do 
it, without SERIOUSLY affecting its ability to maintain and improve its 
primary intended functionality as a "pimp-ass newsreader".

And even if it could be done reasonably well, the result would no longer 
be pan, it'd be some other product.  And if you want something that's not 
pan, go find it and use it, or create it.  Don't try to make pan into 
something it's not, and never can be, without destroying what pan /is/.

But a rather simpler (tho still not simple) HTML-to-plain-text mode has 
been discussed.  We know it can work and DOES work impressively well in 
claws-mail.  But to my knowledge, there's nobody actually working on such 
a thing, and realistically, unless the claws implementation could be very 
nearly dropped whole into pan with little additional work, I don't find 
it particularly likely that anyone with the necessary skills AND enough 
interest in pan to go to the trouble will find that feature a high 
enough priority to ever actually get it done.  But unlike the 
full-fledged browser-style HTML/XML parser, this one's at least reasonable in 
theory, and wouldn't so drastically change pan that it would no longer be 
pan and people might as well just use something different to start with.

All IMO of course, but with all humility, I guess it's worth /something/ 
after a decade (in a couple months I believe, actually, November 10 or so 
should be my 10-year first-post anniversary IIRC from looking it up on 
gmane, a few months ago, I'll have to check again as the date gets 
closer, and maybe go out for dinner that day or something =:^) of helping 
on the pan lists.

Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman
