pan-users

Re: [Pan-users] html


From: Duncan
Subject: Re: [Pan-users] html
Date: Sun, 16 Sep 2012 05:22:48 +0000 (UTC)
User-agent: Pan/0.139 (Sexual Chocolate; GIT 4162e82 /usr/src/portage/src/egit-src/pan2)

Thufir posted on Sat, 15 Sep 2012 16:52:19 -0700 as excerpted:

> On Tue, 11 Sep 2012 11:38:58 +1000, Steven D'Aprano wrote:
> 
> 
>> mutt does something very similar for email. It can be configured to use
>> a console browser like links, lynx or w3m to dump the html to text,
>> then display that.
>> 
>> http://jasonwryan.com/blog/2012/05/12/mutt/
>> 
>> I see no reason why Pan couldn't use an external dependency for this
>> like mutt does. As you suggest, stripping tags is hard, there's no
>> reason why Pan should be forced to implement its own when it can push
>> the hard part onto existing tools, of which there are already at least
>> three.
> 
> Ditto.  It could even require some configuration, and be a beta feature,
> that would be fine.

Note that in theory anyway (the hassle in practice makes it less than 
ideal, but it /could/ be done), pan's "external editor" feature could be 
put to use toward this end.

Back with "old pan", I had a script, pan-attach-kd, posted here a few 
times so regulars who have been around long enough may remember it, that 
could be set as pan's external editor, making use of the feature and 
kdialog (thus the -kd suffix) to allow one to pick a file and encoding 
method (yEnc, legacy uuencode, simple pass-thru, the latter being useful 
for posting plain-text files in-line), then pass that info in turn to 
uuenview (part of uudeview, a uuencode clone that handled yenc before 
uuencode got the ability) for encoding.  The resulting file would then be 
appended to the text file that pan had passed to the "external editor", 
before passing control back to pan, for further editing and/or posting.

(Additionally, there was a help option, which gave instructions and 
listed the external dependencies, those being bash since I don't make a 
distinction between bash and POSIX-compliant shell code, kdialog for the 
dialogs, altho there was a VERY crude arbitrary keyword match version 
implemented as first proof of concept that didn't require kdialog, and of 
course uudeview.  Finally, there was a way to pass thru to an actual 
external editor as well, if one actually wanted to use the "external 
editor" function to do just that, without eliminating the pan-attach 
option.)

Unfortunately for pan-attach(-kd), when Charles did the C++ rewrite that 
started with 0.90, he used a different text-edit widget that worked 
better with UTF-8 and the like, but broke the 8-bit-zone ASCII that yEnc 
takes advantage of to make it so much more efficient than UUE/MIME-
Base64.  So while the script still worked for UUE and text-pass-thru-
mode, it was broken for yEnc, which kind of killed its luster.  But it 
was still occasionally used here, until Heinrich came along and FINALLY 
implemented binary posting mode.  Actually, I /still/ use it 
occasionally, for pass-thru text-file posting or UUE, since pan's binary 
posting ability is great now... as long as you aren't posting to gmane or 
other news2mail gateway such that most of your readers will be using mail 
clients, many of which have no clue about yEnc (as Travis can no doubt 
attest given his reaction when I tried it here... based on his reply, he 
doesn't use pan and gmane to read this list, and what he DOES use doesn't 
do yEnc!)... which means there's still a place for a script that allows 
pan to post UUE or simple text-as-text, as pan's built-in file posting 
does NOT, but pan-attach(-kd) still does.

My script, in turn, was based on a much older implementation of the same 
basic idea, posted by someone else: calling a script in place of the 
external editor, in that case for gpg signing.  Pan has that 
functionality built-in now too, but the point here is that the original 
idea wasn't mine; I simply reimplemented for attachments the same idea 
someone else had used for gpg.

Coming full circle back to the present now, the same idea could be used 
right now for html post dumps.

Naturally, by the time the "external editor" gets it, since the intended 
purpose IS as an "external editor" for replies, the raw HTML has >-quotes 
prepending every original-text line, plus the attribution prepended at 
the top and the sig appended at the bottom.  That's a bit of a problem, 
but nothing insurmountable.

A suitable html-dump script designed to be set as external editor would 
therefore have a number of features:

* Mandatory: "Reply wrapper" stripping.  The script would have to strip 
the attribution and sig lines, as well as the prepended > quote-marks.

* (Semi-)Optional: Depending on the intended HTML parsing target, the 
script may want to "dress up" the HTML a bit as well, stripping or adding 
selected tags as necessary to make the (presumed) browser happier with 
what it's ultimately passed.

* Optional but very useful:  Let the user configure whether the script 
simply passes it to the configured browser (presumably firefox/chromium/
etc) for display, or passes it to the configured browser to dump, taking 
the html-stripped text back to pan, where it would be displayed in the 
reply window (now repurposed as a simple display window, one wouldn't 
normally send this plain text on, as it wouldn't be marked as quote, 
properly attributed, etc) as plain text, now stripped of the HTML.

* Optional: Further implement an option that would save the attribution, 
sig, etc, before stripping, then reapply them and re-quote the now-
stripped text returned from the HTML parser, for forwarding or reply as 
plain-text.

* Optional:  Implement a "pass thru to real external editor" option.  
When I did that with pan-attach, I made that option dependent on the 
existence of a particular environmental variable, PAN_EDITOR or some 
such, that if set (and if the pointed at file is an executable, IDR 
whether I actually tested for that or not back then, but I think I would 
now), would pass the raw file as handed to it by pan, on to the "real" 
external editor.

* Optional: Get fancy and include a help/about dialog, etc.
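
As a rough illustration of the mandatory stripping step and the optional 
save-and-reapply step, here is a minimal Python sketch.  The reply layout 
it assumes (unquoted attribution lines first, then ">"-quoted body, then 
a signature introduced by a "-- " line) is my reading of the description 
above, not something checked against pan's actual output, and the 
function names strip_reply_wrapper and rewrap_reply are made up for the 
example.

```python
import re

def strip_reply_wrapper(text):
    """Split a reply draft into (attribution, body, signature).

    Assumed layout: unquoted attribution lines, then '>'-quoted body
    lines, then an optional signature starting at a '-- ' line.
    """
    lines = text.splitlines()

    # Cut the signature off at the last unquoted "-- " separator line.
    signature = []
    for i in range(len(lines) - 1, -1, -1):
        if lines[i].rstrip() == "--":
            signature = lines[i:]
            lines = lines[:i]
            break

    # Everything before the first quoted line is treated as attribution.
    first_quoted = next(
        (i for i, line in enumerate(lines) if line.startswith(">")),
        len(lines),
    )
    attribution = [line for line in lines[:first_quoted] if line.strip()]

    # Remove exactly one level of quoting; deeper quotes stay quoted.
    body = [
        re.sub(r"^> ?", "", line)
        for line in lines[first_quoted:]
        if line.startswith(">")
    ]
    return attribution, "\n".join(body), signature


def rewrap_reply(attribution, plain_text, signature):
    """Re-quote converted text and put attribution and signature back."""
    quoted = ["> " + line if line else ">" for line in plain_text.splitlines()]
    return "\n".join(attribution + [""] + quoted + [""] + signature) + "\n"
```

Only one level of ">" is removed, so anything the HTML poster had quoted 
in turn stays quoted in what is handed to the parser.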

Obviously, the last two optional features are interactive and would thus 
require kdialog/xdialog/zenity/whatever. (AFAIK/IIRC, "zenity" is what 
the former gdialog is now called.)  Tho at least with kde, one could 
script it using konqueror windows too, as I did here with my hotkeys 
scripts that replaced the multikey hotkey functionality from khotkeys in 
kde3, when kde4 broke it (kdialog unfortunately wasn't appropriate for 
that due to the way they implemented input for the pick-a-line 
dialog, but konqueror windows running a script with a read command, 
triggering on a single key of input, worked well enough).  Presumably one 
could do the same with xterm/gterm/whatever.

But a raw reply-wrapper stripper wouldn't need any interactivity just to 
do that and invoke an HTML parser on the result, so it should actually be 
rather more straightforward than was my pan-attach script.
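
To show how little is needed for the non-interactive case, here is a 
hedged Python sketch of the "invoke an HTML parser on the result" step.  
It assumes the chosen tool reads HTML on stdin and writes text to stdout; 
"w3m -dump -T text/html" is used as the default because w3m can be run 
that way, but the html_to_text name and the choice of default are 
assumptions for illustration only.

```python
import shlex
import subprocess

# Assumption: any command that reads HTML on stdin and prints text works here.
DEFAULT_DUMP_COMMAND = "w3m -dump -T text/html"


def html_to_text(html, command=DEFAULT_DUMP_COMMAND):
    """Pipe HTML through an external dump command and return its output."""
    # Accept either a shell-style string or an already-split argument list.
    argv = shlex.split(command) if isinstance(command, str) else list(command)
    result = subprocess.run(
        argv,
        input=html,
        capture_output=True,
        text=True,
        check=True,  # raise if the dumper exits with an error
    )
    return result.stdout
```

No dialog toolkit is involved at all: the script runs the command, 
collects the text, and is done.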

That said, if it's anything like pan-attach, I won't do anything with it 
for at least a year after posting this original idea, hoping someone else 
will be motivated enough to do it first.  However, again if it's anything 
like pan-attach, a year or two down the road (assuming Heinrich or 
someone else hasn't implemented such a thing in pan itself by then), I'll 
do a very raw but sort-of-functional proof of concept, and post that, 
again hoping someone will take the idea and run with it.  And again, if 
it's anything like pan-attach (and if it is, the feature will still not 
be available in pan, but that was of course quite some years before 
Heinrich got involved, so...), that raw proof of concept will hit 
effectively dead-air, not a single response, despite my hope that someone 
will take the concept and run with it.  And yet again, if it's anything 
like pan-attach, a year or so LATER, I'll decide to see if I can pretty 
it up a bit and make it more functional, and after I post the results of 
/that/, I'll FINALLY get some feedback.  (I even got a couple patches, 
which meant at least a couple people found it useful enough to bother, 
tho I never did implement them and post an update, and when new-pan broke 
the yenc anyway, I lost the incentive I might have had and just continued 
to use the existing script.)

IOW, don't count on me to do it.  I might get to it eventually... but it 
could be years.  If you want the functionality, you'll likely either need 
to hack up the script yourself or alternatively talk/pay someone into 
doing it for you (this being my obvious attempt at the talk variant! 
=:^).  And if you do, pretty-please post it. =:^)

And if anyone does attempt to hack this up, please at least set variables 
for things like the chosen browser right at the top of the script (or 
better yet allow them to be set in the environment or read in from a 
config file), so others can change them without having to dive into the 
guts of the script too far.
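
For what it's worth, here is one way that could look, sketched in 
Python: defaults at the top, overridden by a simple KEY=VALUE config 
file, overridden in turn by the environment.  The variable names 
PAN_HTML_BROWSER and PAN_HTML_MODE and the config format are invented 
for this example; nothing in pan defines them.

```python
import os

# Hypothetical settings, defined once at the top so they are easy to find.
DEFAULTS = {
    "PAN_HTML_BROWSER": "w3m -dump -T text/html",
    "PAN_HTML_MODE": "dump",  # "dump" returns text to pan, "display" just shows it
}


def parse_config(text):
    """Parse KEY=VALUE lines, ignoring blank lines and '#' comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def load_settings(config_text="", environ=None):
    """Merge defaults, config file contents, and environment, in that order."""
    environ = os.environ if environ is None else environ
    file_values = parse_config(config_text)
    settings = dict(DEFAULTS)
    for key in DEFAULTS:
        if key in file_values:
            settings[key] = file_values[key]
        if key in environ:  # the environment wins over the config file
            settings[key] = environ[key]
    return settings
```

Letting the environment win means a one-off run with a different browser 
needs no file edits at all.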


The idea, once implemented, would let a user select the raw HTML in pan, 
hit the reply button, then the external editor button, to activate the 
script.  The script in turn would strip the attribution, sig, and >-
quotes, in addition to any other format manipulations necessary before 
handing the file off to the configured browser.  That browser could then 
be chromium/firefox/whatever, to simply display the file it was passed, 
or could be links/lynx/whatever, to strip the HTML and hand back the 
stripped plain text to pan, where it would then appear in pan's reply 
window.
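
Putting that flow together, a very raw Python sketch might look like the 
following.  It assumes pan hands the script the path of its temporary 
file as the first argument, that the dump tool reads stdin and writes 
stdout, and that PAN_EDITOR and PAN_HTML_BROWSER are the (hypothetical) 
environment variables used for the pass-thru and the browser choice.  
None of that has been tested against pan itself.

```python
import os
import re
import shlex
import shutil
import subprocess

DEFAULT_DUMPER = "w3m -dump -T text/html"  # assumption: stdin in, text out


def unquote(text):
    """Keep only the quoted lines, minus one level of '>' quoting."""
    body = [
        re.sub(r"^> ?", "", line)
        for line in text.splitlines()
        if line.startswith(">")
    ]
    return "\n".join(body) + "\n"


def main(argv, environ):
    path = argv[1]  # assumption: pan passes its temp file as the first argument

    # Pass-thru: if PAN_EDITOR names an executable, hand the file over untouched.
    editor = shlex.split(environ.get("PAN_EDITOR", ""))
    if editor and shutil.which(editor[0]):
        return subprocess.call(editor + [path])

    with open(path, encoding="utf-8") as handle:
        html = unquote(handle.read())

    # Run the configured dumper and capture the HTML-stripped text.
    dumper = shlex.split(environ.get("PAN_HTML_BROWSER", DEFAULT_DUMPER))
    result = subprocess.run(
        dumper, input=html, capture_output=True, text=True, check=True
    )

    # Overwrite pan's file so the plain text shows up in the reply window.
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(result.stdout)
    return 0
```

A real script would finish with something like 
sys.exit(main(sys.argv, os.environ)) and would be what pan's external 
editor setting points at.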

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



