Re: [Pan-users] html
Sun, 16 Sep 2012 05:22:48 +0000 (UTC)
Pan/0.139 (Sexual Chocolate; GIT 4162e82 /usr/src/portage/src/egit-src/pan2)
Thufir posted on Sat, 15 Sep 2012 16:52:19 -0700 as excerpted:
> On Tue, 11 Sep 2012 11:38:58 +1000, Steven D'Aprano wrote:
>> mutt does something very similar for email. It can be configured to use
>> a console browser like links, lynx or w3m to dump the html to text,
>> then display that.
>> I see no reason why Pan couldn't use an external dependency for this
>> like mutt does. As you suggest, stripping tags is hard, there's no
>> reason why Pan should be forced to implement its own when it can push
>> the hard part onto existing tools, of which there are already at least
> Ditto. It could even require some configuration, and be a beta feature,
> that would be fine.
Note that in theory (the hassle of doing it in practice makes it less
than ideal, but it /could/ be done), pan's "external editor" feature
could be put to use toward this end.
Back with "old pan", I had a script, pan-attach-kd, posted here a few
times so regulars who have been around long enough may remember it, that
could be set as pan's external editor, making use of the feature and
kdialog (thus the -kd suffix) to allow one to pick a file and encoding
method (yEnc, legacy uuencode, simple pass-thru, the latter being useful
for posting plain-text files in-line), then pass that info in turn to
uuenview (part of uudeview, a uuencode clone that handled yenc before
uuencode got the ability) for encoding. The resulting file would then be
appended to the text file that pan had passed to the "external editor",
before passing control back to pan, for further editing and/or posting.
(Additionally, there was a help option, which gave instructions and
listed the external dependencies, those being bash since I don't make a
distinction between bash and POSIX-compliant shell code, kdialog for the
dialogs, altho there was a VERY crude arbitrary keyword match version
implemented as first proof of concept that didn't require kdialog, and of
course uudeview. Finally, there was a way to pass thru to an actual
external editor as well, if one actually wanted to use the "external
editor" function to do just that, without eliminating the pan-attach
Unfortunately for pan-attach(-kd), when Charles did the C++ rewrite that
started with 0.90, he used a different text-edit widget that worked
better with UTF-8 and the like, but broke the 8-bit-zone ASCII that yEnc
takes advantage of to make it so much more efficient than UUE/MIME-
Base64. So while the script still worked for UUE and text-pass-thru-
mode, it was broken for yEnc, which kind of killed its luster. But it
was still occasionally used here, until Heinrich came along and FINALLY
implemented binary posting mode. Actually, I /still/ use it
occasionally, for pass-thru text-file posting or UUE, since pan's binary
posting ability is great now... as long as you aren't posting to gmane or
other news2mail gateway such that most of your readers will be using mail
clients, many of which have no clue about yEnc (as Travis can no doubt
attest given his reaction when I tried it here... based on his reply, he
doesn't use pan and gmane to read this list, and what he DOES use doesn't
do yEnc!)... which means there's still a place for a script that allows
pan to post UUE or simple text-as-text, as pan's built-in file posting
does NOT, but pan-attach(-kd) still does.
My script, in turn, was based on a much older implementation of the same
basic idea, posted by someone else: setting a script as the external
editor to serve a different purpose, in that case gpg signing. Pan has
that functionality built-in now too, but the point here is that the
original idea wasn't mine, I simply reimplemented for attachments the
same idea someone else had used for gpg.
Coming full circle back to the present now, the same idea could be used
right now for html post dumps.
Naturally, by the time the "external editor" gets it, since the intended
purpose IS as an "external editor" for replies, the raw HTML has >-quotes
prepending every original-text line, plus the attribution prepended at
the top and the sig appended at the bottom. That's a bit of a problem,
but nothing insurmountable.
A suitable html-dump script designed to be set as external editor would
therefore have a number of features:
* Mandatory: "Reply wrapper" stripping. The script would have to strip
the attribution and sig lines, as well as the prepended > quote-marks.
* (Semi-)Optional: Depending on the intended HTML parsing target, the
script may want to "dress up" the HTML a bit as well, stripping or adding
selected tags as necessary to make the (presumed) browser happier with
what it's ultimately passed.
* Optional but very useful: Let the user configure whether the script
simply passes it to the configured browser (presumably firefox/chromium/
etc) for display, or passes it to the configured browser to dump, taking
the html-stripped text back to pan, where it would be displayed in the
reply window (now repurposed as a simple display window, one wouldn't
normally send this plain text on, as it wouldn't be marked as quote,
properly attributed, etc) as plain text, now stripped of the HTML.
* Optional: Further implement an option that would save the attribution,
sig, etc, before stripping, then reapply them and re-quote the now-
stripped text returned from the HTML parser, for forwarding or reply as
* Optional: Implement a "pass thru to real external editor" option.
When I did that with pan-attach, I made that option dependent on the
existence of a particular environment variable, PAN_EDITOR or some
such, that if set (and if the pointed at file is an executable, IDR
whether I actually tested for that or not back then, but I think I would
now), would pass the raw file as handed to it by pan, on to the "real"
* Optional: Get fancy and include a help/about dialog, etc.
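To make the mandatory "reply wrapper" stripping and the optional re-quoting
from the list above a bit more concrete, here is a minimal sketch. It is in
Python rather than the bash I'd likely reach for, and it assumes things the
list doesn't pin down: that the sig starts at the last line consisting of
"-- ", that the attribution is whatever unquoted text precedes the first
quoted line, and that only one level of > quoting needs removing. The names
split_reply and requote are made up for illustration.

```python
import re

# Assumption: the conventional "dash dash space" signature separator.
SIG_DELIMITER = "-- "


def split_reply(text):
    """Split a reply draft into (attribution, body, sig) lists of lines."""
    lines = text.splitlines()

    # Sig: everything from the last unquoted "-- " line to the end.
    sig = []
    for i in range(len(lines) - 1, -1, -1):
        if lines[i] == SIG_DELIMITER:
            sig = lines[i:]
            lines = lines[:i]
            break

    # Attribution: unquoted lines ahead of the first quoted line.
    first_quoted = next(
        (i for i, line in enumerate(lines) if line.startswith(">")), None
    )
    if first_quoted is None:
        # Nothing is quoted, so treat the whole draft as body.
        return [], lines, sig
    attribution = lines[:first_quoted]

    # Body: remove one level of quoting (">" plus an optional space).
    body = [re.sub(r"^> ?", "", line) for line in lines[first_quoted:]]
    return attribution, body, sig


def requote(attribution, text_lines, sig):
    """Re-apply the saved attribution, > quotes, and sig to dumped text."""
    quoted = ["> " + line if line else ">" for line in text_lines]
    return "\n".join(attribution + quoted + sig) + "\n"
```

The body list is what would go to the HTML parser; the attribution and sig
are kept only so requote can put them back if the stripped text is meant
for forwarding or reply.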
Obviously, the last two optional features are interactive and would thus
require kdialog/xdialog/zenity/whatever. (AFAIK/IIRC, "zenity" is what
the former gdialog is now called.) Tho at least with kde, one could
script it using konqueror windows too, as I did here with my hotkeys
scripts that replaced the multikey hotkey functionality from khotkeys in
kde3, when kde4 broke it (kdialog unfortunately wasn't appropriate for
that due to the way they implemented input for the pick-a-line
dialog, but konqueror windows running a script with a read command,
triggering on a single key of input, worked well enough). Presumably one
could do the same with xterm/gterm/whatever.
But a raw reply-wrapper stripper wouldn't need any interactivity just to
do that and invoke an HTML parser on the result, so it should actually be
rather more straightforward than was my pan-attach script.
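As a rough illustration of the non-interactive "invoke an HTML parser on the
result" step, the sketch below writes the already-stripped HTML to a
temporary file, runs a dump command on it, and returns whatever the command
prints. Python again stands in for shell, and the function name dump_html
and the choice to append the file path as the last argument are my
assumptions, not anything pan dictates.

```python
import os
import subprocess
import tempfile


def dump_html(html, dump_cmd):
    """Run dump_cmd (a list of arguments) on html and return its stdout.

    The HTML is written to a temporary .html file whose path is appended
    to dump_cmd, so any tool that accepts a file argument should work.
    """
    fd, path = tempfile.mkstemp(suffix=".html")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(html)
        result = subprocess.run(
            dump_cmd + [path],
            capture_output=True,
            text=True,
            check=True,  # raise if the dump tool exits non-zero
        )
        return result.stdout
    finally:
        os.unlink(path)  # always clean up the temporary file
```

With a console browser installed, something like
dump_html(html, ["lynx", "-dump"]) or dump_html(html, ["w3m", "-dump"])
should hand back plain text, which the script would then write over the
file pan passed in before returning control to pan.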
That said, if it's anything like pan-attach, I won't do anything with it
for at least a year after posting this original idea, hoping someone else
will be motivated enough to do it first. However, again if it's anything
like pan-attach, a year or two down the road (assuming Heinrich or
someone else hasn't implemented such a thing in pan itself by then), I'll
do a very raw but sort-of-functional proof of concept, and post that,
again hoping someone will take the idea and run with it. And again, if
it's anything like pan-attach (and if it is, the feature will still not
be available in pan, but that was of course quite some years before
Heinrich got involved, so...), that raw proof of concept will hit
effectively dead-air, not a single response, despite my hope that someone
will take the concept and run with it. And yet again, if it's anything
like pan-attach, a year or so LATER, I'll decide to see if I can pretty
it up a bit and make it more functional, and after I post the results of
/that/, I'll FINALLY get some feedback. (I even got a couple patches,
which meant at least a couple people found it useful enough to bother,
tho I never did implement them and post an update, and when new-pan broke
the yenc anyway, I lost the incentive I might have had and just continued
to use the existing script.)
IOW, don't count on me to do it. I might get to it eventually... but it
could be years. If you want the functionality, you'll likely either need
to hack up the script yourself or alternatively talk/pay someone into
doing it for you (this being my obvious attempt at the talk variant!
=:^). And if you do, pretty-please post it. =:^)
And if anyone does attempt to hack this up, please at least set variables
for things like the chosen browser right at the top of the script (or
better yet allow them to be set in the environment or read in from a
config file), so others can change them without having to dive into the
guts of the script too far.
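A minimal sketch of what "settings at the top, overridable from the
environment" might look like, Python once more standing in for shell.
PAN_EDITOR is the variable name I half-remember from pan-attach; the other
variable names, the default command, and the function names are purely
hypothetical.

```python
import os
import shlex

# Defaults live at the top so they are easy to find and change.
DEFAULT_DUMP_CMD = "lynx -dump"  # assumption: any console browser would do
DEFAULT_MODE = "dump"            # "dump" = text back to pan, "display" = just show


def real_editor(env):
    """Return the PAN_EDITOR path if it is set and executable, else None."""
    path = env.get("PAN_EDITOR")
    if path and os.path.isfile(path) and os.access(path, os.X_OK):
        return path
    return None


def load_settings(env=None):
    """Collect settings, letting environment variables override defaults."""
    env = os.environ if env is None else env
    return {
        # Hypothetical variable names; split the command into an argv list.
        "dump_cmd": shlex.split(env.get("PAN_HTML_DUMP_CMD", DEFAULT_DUMP_CMD)),
        "mode": env.get("PAN_HTML_MODE", DEFAULT_MODE),
        # Pass-thru target for when a real editor is actually wanted.
        "real_editor": real_editor(env),
    }
```

If real_editor comes back non-empty, the script would simply hand the
untouched file to that program instead of doing any HTML processing.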
The idea, once implemented, would let a user select the raw HTML in pan,
hit the reply button, then the external editor button, to activate the
script. The script in turn would strip the attribution, sig, and >-
quotes, in addition to any other format manipulations necessary before
handing the file off to the configured browser. That browser could then
be chromium/firefox/whatever, to simply display the file it was passed,
or could be links/lynx/whatever, to strip the HTML and hand back the
stripped plain text to pan, where it would then appear in pan's reply
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman