[Lynx-dev] Question about using lynx as a filter...

From: Charlie Sorsby
Subject: [Lynx-dev] Question about using lynx as a filter...
Date: Fri, 4 Aug 2006 11:27:32 -0600

Hi, y'all,

I'm going to try this again.  I tried posting it before but
received an e-mail telling me that I must reply to the e-mail to
verify that I had, indeed, posted the article.  I did but, if my
query ever appeared, I missed it along with any responses... :(

Oh, yeah -- if you don't think this is of general interest, please
just e-mail your suggestions to me.  Reply-To: is properly set so
"Reply to Sender" should work just fine.

And, please, if all you have to say is that I should use a modern
mail client so that I can just view HTML e-mail, save yourself the
trouble of typing it.  I *like* my mail client and have no intention
of switching.

Now to the problem:

I should like to be able to use lynx as a filter to translate HTML
to plain text.  That is to say, I should like to be able to send
HTML to lynx's stdin (standard input for non-Unix types) and have
plain text (no HTML) appear at lynx's stdout.

I am able to make this translation with lynx but only if the input
is given as a file name on the command line:

/usr/local/bin/lynx -dump -force_html file.html

Before I continue:
PC% lynx --version
Lynx Version 2.8.5rel.1 (04 Feb 2004)
Built on freebsd4.11 Jan  4 2005 04:58:20

According to the lynx man page it should be possible to convince
lynx to accept input via its stdin, unless I'm misunderstanding
what is being said:

    -  If the argument is only '-', then Lynx expects  to
       receive the arguments from stdin.  This is to allow
       for the potentially very long command line that
       can be associated with the -get_data  or -post_data
       arguments (see below).  It can also be used to
       avoid having sensitive information in the invoking
       command line (which would be visible to other
       processes on most systems), especially when the
       -auth or -pauth options are used.

But when I try something like:

PC% cat file.html | /usr/local/bin/lynx -dump -force_html -

I get something like the following:

Can't Access `file://localhost/home/crs/</HTML>'
Alert!: Unable to access document.

lynx: Can't access startfile 
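
That error is consistent with the man page excerpt: with a lone '-',
lynx reads its *arguments* from stdin, not the document, so the lines
of file.html get parsed as arguments and the last one, </HTML>, ends
up treated as the start file.  A minimal sketch of what '-' appears
to be meant for follows.  It assumes one argument per line; the names
lynx_args, dump_with_stdin_args and the LYNX variable are invented
for this example.

```shell
# Path taken from the commands above; LYNX is only a local variable.
LYNX=${LYNX:-/usr/local/bin/lynx}

# Emit the arguments lynx should read from stdin, one per line
# (the one-per-line layout is an assumption).
lynx_args() {
    printf '%s\n' -dump -force_html "$1"
}

# Hand those arguments to lynx through the lone "-".
dump_with_stdin_args() {
    lynx_args "$1" | "$LYNX" -
}
```

With these defined, "dump_with_stdin_args file.html" should behave
like the command-line form, but the HTML itself still has to live in
a file.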

Clearly, that is only a test; in that situation, I could just use
the command line described earlier with the filename as an argument
on the command line.

For those who may be curious, I want to be able to convert HTML to
text when it is not in a file.  Specifically, I want to be able to
use vi's capability to apply an external command to a unit of text
(e.g. a paragraph or paragraphs).  I want to make a simple-minded
shell script (say in file, html2txt):

/usr/local/bin/lynx -dump -force_html -

So that when I receive one of those damnable e-mails full of HTML,
I can run vi on the message (my mail client allows me to do that),
go to the start of the body of the message and tell vi

        !99} html2txt

and have the script run lynx on the next 99 (or fewer) paragraphs,
converting them into readable text, very much as I'm able to do
something like:

        !99}fmt 55

to format text with long lines to shorter lines.
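
For a filter of this kind, the lynx man page documents, at least in
some versions, a -stdin option that reads the start file from
standard input (UNIX only).  Whether 2.8.5rel.1 has it is an
assumption that "lynx -help" should settle.  A sketch of the filter
on that basis, with the LYNX variable and the function name invented
for the example:

```shell
LYNX=${LYNX:-/usr/local/bin/lynx}

# Read HTML on stdin, write rendered text on stdout.
# -stdin asks lynx to take the document itself from standard input;
# its availability in a given build is an assumption.
html2txt_stdin() {
    "$LYNX" -dump -force_html -stdin
}
```

If the option is present, a one-line html2txt script containing the
same lynx command would let "!99} html2txt" work from inside vi.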

As mentioned earlier, the shell script:

/usr/local/bin/lynx -dump -force_html "$@"

works fine as long as I feed that HTML to it from a file named on
the script's command line.  But that means that, instead of simply
being able to run vi on the e-mail message, moving to the start of
the HTML, and doing the "!99} html2txt" on the remainder of the
message to replace the HTML with the actual content of the message,
I must, instead, save the message to a file, delete the e-mail
headers, and then run html2txt on that filename, either saving the
output to a file or piping it to a pager to read.  Very awkward.
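
A workaround that relies only on the file-name form already shown to
work is to let the script do the saving itself: copy stdin to a
temporary file, run lynx on that file, and clean up.  This is a
sketch, not a tested recipe for this system; it assumes mktemp(1) is
available, and the LYNX variable and function name are invented for
the example.

```shell
LYNX=${LYNX:-/usr/local/bin/lynx}

# Read HTML on stdin, write rendered text on stdout, using a
# temporary file so lynx only ever sees a file name.
html2txt() {
    tmp=$(mktemp "${TMPDIR:-/tmp}/html2txt.XXXXXX") || return 1
    cat > "$tmp"                       # save whatever vi pipes in
    "$LYNX" -dump -force_html "$tmp"   # same command as the file form
    status=$?
    rm -f "$tmp"                       # remove the temporary copy
    return $status
}
```

Saved as the html2txt script with a final line that calls html2txt,
this should accept the text piped to it by "!99} html2txt".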

Thanks for any help (or even for letting me know that I'm
misunderstanding the lynx man page so I can start looking elsewhere
for a solution).
Charlie Sorsby
        Edgewood, NM 87015
