lynx-dev

Re: lynx-dev Q: control lynx from C


From: David Woolley
Subject: Re: lynx-dev Q: control lynx from C
Date: Fri, 19 Feb 1999 08:40:57 +0000 (GMT)

> I have a question: Is it possible to control lynx from C?
> I would expect a command line option that allows for lynx
> to read the commands not from keyboard but from a pipe or a file.

That's like using a sledgehammer to crack a nut.  Lynx is an
interactive browser built on libWWW, although on a modified copy that
has diverged from the official one.  libWWW itself is the API intended
for C applications.

Lynx does have some ability to render broken HTML, which might be
an advantage, I suppose.

At a higher level, wget for fetching and nsgmls for parsing are
likely to be more appropriate, although nsgmls does need real HTML,
which is rather a rare commodity!

> The idea is to connect to a site and  automate some browsing and 
> getting info from there.

Most sites with dynamic content would object to this, as you would
be bypassing their advertising and costing them money without providing
them with revenue.  If you do use Lynx, please change the user agent
string to identify the user agent as your tool, not as Lynx, otherwise
Lynx could conceivably get barred from those sites by association with
your use of it.
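
As a rough illustration of identifying a tool by its own name, here is
a minimal sketch using Python's standard urllib rather than Lynx.  The
tool name, contact address and URL are made up for the example; the
function only builds the request and does not contact any site.

```python
import urllib.request

# Hypothetical tool name and contact address; substitute your own so
# that site operators can tell who is fetching and how to reach you.
USER_AGENT = "example-fetcher/0.1 (+mailto:admin@example.org)"


def build_request(url, user_agent=USER_AGENT):
    """Return a Request that announces our own tool name, not a browser's."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


if __name__ == "__main__":
    # Building the request does not touch the network.
    req = build_request("http://example.org/")
    print(req.get_header("User-agent"))
```

Passing the resulting object to urllib.request.urlopen() would send the
request with that header, so any block a site applies falls on the tool
and not on an unrelated browser.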

On no account do any extensive crawling of a site unless there are
no restrictions in http://site/robots.txt or in corresponding meta 
elements in the pages themselves.  In particular, don't use any automated
tool against the UK mirror of IMDB, as they do monitor and block site
crawlers.
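
The robots.txt check can be sketched with Python's standard
urllib.robotparser module.  The rules, user agent name and URLs below
are invented for illustration, and the sketch covers robots.txt only,
not the meta elements in individual pages.

```python
from urllib.robotparser import RobotFileParser

# Made-up rules for illustration; a real crawler would read them from
# http://site/robots.txt on the site it intends to visit.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def may_fetch(robots_txt, user_agent, url):
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical tool name and URLs.
    for url in ("http://example.org/index.html",
                "http://example.org/private/data.html"):
        print(url, may_fetch(SAMPLE_ROBOTS_TXT, "example-fetcher", url))
```

Against a live site, the same parser can load the file itself through
its set_url() and read() methods before can_fetch() is called.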

(The objection to crawling is based partly on advertising revenue and
partly on the fact that the site incurs bandwidth costs, while the
recipient probably reads very little of what they have fetched.)
