Re: LYNX-DEV -crawl, etc: need "filename-filter"
From: Joe Kincaid
Subject: Re: LYNX-DEV -crawl, etc: need "filename-filter"
Date: Thu, 5 Dec 1996 21:28:48 -0600 (CST)
On Thu, 5 Dec 1996, David Combs wrote:
> Two features would be nice:
>
> 1: to be able to supply a regexp that a name has to "pass"
> before being accepted for downloading.
Hmmm, for certain types (example .tex vs .ps as you mentioned) couldn't
you set the mime types to reject all but the formats you want? I admit
I have no idea how to actually _do_ that, but it seems possible. :-)
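For what it's worth, the name filter asked for in point 1 can be approximated outside lynx, once a list of URLs exists. The following is only a minimal sketch in Python, not a lynx feature; the function name filter_urls and the example.org URLs are made up for illustration.

```python
import re

def filter_urls(urls, pattern):
    """Keep only the URLs whose last path component matches the regexp.

    Hypothetical helper: it works on a plain list of URL strings and
    knows nothing about lynx itself.
    """
    rx = re.compile(pattern)
    kept = []
    for url in urls:
        # The "name" is whatever follows the final slash.
        name = url.rstrip("/").rsplit("/", 1)[-1]
        if rx.search(name):
            kept.append(url)
    return kept

if __name__ == "__main__":
    candidates = [
        "http://example.org/doc/paper.tex",
        "http://example.org/doc/paper.ps",
    ]
    # Accept only the .tex version.
    for url in filter_urls(candidates, r"\.tex$"):
        print(url)
```

The same effect could be had with egrep on the list file; the point is only that the filtering step does not need to live inside lynx.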
> 2: more flexibly, but far more time-consuming and i/o,
> would be to do retrieval in two passes: first one
> simply got a list of downloadable names, and stuck
> them all into a file.
>
> Then the user could use his favorite editor or egrep
> or whatever to get the list of files he DOES want downloaded.
>
> He then runs (another?) lynx, gives it that file of
> filenames, no "crawling" this time, and it just
> downloads the whole bunch.
Once you have the file of URLs to download, you could put "lynx -dump"
in front of each one and append "> filename" at the end of the line to
redirect stdout into a file. You would want -source instead of -dump
for non-html documents, btw.
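As a rough sketch of that editing step, a few lines of Python could build the command lines from the URL list. The helper name lynx_command is hypothetical, and guessing "html or not" from the file suffix is an assumption of this sketch, not something lynx does for you.

```python
import shlex
from urllib.parse import urlsplit

# Assumption: a URL is treated as html if its path ends in one of
# these suffixes or in a slash. Anything else is fetched with -source.
HTML_SUFFIXES = (".html", ".htm")

def lynx_command(url, outfile):
    """Return one shell command line that saves url into outfile."""
    path = urlsplit(url).path
    is_html = (path == "" or path.endswith("/")
               or path.lower().endswith(HTML_SUFFIXES))
    flag = "-dump" if is_html else "-source"
    # shlex.quote protects characters such as & and ? from the shell.
    return "lynx {} {} > {}".format(
        flag, shlex.quote(url), shlex.quote(outfile))

if __name__ == "__main__":
    print(lynx_command("http://example.org/a.html", "a.txt"))
    print(lynx_command("http://example.org/paper.ps", "paper.ps"))
```

The printed lines can be collected into a shell script and reviewed by hand before anything is actually downloaded.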
> I believe this feature would be useful also for people
> who had simply built up a list of things they wanted
> downloaded, and wanted to "automate" it.
I agree.
> Note: would need settable option to say "die on error",
> vs just go get next file in list, if link not there.
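If each file were fetched by its own invocation of lynx, the script
would decide this: a failed fetch would not stop the commands after it
unless the script checked the exit status. I don't know if there would
be a significant savings in any resource by building this into lynx as
a -multiple_dump feature.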
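To illustrate the "die on error" versus "go get the next file" choice with one command per URL, here is a small Python sketch. The names run_all, die_on_error and runner are invented for this example; it assumes a failed fetch shows up as a nonzero exit status from the command.

```python
import subprocess

def run_all(commands, die_on_error=False, runner=None):
    """Run shell command lines one at a time and return those that failed.

    runner is a hook that takes a command string and returns its exit
    status; by default the command is handed to the shell.
    """
    if runner is None:
        runner = lambda cmd: subprocess.call(cmd, shell=True)
    failed = []
    for cmd in commands:
        status = runner(cmd)
        if status != 0:
            failed.append(cmd)
            if die_on_error:
                # Stop at the first failure instead of continuing.
                break
    return failed

if __name__ == "__main__":
    # Stand-in runner so the example does not need lynx or a network.
    fake = lambda cmd: 1 if "missing" in cmd else 0
    cmds = ["fetch one", "fetch missing", "fetch three"]
    print(run_all(cmds, runner=fake))
    print(run_all(cmds, die_on_error=True, runner=fake))
```

With the default setting every command is tried and the failures are reported at the end; with die_on_error=True the loop stops at the first one.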
> 3: monkeywrench tossed into machine:
>
> Suppose there were two "README" files, from different
> parts of the foreign tree. What then?
You could deal with this when you edit the script.
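If the list is long, picking distinct output names by hand gets tedious. One possible approach, sketched in Python below, is to number the repeats. The function name output_names and the ".2", ".3" suffix scheme are assumptions made for this example only.

```python
from urllib.parse import urlsplit

def output_names(urls):
    """Map each URL to a local file name, numbering repeated names.

    The first README keeps its name, the second becomes README.2, and
    so on. This sketch does not guard against a remote file that is
    itself literally named README.2.
    """
    seen = {}
    names = {}
    for url in urls:
        path = urlsplit(url).path.rstrip("/")
        # Fall back to "index" when the URL has no file component.
        base = path.rsplit("/", 1)[-1] or "index"
        count = seen.get(base, 0) + 1
        seen[base] = count
        names[url] = base if count == 1 else "{}.{}".format(base, count)
    return names

if __name__ == "__main__":
    urls = [
        "http://example.org/a/README",
        "http://example.org/b/README",
    ]
    for url, name in output_names(urls).items():
        print(url, "->", name)
```

Using part of the remote directory path in the local name would be another way to keep the two READMEs apart.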
> But no such luck -- so we are FORCED to do these
> infinite downloads, IF we want to read on paper, sitting
> in chair, lying in bed, or at desk, marking up with pencil.
I definitely agree that sometimes it is easier to read a hardcopy than an
on-line copy.
Joe
-------
Joseph Kincaid | Mathematics is the alphabet with which God
address@hidden | has written the universe. -- Galileo
KSU - Mathematics Department | (except he said it in Italian, of course.)