Re: wget2 | crawl for urls without sending any http requests? (#554)
From: @rockdaboot
Subject: Re: wget2 | crawl for urls without sending any http requests? (#554)
Date: Sun, 04 Jul 2021 18:36:27 +0000
Tim Rühsen commented:
Not sure if I get it right... you want to download a single HTML file and print
out all the URLs it contains?
Then take a look into `examples/print_html_urls.c`.
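The idea behind `examples/print_html_urls.c` (pulling link targets out of an already-downloaded HTML file, with no network traffic at all) can be sketched in Python using only the standard library. This is a rough illustration of the concept, not the libwget API the C example actually uses:

```python
from html.parser import HTMLParser

class URLExtractor(HTMLParser):
    """Collect URL-bearing attributes (href/src) from an HTML document."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs with lowercased names
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.urls.append(value)

extractor = URLExtractor()
extractor.feed('<a href="https://example.com/a">a</a><img src="/b.png">')
print(extractor.urls)  # ['https://example.com/a', '/b.png']
```

A real extractor would also look at `srcset`, `<base href>`, CSS `url()` references, and so on, which is what the libwget HTML parser handles for you.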
The point with recursive spidering is that you can't tell from the URL alone what
kind of file it is. So wget2 has to check the Content-Type, which requires a
request. If it is text/html or one of the other supported file types that
contain more URLs, that file has to be downloaded and parsed for further URLs, and
so on (recursively)...
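That fetch/check/parse loop can be sketched as follows. `fetch` and `extract_urls` here are hypothetical stand-ins (not wget2 functions), and a real spider additionally handles robots.txt, redirects, URL normalization, and parallel downloads:

```python
# Hedged sketch of the recursive spidering loop described above.
# fetch(url) -> (content_type, body) and extract_urls(body) -> list of URLs
# are assumed helpers, not part of wget2's API.

LINK_BEARING_TYPES = {"text/html", "application/xhtml+xml", "text/css"}

def crawl(start_url, fetch, extract_urls, limit=100):
    seen, queue, found = set(), [start_url], []
    while queue and len(found) < limit:
        url = queue.pop(0)
        if url in seen:
            continue
        seen.add(url)
        content_type, body = fetch(url)  # one HTTP request per URL
        found.append(url)
        # Only link-bearing types are parsed for more URLs to follow
        if content_type in LINK_BEARING_TYPES:
            queue.extend(extract_urls(body))
    return found
```

The key point matches the explanation above: the Content-Type only becomes known after `fetch`, so every discovered URL costs at least one request even if it turns out to be an image or archive.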
--
Reply to this email directly or view it on GitLab:
https://gitlab.com/gnuwget/wget2/-/issues/554#note_618223964
You're receiving this email because of your account on gitlab.com.