[Bug-wget] Re: one question of wget
From: Micah Cowan
Subject: [Bug-wget] Re: one question of wget
Date: Wed, 07 Jan 2009 10:25:30 -0800
User-agent: Thunderbird 2.0.0.18 (X11/20081125)
wang qiang wrote:
> Hello there,
Hi. In the future, please use the list (address@hidden) for support
requests. I can't promise to answer personal-mail support requests.
> When I tested the WGet, I met a question.
>
> I used the command ./src/wget -r -l6 http://news.yahoo.com
> to get the pages, it worked well.
>
> But I use the command
>
> ./src/wget -r -l6 http://csce.uark.edu
>
> it just could get the first page i.e. index.html, and then halted.
>
> Could you please tell me how to solve this problem? I found that there
> was a "robot.txt" in the folder when retrieving from news.yahoo.com,
> but no "robot.txt" when retrieving from csce.uark.edu. Thanks,
csce.uark.edu links to many hosts other than "csce.uark.edu" itself:
www.csce.uark.edu, for example, and I think some others that host
images. By default, Wget refuses to follow links to other hosts during
a recursive retrieval, so it stops after the first page. You need to
add -H -D csce.uark.edu to follow the other links: -H allows the
recursion to span hosts, and -D restricts it to hosts in the given
domain. (Changing the requested URI to www.csce.uark.edu doesn't help
much, because there are many links to csce.uark.edu, without the www,
as well.)
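
As a rough sketch, the adjusted command could be wrapped up as below.
The function name crawl_site and the WGET variable are made up for
illustration and are not part of wget; the options are the ones from
your original command plus -H and -D, and the default binary path
assumes you are still running from the build tree.

```shell
# crawl_site: hypothetical helper, not part of wget.
# Recursively fetches http://DOMAIN to depth 6, following links to any
# host inside DOMAIN (for example both csce.uark.edu and www.csce.uark.edu).
crawl_site() {
    domain=$1
    # -r   recursive retrieval
    # -l6  recurse at most 6 levels deep
    # -H   allow the recursion to span to other hosts
    # -D   but only follow hosts that fall within this domain
    # WGET is an assumed override for the binary path; it defaults to
    # the build-tree binary used in the original command.
    "${WGET:-./src/wget}" -r -l6 -H -D "$domain" "http://$domain"
}
```

Calling "crawl_site csce.uark.edu" then runs the same retrieval as
before, but with the cross-host links inside that domain followed too.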
--
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer.
GNU Maintainer: wget, screen, teseq
http://micah.cowan.name/