Re: [Bug-wget] wget -r not working with www.archive.org
From: Micah Cowan
Subject: Re: [Bug-wget] wget -r not working with www.archive.org
Date: Mon, 26 Oct 2009 09:33:15 -0700
User-agent: Thunderbird 2.0.0.23 (X11/20090817)
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Aaron Gray wrote:
> wget does not seem to want to get from the WayBackMachine -
> http://www.archive.org stored web sites.
archive.org pages contain funny JavaScript code that allows them to work
properly in browsers, and not in Wget. IIRC, it's that they set the HTML
"base" tag so that links are retrieved from the original site (whether
it exists or not), rather than from archive.org; but the JavaScript will
then rewrite the base tag so that browser clicks retrieve them from
archive.org.
Please note that archive.org's FAQ explicitly forbids downloading local
archives via tools such as wget.
- --
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer.
Maintainer of GNU Wget and GNU Teseq
http://micah.cowan.name/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
iEYEARECAAYFAkrlz0oACgkQ7M8hyUobTrEo7gCeMs7jOb60bNXdh3ptRG/XbPbY
mQwAn0wp+jJcG8RmGO9Fcr3db6x7AMwM
=SAVy
-----END PGP SIGNATURE-----
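
The effect of the "base" tag described in the message above can be illustrated with a small sketch. It is not Wget's code and it fetches nothing; it only shows how a client that honours "base" but does not run JavaScript would resolve a relative link. The hosts archive.example and original.example, the sample markup, and the names LinkCollector and resolve_links are all hypothetical, chosen for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect <a href> values and the first <base href> found in a page."""

    def __init__(self):
        super().__init__()
        self.base_href = None
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "base" and self.base_href is None and attrs.get("href"):
            self.base_href = attrs["href"]
        elif tag == "a" and attrs.get("href"):
            self.hrefs.append(attrs["href"])


def resolve_links(page_url, html):
    """Resolve every link against <base href> if present, else the page URL."""
    parser = LinkCollector()
    parser.feed(html)
    base = urljoin(page_url, parser.base_href) if parser.base_href else page_url
    return [urljoin(base, href) for href in parser.hrefs]


# Hypothetical snapshot page (placeholder hosts, not real archive.org markup).
PAGE_URL = "http://archive.example/snapshot/page.html"
HTML = (
    '<html><head><base href="http://original.example/"></head>'
    '<body><a href="docs/index.html">Docs</a></body></html>'
)

if __name__ == "__main__":
    for link in resolve_links(PAGE_URL, HTML):
        print(link)
```

With the "base" tag present, the relative link resolves to the original host rather than to the host that served the page. A browser that runs a script rewriting "base" would resolve the same link differently, which matches the difference in behaviour described above.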