Re: [Bug-wget] Help: 'wget --page-requisites' is slow
From: David Bodin
Subject: Re: [Bug-wget] Help: 'wget --page-requisites' is slow
Date: Sat, 15 Jun 2019 17:01:22 -0700
Sorry, just wanted to clarify that the time difference I'm seeing is a
direct comparison between the wget command and a hard refresh of the
webpage, so no caching to assist with page request times.
On Sat, Jun 15, 2019 at 3:49 PM David Bodin <address@hidden> wrote:
> Hello wget community,
>
> *Goal*
> My goal is to download a single webpage to be fully functional offline in
> the same time it takes a browser to request and show the page.
>
> *Problem*
> The following command downloads a page and makes it fully functional
> offline, but it takes approximately 35 seconds, whereas the browser
> requests and shows the page in about 5 seconds. Can someone please help me
> understand why my *wget* command is taking *so much longer* and how I can
> make it faster? Or are there any other places or chat groups where I can
> seek help? Sincere thanks in advance for any help anyone can provide.
>
> *wget --page-requisites --span-hosts --convert-links --adjust-extension
> --execute robots=off --user-agent Mozilla --random-wait
> https://www.invisionapp.com/inside-design/essential-steps-designing-empathy/*
>
> *More info & attempted solutions*
>
> 1. I removed '*--random-wait*' because I thought it might be adding
> time for each file request, but this did nothing.
> 2. I thought the https protocol might slow it down with extra calls
> back and forth for each file so I added '*--no-check-certificate*',
> but this did nothing.
> 3. I read there could be an issue with IPv6 so I added '*--inet4-only*',
> but this did nothing.
> 4. I read that DNS could slow things down so I added '*--no-dns-cache*',
> but this did nothing.
> 5. I thought perhaps *wget* was downloading the assets sequentially,
> one at a time, so I tried running multiple commands concurrently
> (between 3 and 16 threads/processes), removing '*--convert-links*' and
> adding '*--no-clobber*' in the hope that multiple files would be
> downloaded at the same time. I hoped that, once all files were
> downloaded, I could run the command again, removing '*--no-clobber*' and
> '*--page-requisites*' and adding '*--convert-links*', to make the page
> fully functional offline, but this did nothing. I also thought that
> multiple threads would speed things up by overlapping the latency of the
> HTTPS checks, but I didn't observe this.
> 6. I read an article about running the command as root user in case
> there were any limits on a given user, but this did nothing.
>
> Sincere thanks in advance, again,
> Dave
>
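For reference, the idea behind attempt 5 (fetching several assets at the same time rather than one after another) can be sketched outside of wget. This is only a minimal Python illustration under stated assumptions, not a description of how wget or a browser works: it assumes the list of asset URLs is already known, it does not parse HTML or rewrite links, and the names `fetch`, `fetch_all`, and the default of 8 workers are hypothetical choices made for the example.

```python
import concurrent.futures
import urllib.request


def fetch(url, timeout=30):
    """Download one URL and return its body as bytes."""
    # The "Mozilla" user agent mirrors the --user-agent value in the command above.
    request = urllib.request.Request(url, headers={"User-Agent": "Mozilla"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


def fetch_all(urls, max_workers=8, fetch_one=fetch):
    """Fetch all URLs concurrently.

    Returns a dict mapping each URL to its bytes, or to the exception
    raised while fetching it, so one failed asset does not stop the rest.
    """
    results = {}
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit every download up front so they overlap in time.
        futures = {pool.submit(fetch_one, url): url for url in urls}
        for future in concurrent.futures.as_completed(futures):
            url = futures[future]
            try:
                results[url] = future.result()
            except Exception as error:  # record the failure and keep going
                results[url] = error
    return results
```

Timing `fetch_all(asset_urls)` against a plain loop over the same URLs would show whether per-request latency, rather than bandwidth, accounts for the gap described above; `asset_urls` is a placeholder for whatever list of requisites the page needs.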