Re: [Bug-wget] Help: 'wget --page-requisites' is slow


From: David Bodin
Subject: Re: [Bug-wget] Help: 'wget --page-requisites' is slow
Date: Sat, 15 Jun 2019 17:01:22 -0700

Sorry, just wanted to clarify that the time difference I'm seeing is a
direct comparison between the wget command and a hard refresh of the
webpage, so no caching is helping with page request times.

On Sat, Jun 15, 2019 at 3:49 PM David Bodin <address@hidden> wrote:

> Hello wget community,
>
> *Goal*
> My goal is to download a single webpage so that it is fully functional
> offline, in roughly the time it takes a browser to request and render the
> page.
>
> *Problem*
> The following command downloads a page and makes it fully functional
> offline, but it takes approximately 35 seconds, whereas the browser
> requests and renders the page in about 5 seconds. Can someone please help
> me understand why my *wget* command takes *so much longer* and how I can
> make it faster? Or are there any forums or chat groups where I can seek
> help? Sincere thanks in advance for any help anyone can provide.
>
> *wget --page-requisites --span-hosts --convert-links --adjust-extension
> --execute robots=off --user-agent Mozilla --random-wait
> https://www.invisionapp.com/inside-design/essential-steps-designing-empathy/*
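>
> For what it's worth, here is a minimal way to break down where the time
> goes (a sketch; 'wget.log' is just an arbitrary name I picked). wget's
> default verbose output stamps each request with a '--date--  URL' line,
> so the log shows which requests are slow:
>
> time wget --page-requisites --span-hosts --convert-links \
>   --adjust-extension --execute robots=off --user-agent Mozilla \
>   --random-wait --output-file=wget.log \
>   https://www.invisionapp.com/inside-design/essential-steps-designing-empathy/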
>
> *More info & attempted solutions*
>
>    1. I removed '*--random-wait*' because I thought it might be adding
>    time for each file request, but this did nothing.
>    2. I thought the https protocol might slow it down with extra calls
>    back and forth for each file, so I added '*--no-check-certificate*',
>    but this did nothing.
>    3. I read there could be an issue with IPv6, so I added '*--inet4-only*',
>    but this did nothing.
>    4. I read DNS could slow things down, so I added '*--no-dns-cache*',
>    but this did nothing.
>    5. I thought perhaps *wget* was downloading the assets sequentially,
>    one at a time, so I tried running between 3 and 16 concurrent
>    processes: I removed '*--convert-links*' and added '*--no-clobber*'
>    in the hope that multiple files would be downloaded at the same time,
>    and once everything was downloaded I ran the command again with
>    '*--no-clobber*' and '*--page-requisites*' removed and
>    '*--convert-links*' added to make the page fully functional offline
>    (see the sketch after this list). But this did nothing. I also
>    thought multiple processes would hide the latency of the https
>    handshakes by overlapping them, but I didn't observe any speedup.
>    6. I read an article about running the command as the root user in
>    case there were any per-user limits, but this did nothing.
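>
> Concretely, the two-pass attempt in step 5 looked roughly like this (a
> sketch rather than an exact transcript; I varied the process count
> between 3 and 16):
>
> URL=https://www.invisionapp.com/inside-design/essential-steps-designing-empathy/
>
> # pass 1: several processes race over the same tree; --no-clobber
> # keeps them from re-fetching files another process already saved
> for i in 1 2 3 4; do
>   wget --page-requisites --span-hosts --adjust-extension \
>     --execute robots=off --user-agent Mozilla --no-clobber "$URL" &
> done
> wait
>
> # pass 2: one more run to rewrite the links for offline use
> wget --span-hosts --adjust-extension --convert-links \
>   --execute robots=off --user-agent Mozilla "$URL"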
>
> Sincere thanks in advance, again,
> Dave
>

