From: Jochen Roderburg
Subject: Re: [Bug-wget] New to this, large files constraints?
Date: Sat, 17 Sep 2011 14:09:54 +0200
User-agent: Internet Messaging Program (IMP) H3 (4.3.7)
Quoting Jochen Roderburg <address@hidden>:

Quoting Jochen Roderburg <address@hidden>:

This is really an "interesting" problem:

http://socds.huduser.org/permits/output_monthly_csv.odb?outpref=csv&geoval=state&datatype=monthlyF&varlist=1%232%233&yearlist=2000%232001%232002%232003%232004%232005%232006%232007%232008%232009%232010&statelist=13%2337%2345&msalist=+&cbsalist=+&bppllist=+&cntylist=13033%2313073%2313189%2313245%2337007%2337025%2337071%2337119%2337179%2345001%2345003%2345005%2345007%2345009%2345011%2345013%2345015%2345017%2345019%2345021%2345023%2345025%2345027%2345029%2345031%2345033%2345035%2345037%2345039%2345041%2345043%2345045%2345047%2345049%2345051%2345053%2345055%2345057%2345059%2345061%2345063%2345067%2345069%2345065%2345071%2345073%2345075%2345077%2345079%2345081%2345083%2345085%2345087%2345089%2345091&COUNTYSUM=YES&COUNTYALL=+&COUNTYGRP=+&STATESUM=+&STATEALL=+&METROSUM=+&METROALL=+&METRO=+&CBSA=+&PLACEGRP=+&CSUMNAME=&JSUMNAME=+&geo=state&chron=monthlyF

On Windows, older versions of wget may give the error message "Result too large", but what it means is that the file name is too long. On Linux the message is "File name too long". And wget 1.13 --trust-server-names doesn't work with this site's response; should it?

Well, in theory it should work with "--content-disposition=on", as the web application sends a Content-Disposition header with a filename:

---response begin---
HTTP/1.1 200 OK
Content-Type: application/vnd.ms-excel
Server: Microsoft-IIS/6.0
Content-Disposition: attachment; filename=BuildingPermits.csv;
X-Powered-By: ASP.NET
Date: Sat, 17 Sep 2011 05:58:06 GMT
Connection: close
---response end---

... but wget seems to bail out with the overlong filename *before* it reads the response headers.

After further examination I must retract the "before" assumption. The debug output shows the GET response headers with Content-Disposition, and the error message comes after them, so it looks more as if the Content-Disposition is simply ignored for some unknown reason.
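For reference, here is a minimal Python sketch of what a client can extract from the Content-Disposition value in the response above. It is not wget's own parsing code; the helper name filename_from_content_disposition is made up for illustration, and the header value is copied from the quoted response, trailing semicolon included.

```python
from email.message import Message


def filename_from_content_disposition(value):
    """Return the filename parameter of a Content-Disposition value, or None."""
    msg = Message()
    # Reuse the standard library's MIME parameter parsing instead of
    # splitting the header by hand.
    msg["Content-Disposition"] = value
    return msg.get_filename()


# Header value as sent by the server in the response quoted above.
header = "attachment; filename=BuildingPermits.csv;"
print(filename_from_content_disposition(header))
```

With this value the helper returns the short name BuildingPermits.csv, which is the name a client would be expected to save under when it honours the header, instead of one derived from the very long query string.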
Sorry for the noise. As so often, the whole truth is more complicated, and one has to test very carefully to avoid all side effects.
New result: it works fine, as expected, with wget default options plus --content-disposition=on.
It does not work, however, when the option --timestamping is added. (That option makes no sense, of course, for this type of dynamically generated output, but I have it set as my default, and somehow it also crept into my tests although I tried to avoid it ;-).
FWIW, in this case I see the following sequence in the debug output:

wget does a HEAD request first and gets a "standard" response *without* Content-Disposition.
Then it makes a GET and gets the Content-Disposition. And in this situation it seems to ignore this.

Best regards,
Jochen Roderburg
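The HEAD/GET difference described above can be reproduced without the real site. The following Python sketch is only an illustration under assumptions: the local test server, the /permits path, the placeholder body and the names Handler, probe and run_probe are all hypothetical, and nothing here is taken from wget's source. The server answers HEAD without Content-Disposition and GET with it, and the client prints what each request returns.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

BODY = b"placeholder\n"  # stand-in payload, not the real CSV
DISPOSITION = "attachment; filename=BuildingPermits.csv;"


class Handler(BaseHTTPRequestHandler):
    """Hypothetical server: Content-Disposition appears on GET only."""

    def _send_headers(self, with_disposition):
        self.send_response(200)
        self.send_header("Content-Type", "application/vnd.ms-excel")
        if with_disposition:
            self.send_header("Content-Disposition", DISPOSITION)
        self.send_header("Content-Length", str(len(BODY)))
        self.end_headers()

    def do_HEAD(self):
        self._send_headers(with_disposition=False)

    def do_GET(self):
        self._send_headers(with_disposition=True)
        self.wfile.write(BODY)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def probe(method, port):
    """Send one request and return its Content-Disposition header (or None)."""
    conn = http.client.HTTPConnection("127.0.0.1", port)
    conn.request(method, "/permits")
    resp = conn.getresponse()
    resp.read()
    value = resp.getheader("Content-Disposition")
    conn.close()
    return value


def run_probe():
    """Start the local server, issue HEAD then GET, and return both results."""
    server = HTTPServer(("127.0.0.1", 0), Handler)  # port 0: pick a free port
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return probe("HEAD", port), probe("GET", port)
    finally:
        server.shutdown()
        server.server_close()


head_cd, get_cd = run_probe()
print("HEAD:", head_cd)
print("GET: ", get_cd)
```

Pointing wget at such a server with and without --timestamping would be one way to narrow down whether the filename decision is already fixed after the HEAD response; that is a guess about where to look, not a confirmed diagnosis.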