wget, & and lynx (was Re: lynx-dev bug report)

From: Leonid Pauzner
Subject: wget, & and lynx (was Re: lynx-dev bug report)
Date: Mon, 25 Oct 1999 22:26:08 +0400 (MSD)

25-Oct-99 11:20 Oliver Seidel wrote:

> -rw-r--r--   1 os10000  private      3689 Oct 25 11:10 
> forum_reply?keepcookie=0&lm=854576442

> <TR><TD><A HREF="../forum_reply?keepcookie=0&amp;lm=854576442">How much wood 
> would a wood chuck ch...</A><TD>1 day old<TD>(last of 9 replies 998 days 
> ago)</TR>

That makes some sense to me:

1) wget is wrong, since it fails to translate &amp; (and possibly
other entities) embedded in an HTML href= attribute. (*)

2) lynx is wrong, since it does translate entities in local file names
that come from a directory listing (not ones embedded in HTML).
The "&" symbol should be added to the HTEscapeSome() call,
along with "/", "#" and "%", in this particular directory context.
On the other hand, I have a feeling that the "&" symbol may be discouraged
in file names on some OSes.
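
To illustrate point 2, here is a minimal Python sketch of the idea. It is not lynx's code: escape_some() is a hypothetical stand-in for what an HTEscapeSome()-style call might do, and the file name is made up. It assumes the directory listing puts the file name into an href that is later parsed as HTML.

```python
from html import unescape
from urllib.parse import unquote

def escape_some(name: str, bad_chars: str = "/#%&") -> str:
    """Percent-encode only the characters listed in bad_chars (hypothetical helper)."""
    out = []
    for ch in name:
        if ch in bad_chars:
            # Encode each UTF-8 byte of the character as %XX.
            out.append("".join(f"%{b:02X}" for b in ch.encode("utf-8")))
        else:
            out.append(ch)
    return "".join(out)

# A local file whose name literally contains the five characters "&amp;".
name = "a&amp;b.html"

# Without escaping "&", entity translation changes the name to a different file.
mangled = unescape(name)

# With "&" escaped, entity translation leaves the href alone, and
# percent-decoding afterwards restores the original file name.
href = escape_some(name)
restored = unquote(unescape(href))

print(mangled)
print(href)
print(restored)
```

With "&" left out of bad_chars the round trip would no longer return the original name, which is the behaviour reported above.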

(*) Also, I know from my own experience that if someone serves a file
named "a&b.html" via HTTP and supplies a link <A href="a&amp;b.html">
from another document, wget fails to download the file,
since it actually tries to access "a&amp;b.html".

A real-life example from an Apache log file:
 - - [19/Sep/1999:10:54:15 +0200] "GET /tv-b&amp;o.htm HTTP/1.0" 404 212

404 = file does not exist, and the client here is wget.
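
What a client should do with the link from the log example can be sketched in Python. This is only an illustration of the expected behaviour, not wget's code; HrefCollector and the sample markup are made up for the example. Python's standard html.parser decodes entities in attribute values, so the collected href is the name that should be requested.

```python
from html.parser import HTMLParser

class HrefCollector(HTMLParser):
    """Collect href values from <a> tags (illustrative helper)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # Tag and attribute names arrive lowercased; attribute values
        # arrive with character and entity references already decoded.
        if tag == "a":
            for attr_name, value in attrs:
                if attr_name == "href" and value is not None:
                    self.hrefs.append(value)

page = '<A href="tv-b&amp;o.htm">B&amp;O</A>'

collector = HrefCollector()
collector.feed(page)
collector.close()

print(collector.hrefs)  # ['tv-b&o.htm']
```

The request should then be for /tv-b&o.htm, not for /tv-b&amp;o.htm as in the log line above.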

A workaround such as the HTML author rewriting the link as <a href="tv-b%26o.htm">
helps here, but obviously not with a "?" CGI query, where "&" separates the parameters
(I am sceptical about the reasons for downloading CGI queries in batch,
so this is another loss for wget, unless there is a flag I missed).
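
A small Python sketch of why percent-encoding works for a file name but not for a query string. It uses only the standard urllib.parse module; the URLs are the ones from this thread. "&" percent-encodes as %26, and inside a query an encoded "&" no longer acts as a parameter separator.

```python
from html import unescape
from urllib.parse import parse_qsl, quote, urlsplit

# File name case: encoding "&" avoids entity handling in the client entirely.
encoded_name = quote("tv-b&o.htm", safe="")
print(encoded_name)  # tv-b%26o.htm

# Query case: the href as written in the HTML, then entity-decoded,
# which is what the client is expected to do.
raw_href = "../forum_reply?keepcookie=0&amp;lm=854576442"
decoded = unescape(raw_href)
params = parse_qsl(urlsplit(decoded).query)
print(params)  # [('keepcookie', '0'), ('lm', '854576442')]

# Applying the percent-encoding workaround to the query instead
# merges the two parameters into one.
merged = parse_qsl("keepcookie=0%26lm=854576442")
print(merged)  # [('keepcookie', '0&lm=854576442')]
```

So for queries the only real fix is for the client to decode &amp; itself.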
