Re: LYNX-DEV How to work around special characters?
From: Chris Maden
Subject: Re: LYNX-DEV How to work around special characters?
Date: Wed, 17 Sep 1997 16:33:49 -0400
[Philip Webb]
> 970916 Palmer King wrote:
> > How would one get the following to be accepted by lynx w/o an error:
> > lynxexec:/bin/tin alt.music.+live+
> > Lynx doesn't like the "+" signs at all, and I don't know HTML or Lynx
> > well enough to come up with a creative alternative.
>
> try alt.music.&#43;live&#43; : ie &#n; , where n is the ASCII code.
No. This is how to encode characters *meaningful to HTML* *in an HTML
document*. The plus-sign is not meaningful to HTML, nor is it being
used in an HTML document.
It *is* being used in a URL, and the plus-sign *is* meaningful to
URLs. The way to protect a character in a URL is to use the percent
sign and the hex value:
lynxexec:/bin/tin%20alt.music.%2Blive%2B
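If you would rather not look up hex values by hand, most languages ship a
helper for this. A minimal sketch in Python, assuming the standard
urllib.parse module is available (the variable names are just for
illustration):

```python
from urllib.parse import quote, unquote

group = "alt.music.+live+"

# safe="" asks quote() to percent-encode every reserved character,
# including "+" and "/"; letters, digits and "_.-~" are left alone.
escaped = quote(group, safe="")
print(escaped)           # alt.music.%2Blive%2B

# A space becomes %20, as in the lynxexec URL above.
print(quote(" "))        # %20

# Decoding gets the original name back.
print(unquote(escaped))  # alt.music.+live+
```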
(What bozo put plus signs in a newsgroup's name?)
This is a *very* common source of confusion on the Web. When having
trouble with a character, ask yourself, "Who will eat this character?"
HTML will eat ampersands (always), and quotes (in quoted strings, like
attribute values).
HTTP will eat lots of characters - spaces, plus signs, percent signs,
tildes, pipes, and just about anything non-alphanumeric in a query
string.
You need to use the right mechanism to protect the character from the
thing that wants to eat it.
To protect things from HTML, use HTML entity references: &amp;,
&quot;.
To protect things from HTTP, use HTTP hex-escaping: %20, %2B, etc.
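To see the two mechanisms side by side, here is a small sketch in Python
using the standard html and urllib.parse modules. The sample string is
made up for illustration:

```python
import html
from urllib.parse import quote

text = 'Tom & "Jerry"'  # made-up sample containing both kinds of trouble

# Protect it from an HTML parser: entity references.
print(html.escape(text, quote=True))  # Tom &amp; &quot;Jerry&quot;

# Protect it from URL parsing: percent sign plus hex value.
print(quote(text, safe=""))           # Tom%20%26%20%22Jerry%22
```

Same input, two different escapings, because two different things want
to eat it.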
And you need to protect characters in reverse order of how they might
get eaten. A URL in an attribute string of an HTML document will
first be parsed by an HTML parser, and then be fed to a Web server.
So protect the URL characters first, with percent signs; then, protect
HTML-endangered characters with entity references. For instance, if
you have an ampersand in the *data* of an HTTP query, you need to
*hex-escape* it as %26 - making it say &amp; doesn't help, since the
HTML parser will decode that before sending it along, and your query
will break.
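The ordering can be checked with a rough simulation. This sketch is in
Python; the host example.invalid, the parameter names q and lang, and the
helper server_sees() are all hypothetical, and html.unescape plus
parse_qs stand in for a real HTML parser and a real server:

```python
import html
from urllib.parse import parse_qs, quote, urlsplit

value = "R&B"                           # query data containing an ampersand
base = "http://example.invalid/search"  # placeholder host, not a real service

# Right: percent-encode the data first, then entity-escape the whole URL
# for use inside an HTML attribute.
url = base + "?q=" + quote(value, safe="") + "&lang=en"
attr = html.escape(url, quote=True)
print(attr)  # http://example.invalid/search?q=R%26B&amp;lang=en

# Wrong: only an entity reference on the data, no percent-encoding.
wrong_attr = base + "?q=" + html.escape(value) + "&amp;lang=en"

def server_sees(attribute_value):
    """Hypothetical helper: undo the HTML layer, then parse the query."""
    decoded_url = html.unescape(attribute_value)
    return parse_qs(urlsplit(decoded_url).query)

print(server_sees(attr))        # {'q': ['R&B'], 'lang': ['en']}
print(server_sees(wrong_attr))  # {'q': ['R'], 'lang': ['en']}
```

In the second case the HTML layer turns the entity back into a bare
ampersand, which the query parser then treats as a separator, so the
data is cut short.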
-Chris
--
<!NOTATION SGML.Geek PUBLIC "-//Anonymous//NOTATION SGML Geek//EN">
<!ENTITY crism PUBLIC "-//O'Reilly//NONSGML Christopher R. Maden//EN"
"<URL>http://www.oreilly.com/people/staff/crism/ <TEL>+1.617.499.7487
<USMAIL>90 Sherman Street, Cambridge, MA 02140 USA" NDATA SGML.Geek>