Re: LYNX-DEV How to work around special characters?


From: Chris Maden
Subject: Re: LYNX-DEV How to work around special characters?
Date: Wed, 17 Sep 1997 16:33:49 -0400

[Philip Webb]
> 970916 Palmer King wrote: 
> > How would one get the following to be accepted by lynx w/o an error:
> >  lynxexec:/bin/tin alt.music.+live+
> > Lynx doesn't like the "+" signs at all, and I don't know HTML or Lynx
> > well enough to come up with a creative alternative.
> 
> try  alt.music.&#43;live&#43; : ie  &#n; , where  n  is the ASCII code.

No.  This is how to encode characters *meaningful to HTML* *in an HTML
document*.  The plus-sign is not meaningful to HTML, nor is it being
used in an HTML document.

It *is* being used in a URL, and the plus-sign *is* meaningful to
URLs.  The way to protect a character in a URL is to use the percent
sign and the hex value:

lynxexec:/bin/tin%20alt.music.%2Blive%2B
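
For anyone who would rather not look up hex values by hand, here is a
minimal Python sketch that produces the same URL.  The helper name
lynxexec_url is made up for illustration and is not part of Lynx; the
only real calls are quote() and unquote() from the standard library.

```python
from urllib.parse import quote, unquote

# Hypothetical helper (not a Lynx API): build a lynxexec: URL
# from a plain command line.
def lynxexec_url(command_line: str) -> str:
    # quote() leaves letters, digits, "_.-~" and the characters in
    # `safe` untouched and percent-encodes everything else,
    # including the space (%20) and the plus sign (%2B).
    return "lynxexec:" + quote(command_line, safe="/")

url = lynxexec_url("/bin/tin alt.music.+live+")
print(url)           # lynxexec:/bin/tin%20alt.music.%2Blive%2B
print(unquote(url))  # lynxexec:/bin/tin alt.music.+live+
```

Whether a given Lynx build then agrees to run the command is a
separate matter of its lynxexec configuration; the sketch only covers
the escaping.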

(What bozo put plus signs in a newsgroup's name?)

This is a *very* common source of confusion on the Web.  When having
trouble with a character, ask yourself, "Who will eat this character?"

HTML will eat ampersands (always), and quotes (in quoted strings, like
attribute values).

HTTP will eat lots of characters - spaces, plus signs, percent signs,
tildes, pipes, and just about anything non-alphanumeric in a query
string.

You need to use the right mechanism to protect the character from the
thing that wants to eat it.

To protect things from HTML, use HTML entity references: &amp;,
&quot;.

To protect things from HTTP, use HTTP hex-escaping: %20, %2B, etc.
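
As a rough illustration of the two mechanisms side by side, here is a
small Python sketch.  The sample strings are invented; html.escape()
and urllib.parse.quote() are standard-library functions.

```python
from html import escape
from urllib.parse import quote

# Protecting text from the HTML parser: entity references.
text = 'Tom & Jerry say "hi"'
html_safe = escape(text, quote=True)
print(html_safe)  # Tom &amp; Jerry say &quot;hi&quot;

# Protecting text from the URL parser: percent (hex) escapes.
# safe="" asks quote() to escape "/" as well.
url_safe = quote("a b+c", safe="")
print(url_safe)   # a%20b%2Bc
```

Note that neither function does the other's job: escape() leaves the
space and the plus sign alone, and quote() knows nothing about
entities.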

And you need to protect characters in reverse order of how they might
get eaten.  A URL in an attribute string of an HTML document will
first be parsed by an HTML parser, and then be fed to a Web server.
So protect the URL characters first, with percent signs; then, protect
HTML-endangered characters with entity references.  For instance, if
you have an ampersand in the *data* of an HTTP query, you need to
*hex-escape* it (as %26) - making it say &amp; doesn't help, since the HTML
parser will decode that before sending it along, and your query will
break.
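
The ordering can be sketched in Python as follows.  The host
example.com, the parameter names q and n, and the value "AT&T" are
all assumptions chosen for the demonstration; unescape() stands in
for the HTML parser and parse_qs() for the server's query parsing.

```python
from html import escape, unescape
from urllib.parse import quote, urlsplit, parse_qs

value = "AT&T"  # data containing an ampersand

# Step 1: protect the data from the URL layer (percent-escape).
url = "http://example.com/search?q=" + quote(value, safe="") + "&n=10"

# Step 2: protect the finished URL from the HTML layer (entities).
attr = escape(url, quote=True)
link = '<a href="' + attr + '">search</a>'
print(attr)  # http://example.com/search?q=AT%26T&amp;n=10

# Decoding happens in the opposite order: HTML first, then the URL.
good = parse_qs(urlsplit(unescape(attr)).query)
print(good)  # {'q': ['AT&T'], 'n': ['10']}

# The mistake: using an entity where a percent-escape was needed.
wrong = "http://example.com/search?q=" + escape(value) + "&n=10"
bad = parse_qs(urlsplit(unescape(wrong)).query)
print(bad)   # {'q': ['AT'], 'n': ['10']}
```

In the broken case the HTML layer turns &amp; back into a bare
ampersand, so the query is split in the middle of the value and q
arrives truncated.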

-Chris
-- 
<!NOTATION SGML.Geek PUBLIC "-//Anonymous//NOTATION SGML Geek//EN">
<!ENTITY crism PUBLIC "-//O'Reilly//NONSGML Christopher R. Maden//EN"
"<URL>http://www.oreilly.com/people/staff/crism/ <TEL>+1.617.499.7487
<USMAIL>90 Sherman Street, Cambridge, MA 02140 USA" NDATA SGML.Geek>