Re: lynx-dev Escaping URLs on the command line?
From: Nelson H. F. Beebe
Subject: Re: lynx-dev Escaping URLs on the command line?
Date: Wed, 24 Apr 2002 06:10:21 -0600 (MDT)
Walter Ian Kaye <address@hidden> writes on 24 Apr
2002 00:50:01 -0700 with a request about how to escape special
characters in URLs.
The easy thing to do is simply to represent each problem character as
a percent sign followed by two uppercase hexadecimal digits, e.g., %C0
for character 192 decimal. This is permitted anywhere in a URL.
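
As a rough illustration of that escaping, here is a minimal Python
sketch. It assumes the characters are to be encoded with their ISO
Latin 1 codes; the function name escape_latin1, the choice of "safe"
characters, and the example.org URL are hypothetical and not part of
the original question.

```python
from urllib.parse import quote

def escape_latin1(text, safe="/:?=&"):
    """Percent-encode text using ISO Latin 1 byte values.

    Characters listed in `safe` (an assumption chosen for this sketch),
    plus ASCII letters, digits and "_.-~", are left unescaped.
    quote() emits uppercase hexadecimal digits.
    """
    return quote(text, safe=safe, encoding="latin-1")

# Character 192 decimal is 0xC0, so it becomes %C0.
print(escape_latin1(chr(192)))
# A space (32 decimal, 0x20) becomes %20.
print(escape_latin1("http://example.org/some file.html"))
```

When the escaped URL is handed to a program from a shell, quoting the
whole argument is still prudent, because characters such as & and ?
remain special to the shell.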
From RFC 1630, available at
ftp://ftp.internic.net/rfc/rfc1630.txt
ftp://ftp.math.utah.edu/pub/rfc/rfc1630.txt
entitled ``Universal Resource Identifiers in WWW'',
>> ...
>> ...
>> There is a conflict between the need to be able to represent many
>> characters including spaces within a URI directly, and the need to
>> be able to use a URI in environments which have limited character
>> sets or in which certain characters are prone to corruption. This
>> conflict has been resolved by use of an hexadecimal escaping
>> method which may be applied to any characters forbidden in a given
>> context. When URLs are moved between contexts, the set of
>> characters escaped may be enlarged or reduced unambiguously.
>> ...
>> CONVENTIONAL URI ENCODING SCHEME
>>
>> Where the local naming scheme uses ASCII characters which are not
>> allowed in the URI, these may be represented in the URL by a
>> percent sign "%" immediately followed by two hexadecimal digits
>> (0-9, A-F) giving the ISO Latin 1 code for that character.
>> Character codes other than those allowed by the syntax shall not
>> be used unencoded in a URI.
>>
>> REDUCED OR INCREASED SAFE CHARACTER SETS
>>
>> The same encoding method may be used for encoding characters whose
>> use, although technically allowed in a URI, would be unwise due to
>> problems of corruption by imperfect gateways or misrepresentation
>> due to the use of variant character sets, or which would simply be
>> awkward in a given environment. Because a % sign always indicates
>> an encoded character, a URI may be made "safer" simply by encoding
>> any characters considered unsafe, while leaving already encoded
>> characters still encoded. Similarly, in cases where a larger set
>> of characters is acceptable, % signs can be selectively and
>> reversibly expanded.
>> ...
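
The "reduced or increased safe character sets" idea quoted above might
be sketched as follows. This is only an illustration under stated
assumptions: make_safer and its helper are hypothetical names, the
caller supplies the set of characters considered unsafe, and every
such character is assumed to lie in the ISO Latin 1 range.

```python
import re

# Matches an existing escape: "%" followed by two hexadecimal digits.
_ESCAPE = re.compile(r"%[0-9A-Fa-f]{2}")

def _encode(chunk, unsafe):
    # Replace each unsafe character by "%" plus two uppercase hex digits.
    return "".join("%{:02X}".format(ord(c)) if c in unsafe else c
                   for c in chunk)

def make_safer(uri, unsafe):
    """Encode the characters in `unsafe`, leaving existing %XX escapes as-is."""
    out = []
    pos = 0
    for m in _ESCAPE.finditer(uri):
        out.append(_encode(uri[pos:m.start()], unsafe))  # text before the escape
        out.append(m.group(0))                           # the escape, unchanged
        pos = m.end()
    out.append(_encode(uri[pos:], unsafe))               # trailing text
    return "".join(out)

print(make_safer("a%20b&c d", "& "))
```

Because already-encoded sequences are copied through untouched, the
function can be applied again with a larger unsafe set without
double-encoding what was escaped earlier.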
-------------------------------------------------------------------------------
- Nelson H. F. Beebe Tel: +1 801 581 5254 -
- Center for Scientific Computing FAX: +1 801 585 1640, +1 801 581 4148 -
- University of Utah Internet e-mail: address@hidden -
- Department of Mathematics, 110 LCB address@hidden address@hidden -
- 155 S 1400 E RM 233 address@hidden -
- Salt Lake City, UT 84112-0090, USA URL: http://www.math.utah.edu/~beebe -
-------------------------------------------------------------------------------