emacs-devel

Re: address@hidden: [patch] url-hexify-string does not follow W3C spec]


From: Thien-Thi Nguyen
Subject: Re: address@hidden: [patch] url-hexify-string does not follow W3C spec]
Date: 01 Aug 2006 10:47:07 -0400
User-agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.4

YAMAMOTO Mitsuharu <address@hidden> writes:

>   [review]

thanks, that was very pleasant to read.

>   * Rev 1.14
>     The argument is assumed to be either a sequence of characters or a
>     sequence of octets depending on the multibyteness of the string.
>     Incompatibility still remains for a multibyte string containing
>     eight-bit-control or eight-bit-graphic, but usually negligible.
> 
> I'm not sure if encoding with UTF-8 is really useful, but I don't
> strongly oppose it if compatibility for the unibyte case is preserved.

conversion to utf-8 is per the RFC, which seems to be the primary context for
this function; avoiding that conversion means noncompliance w/ the RFC.

i think rev 1.14 is almost ok; anything that deviates from the RFC should be
under user control (via optional arg) and should be documented.  i assume that
(a) conversion of multibyte utf-8 is unconditionally desirable (a "negligible"
problem is no problem), and (b) that there exist non utf-8 unibyte encodings
that callers wish to "hexify as is".  please correct me if these
assumptions do not hold.  on the other hand, if they do hold, how about:

(defun ... (string &optional unibyte-as-is-p)
   ...
   (if (or (multibyte-string-p string)
           (not unibyte-as-is-p))
       (encode-coding-string string 'utf-8 t)
     string)
   ...)

?

this way, RFC-compliance is the default, but suppressing the conversion to
utf-8 is still possible for unibyte strings by specifying UNIBYTE-AS-IS-P.
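
to make the elided parts concrete, here is a minimal, self-contained sketch of
the idea.  it is not the actual url-util.el definition: the name
`my-hexify-string' is hypothetical, and the set of characters left unescaped
(letters, digits, and - . _ ~, as in RFC 3986) is an assumption made for
illustration.

```elisp
;; Illustrative sketch only: `my-hexify-string' is a hypothetical name,
;; not the definition in url-util.el.
(defun my-hexify-string (string &optional unibyte-as-is-p)
  "Return STRING with each octet outside the unreserved set written as %XX.
A multibyte STRING is always encoded as UTF-8 first.  A unibyte STRING
is also passed through `encode-coding-string', unless UNIBYTE-AS-IS-P
is non-nil, in which case its octets are hexified unchanged."
  (let ((octets (if (or (multibyte-string-p string)
                        (not unibyte-as-is-p))
                    (encode-coding-string string 'utf-8 t)
                  string)))
    (mapconcat
     (lambda (octet)
       ;; Assumption: the unreserved set of RFC 3986.
       (if (or (<= ?a octet ?z)
               (<= ?A octet ?Z)
               (<= ?0 octet ?9)
               (memq octet '(?- ?. ?_ ?~)))
           (char-to-string octet)
         (format "%%%02X" octet)))
     octets "")))

;; A multibyte string is converted to UTF-8 before hexifying:
(my-hexify-string "caf\u00e9")   ; => "caf%C3%A9"
;; A unibyte string with UNIBYTE-AS-IS-P set keeps its octets:
(my-hexify-string "caf\351" t)   ; => "caf%E9"
```

what `encode-coding-string' does to a unibyte string holding non-ASCII octets
may differ between Emacs versions, so the sketch shows no result for that case.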

thi
