Re: [Chicken-users] [Q] uri-common has problem with UTF-8 uri.
From: Peter Bex
Subject: Re: [Chicken-users] [Q] uri-common has problem with UTF-8 uri.
Date: Wed, 16 Jan 2013 20:51:48 +0100
User-agent: Mutt/1.4.2.3i
On Tue, Jan 15, 2013 at 02:44:08PM +0900, Alex Shinn wrote:
> This result looks broken. As I noted in my previous mail, the URI
> representation already handles non-ASCII characters and escapes on output:
>
> $ csi -R uri-common
> #;1> (make-uri scheme: "http" host: "127.0.0.1" path: '(/ "삼계탕"))
> #<URI-common: scheme="http" port=#f host="127.0.0.1" path=(/ "삼계탕")
> query=#f fragment=#f>
> #;2> (uri->string (make-uri scheme: "http" host: "127.0.0.1" path: '(/
> "삼계탕")))
> "http://127.0.0.1/82%BCB3%8483%95"
>
> Unrelated, the actual escaped output looks buggy - it looks like
> some characters like the leading "%EC%" are getting dropped.
OK, I took some time to investigate and pinpointed the problem.
It appears to be caused by the use of core srfi-14 and srfi-13 in
uri-generic; their char-set operations simply don't handle anything
beyond ASCII. It only works after switching to the UTF-8 versions,
utf8-srfi-14, utf8-srfi-13 and unicode-char-sets:
Without patch:
$ csi -R uri-generic -P '(uri-encode-string "삼계탕")'
"�%82%BC�%B3%84�%83%95"
With patch:
$ csi -R uri-generic -P '(uri-encode-string "삼계탕")'
"%EC%82%BC%EA%B3%84%ED%83%95"
Ivan, what do you think about adding the UTF8 dependency, as per the
attached patch (against trunk)?
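For reference, the "With patch" output above is what you get by percent-encoding every byte of the string's UTF-8 encoding. Here is a minimal sketch of that, written in Python rather than Scheme so it is independent of uri-generic and the utf8 eggs. The helper name percent_encode_utf8 is made up for illustration and is not part of any egg; it assumes only the RFC 3986 unreserved characters are left unescaped.

```python
# Illustrative only: a hypothetical helper, not the uri-generic implementation.
# RFC 3986 "unreserved" characters, which never need escaping.
UNRESERVED = (
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    b"abcdefghijklmnopqrstuvwxyz"
    b"0123456789-._~"
)


def percent_encode_utf8(text: str) -> str:
    """Percent-encode each UTF-8 byte of text that is not unreserved."""
    pieces = []
    for byte in text.encode("utf-8"):
        if byte in UNRESERVED:
            # ASCII unreserved byte: emit it literally.
            pieces.append(chr(byte))
        else:
            # Everything else, including all non-ASCII bytes: %XX, uppercase hex.
            pieces.append(f"%{byte:02X}")
    return "".join(pieces)


if __name__ == "__main__":
    print(percent_encode_utf8("삼계탕"))  # %EC%82%BC%EA%B3%84%ED%83%95
```

Each of the three Hangul syllables takes three bytes in UTF-8, hence nine escapes. In the "Without patch" output each unprintable character is followed by two escapes, which suggests that the lead byte of each triple (EC, EA, ED) is the one slipping through unescaped.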
Cheers,
Peter
--
http://sjamaan.ath.cx
uri-generic-utf8.patch
Description: Text document