Re: url-retrieve and utf-8
From: William Xu
Subject: Re: url-retrieve and utf-8
Date: Thu, 07 Feb 2008 17:05:31 +0900
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/23.0.50 (darwin)
Stefan Monnier <monnier@iro.umontreal.ca> writes:
> I can't remember exactly, but I think it doesn't (it just returns the
> raw undecoded bytes). url-insert-file-contents should try and obey
> "Content-Type"'s charset info, tho.
Hmm, url-insert-file-contents' implementation appears to obey
"Content-Type":
,----
| ;;;###autoload
| (defun url-insert-file-contents (url &optional visit beg end replace)
|   (let ((buffer (url-retrieve-synchronously url)))
|     (if (not buffer)
|         (error "Opening input file: No such file or directory, %s" url))
|     (if visit (setq buffer-file-name url))
|     (save-excursion
|       (let* ((start (point))
|              (size-and-charset (url-insert buffer beg end)))
|         (kill-buffer buffer)
|         (when replace
|           (delete-region (point-min) start)
|           (delete-region (point) (point-max)))
|         (unless (cadr size-and-charset)
|           ;; If the headers don't specify any particular charset, use the
|           ;; usual heuristic/rules that we apply to files.
|           (decode-coding-inserted-region start (point) url visit beg end replace))
|         (list url (car size-and-charset))))))
`----
but it never seems to succeed. For example, with a header like
,----
| <meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
`----
it only picks up "text/html" and misses the "charset" value completely.
It looks like the final header detection is left to
mm-decode.el. Maybe this is mm-decode.el's fault?
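For illustration, here is a minimal sketch of the charset extraction one
would expect from such a value. The helper `my-content-type-charset' is
hypothetical; it is not part of url.el or mm-decode.el, and it only looks
at a Content-Type value string, not at a complete <meta> tag:

```elisp
;; Hypothetical helper for illustration only; not how url.el or
;; mm-decode.el actually parse the header.
(defun my-content-type-charset (content-type)
  "Return the charset named in CONTENT-TYPE as a lowercase string, or nil."
  (let ((case-fold-search t))           ; accept "Charset=" as well as "charset="
    (when (and content-type
               ;; Optional opening quote, then everything up to a quote,
               ;; a semicolon, or whitespace.
               (string-match "charset=\"?\\([^\";[:space:]]+\\)" content-type))
      (downcase (match-string 1 content-type)))))

(my-content-type-charset "text/html; charset=utf-8") ; => "utf-8"
(my-content-type-charset "text/html")                ; => nil
```

A name obtained this way could then be mapped to a coding system before
decoding the inserted region; that step is left out here.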
--
William
http://williamxu.net9.org