emacs-devel

Re: url-retrieve-synchronously and coding


From: Julien Danjou
Subject: Re: url-retrieve-synchronously and coding
Date: Mon, 24 Jan 2011 16:11:29 +0100
User-agent: Gnus/5.110011 (No Gnus v0.11) Emacs/24.0.50 (gnu/linux)

On Mon, Jan 24 2011, Lennart Borgman wrote:

> Ok, thanks. It is not easy to navigate among those functions. But I
> guess we have said before that better documentation is needed.
>
> Unfortunately url-insert-file-contents does not decode the file as
> utf-8. mm-dissect-buffer looks for the charset, but only in the mime
> headers. In this case the charset is specified instead in the xml
> content.
>
> I do not know how the retrieved content above should be handled. It
> looks, however, like web browsers handle this case and show the
> xml content correctly.

Probably because your browser understands XML. Firefox seems to.

> It seems natural in a case like this where Content-Type is text/xml to
> look for the specified charset in the xml content. I think
> `url-insert' should do this. Here is a suggestion for how to do it
> where I just have added a search for <?xml encoding=...>:

Damn no, I don't think *url*-insert should parse XML, or you'll end up
parsing a lot of file types. This is not what url is about.

What you need is another layer on top of mm (or enhance mm) with
something like this:

#+begin_src emacs-lisp
(defvar mm-decoder-helper-functions
  '(("text/xml" . mm-decoder-xml-helper))
  "Alist mapping a content type to a function that finds its encoding.")

(defun mm-decoder-xml-helper (string-or-buffer)
  "Return the encoding of an XML document.
This function reads an XML string or a buffer containing XML (which
one depends on the API you choose to implement) and returns its
encoding."
  ...)

(defun mm-decoder-please-decode-this (content content-type
                                      &optional content-encoding)
  "Decode CONTENT based on CONTENT-TYPE and possibly CONTENT-ENCODING.
Use CONTENT-ENCODING if provided; otherwise use a helper from
`mm-decoder-helper-functions' to find the right encoding based on
CONTENT-TYPE."
  ...)
#+end_src


That is just a raw idea. Feel free to enhance. :)
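
To make the idea a bit more concrete, here is a minimal sketch of how
the two functions might be filled in. It is only an illustration under
a few assumptions: the `my-' names are hypothetical and are not part of
mm, the helper takes a string rather than a buffer, it only looks at an
XML declaration at the very start of the content, and the result is
assumed to be usable as an Emacs coding system name.

#+begin_src emacs-lisp
;; Hypothetical names: nothing here exists in mm or url.
(defvar my-mm-decoder-helper-functions
  '(("text/xml" . my-mm-decoder-xml-helper))
  "Alist mapping a content type to a function returning a coding system.")

(defun my-mm-decoder-xml-helper (string)
  "Return the coding system named in the XML declaration of STRING.
Return nil if there is no encoding declaration, or if it names an
encoding that Emacs does not know as a coding system."
  (when (string-match
         "\\`[ \t\r\n]*<\\?xml[^>]*?encoding=[\"']\\([^\"']+\\)[\"']"
         string)
    (let ((coding (intern (downcase (match-string 1 string)))))
      (and (coding-system-p coding) coding))))

(defun my-mm-decode (content content-type &optional content-encoding)
  "Decode the undecoded string CONTENT of type CONTENT-TYPE.
Use CONTENT-ENCODING (a coding system) if given; otherwise ask the
helper registered for CONTENT-TYPE, and fall back to `undecided'."
  (let* ((helper (cdr (assoc content-type my-mm-decoder-helper-functions)))
         (coding (or content-encoding
                     (and helper (funcall helper content))
                     'undecided)))
    (decode-coding-string content coding)))

;; Usage: the charset comes from the XML declaration, not from a header.
(my-mm-decode
 (encode-coding-string
  "<?xml version=\"1.0\" encoding=\"utf-8\"?><a>\u00e9</a>" 'utf-8)
 "text/xml")
#+end_src

Keeping the per-type helpers in an alist means url never has to know
about XML; other content types could register their own helper in the
same way.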

-- 
Julien Danjou
❱ http://julien.danjou.info
