Re: How to get title of web page by url?

From: Lennart Borgman
Subject: Re: How to get title of web page by url?
Date: Wed, 28 Jul 2010 17:44:45 +0200

On Wed, Jul 28, 2010 at 5:34 PM, Thamer Mahmoud
<address@hidden> wrote:
> filebat Mark <address@hidden> writes:
>> Thanks, Thamer. It works.
>> Below is the code snippet.
>> Well, I still have an encoding problem.
>> To get the title of "";, the title we get is displayed as
>> unrecognizable codes.
>> I have tried to encode it, in the way of "(setq web_title_str
>> (encode-coding-string  web_title_str 'utf-8-dos))", but it fails.
> I'm also new to Elisp (well sort of).
> But here is a modified version that should handle both charsets and
> newlines (and other issues noticed by Deniz Dogan. Thanks).
> (defun www-get-page-title (url)
>  (let ((title))
>    (with-current-buffer (url-retrieve-synchronously url)
>      (goto-char (point-min))
>      (re-search-forward "<title>\\([^<]*\\)</title>" nil t 1)
>      (setq title (match-string 1))
>      (goto-char (point-min))
>      (re-search-forward "charset=\\([-0-9a-zA-Z]*\\)" nil t 1)
>      (decode-coding-string title (intern (match-string 1))))))
> The robustness of this code would still depend on whether the HTML is
> well-formed, but it should be good enough I think.

Have a look at url-copy-file to see how to do this correctly. (Or at
web-vcs-url-copy-file in nXhtml, which is a little more careful.)
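
For the quoted function itself, the main weak spots seem to be that it
signals an error when either search fails, that it only matches a
lowercase <title>, and that it interns whatever charset name it finds.
Below is a rough sketch of one way to guard against those. It is not
taken from url-copy-file. The names my-www--charset-to-coding-system
and my-www-title-from-response are made up for this example, and the
fallback to utf-8 is an assumption of mine, not something the thread
settled on.

```elisp
(require 'subr-x) ; for `string-trim'

;; Hypothetical helper: map a charset name to a coding system.
(defun my-www--charset-to-coding-system (name)
  "Return the coding system named NAME, or `utf-8' if NAME is unknown.
Falling back to `utf-8' is an assumption made for this sketch."
  ;; `intern-soft' avoids creating new symbols from untrusted page text.
  (let ((sym (and name (intern-soft (downcase name)))))
    (if (and sym (coding-system-p sym)) sym 'utf-8)))

;; Hypothetical helper: extract and decode the title from raw text.
(defun my-www-title-from-response (raw)
  "Return the decoded <title> text found in RAW, or nil if there is none.
RAW is the undecoded response text (headers and body)."
  (let ((case-fold-search t))           ; also match <TITLE> and CHARSET=
    (when (string-match "<title[^>]*>\\([^<]*\\)</title>" raw)
      (let* ((title (match-string 1 raw))
             (charset (and (string-match
                            "charset=[\"']?\\([-0-9a-zA-Z_]+\\)" raw)
                           (match-string 1 raw))))
        ;; Decode first, then collapse newlines and runs of whitespace.
        (string-trim
         (replace-regexp-in-string
          "[ \t\r\n]+" " "
          (decode-coding-string
           title (my-www--charset-to-coding-system charset))))))))
```

It could be called on the buffer returned by url-retrieve-synchronously,
for example

  (with-current-buffer (url-retrieve-synchronously url)
    (my-www-title-from-response (buffer-string)))

assuming that buffer still holds the undecoded bytes. Like the original,
it still depends on the HTML being reasonably well-formed.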
