


From: Matt Gushee
Subject: Re: [Chicken-users] html->sxml (html-parser egg) does not decode entities in html attributes, ideas why?
Date: Tue, 3 Sep 2013 23:06:13 -0600

On Tue, Sep 3, 2013 at 8:51 PM, Alex Shinn <address@hidden> wrote:

> html-parser processes entities, but the default for html->sxml
> is just to leave them encoded as-is.  I'm not sure if that's the best
> default,

I'm not going to suggest that this is a major problem, especially
since you are not claiming html-parser conforms to any particular
standard, and the docs clearly indicate its pragmatic focus. But just
for the record, if you wanted to be an XML-1.1-conformant processor,
you would have to normalize attribute values, which includes
replacing character references and entity references with the
characters they stand for:

http://www.w3.org/TR/xml11/#AVNormalize
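
To make that concrete, here is a rough sketch of what a normalizing
parser hands back for attribute values. It uses Python's standard
xml.etree.ElementTree rather than html-parser, and that module parses
XML 1.0, not 1.1, so take it as an illustration of the behaviour and
not as a conformance test. The sample markup and the helper name
attribute_values are made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample markup, not taken from the thread. The href value
# contains a character reference (&#233;) and an entity reference (&amp;).
SAMPLE = '<a href="search?q=caf&#233;&amp;lang=fr" title="&lt;b&gt;"/>'


def attribute_values(xml_text):
    """Parse xml_text and return the root element's attributes as a dict."""
    return dict(ET.fromstring(xml_text).attrib)


if __name__ == "__main__":
    # The parser has already replaced the references by the time the
    # application sees the attribute values.
    for name, value in attribute_values(SAMPLE).items():
        print(name, "=", value)
```

The application never sees "&amp;" or "&#233;" in the attribute
values here, only the characters they refer to.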

As for the non-XML varieties of HTML, well ... life is too short to go
digging into all that hoary SGML stuff. Did that once upon a time ...
but I was younger then, and thought markup languages were the greatest
thing since sliced bread ;-)

--
Matt Gushee


