Re: [Lynx-dev] Re: progress on dev.10

From: David Woolley
Subject: Re: [Lynx-dev] Re: progress on dev.10
Date: Tue, 09 Sep 2008 21:45:44 +0100
User-agent: Thunderbird (X11/20080707)

Thomas Dickey wrote:

hmm - you're saying to ignore

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "";>

(there's no media type in the file, though the linked css files have media types associated with them)

The only media type that you can use to switch to XHTML mode is the one in the real HTTP headers. Obviously, if you are accessing the document using FTP or locally, you will need some other heuristic, e.g. .xml or .xhtml file name extensions.

If XHTML is served with a text/html media type, it is supposed to be parsed as HTML, with the / at the end of tags completely ignored (de facto HTML error recovery). Such a document is supposed to avoid constructs that behave differently in the two languages, e.g. one should always have an explicit tbody, as tbody is inferred in HTML, but not in XHTML. It is more or less essential that scripts and style sheets are out of line, as HTML parsers generally don't recognize CDATA section markers, and XHTML parsers will resolve entities in scripts and style sheets. (The common tactic of commenting out scripts to protect them from earlier browsers really will comment them out in an XHTML parser, etc.)
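The commented-out script case can be seen with the two parsers in the Python standard library. This is only a rough sketch: the sample document and the helper names (ScriptCollector, script_text_as_html, script_text_as_xml) are made up for illustration, and Python's parsers stand in for a browser's HTML and XML modes, which they only approximate.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Hypothetical sample: a script "protected" by the old comment trick.
DOC = (
    "<html><head><script><!--\n"
    "alert(1);\n"
    "//--></script></head><body></body></html>"
)


class ScriptCollector(HTMLParser):
    """Collects the text found inside <script> elements (HTML rules)."""

    def __init__(self):
        super().__init__()
        self.in_script = False
        self.script_text = ""

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script = True

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False

    def handle_data(self, data):
        if self.in_script:
            self.script_text += data


def script_text_as_html(doc):
    # HTML parsers treat script content as raw text, so the comment
    # markers are just characters and the code survives.
    parser = ScriptCollector()
    parser.feed(doc)
    parser.close()
    return parser.script_text


def script_text_as_xml(doc):
    # An XML parser sees a real comment, and ElementTree's default
    # tree builder discards comments, so no script text remains.
    script = ET.fromstring(doc).find(".//script")
    return "".join(script.itertext())


if __name__ == "__main__":
    print(repr(script_text_as_html(DOC)))
    print(repr(script_text_as_xml(DOC)))
```

With HTML rules the alert(1) call is still in the script text; with XML rules the script element ends up empty.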

(Some of the above only apply to documents that use the DOM.)

Conversely, if you get a document that has an HTTP media type of application/xhtml+xml, you must apply XML parsing rules to it, even if the DOCTYPE conflicts.
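Put together, the selection rule might look something like the following. It is a minimal sketch, not Lynx code: the function name choose_parse_mode, its parameters, and the "xml"/"html" return values are all invented here, the media type is spelled application/xhtml+xml (the registered name), and other XML media types are deliberately left out.

```python
import os

# Assumption: only the XHTML media type switches to XML parsing here.
XML_MEDIA_TYPES = {"application/xhtml+xml"}
# Fallback heuristic when there is no HTTP header (FTP, local files).
XML_EXTENSIONS = {".xhtml", ".xml"}


def choose_parse_mode(content_type=None, filename=None, doctype=None):
    """Return "xml" or "html" for a document.

    content_type: value of the real HTTP Content-Type header, if any.
    filename:     used only when no HTTP media type is available.
    doctype:      accepted but never consulted, to make the point that
                  the DOCTYPE does not select the parsing mode.
    """
    if content_type:
        # Drop parameters such as "; charset=utf-8" before comparing.
        media_type = content_type.split(";", 1)[0].strip().lower()
        return "xml" if media_type in XML_MEDIA_TYPES else "html"

    if filename:
        extension = os.path.splitext(filename)[1].lower()
        if extension in XML_EXTENSIONS:
            return "xml"

    return "html"


if __name__ == "__main__":
    xhtml_doctype = "-//W3C//DTD XHTML 1.0 Strict//EN"
    print(choose_parse_mode("text/html", "page.xhtml", xhtml_doctype))
    print(choose_parse_mode("application/xhtml+xml", None, None))
    print(choose_parse_mode(None, "page.xhtml", None))
```

Note that an HTTP media type, when present, wins over both the file name and the DOCTYPE; the extension check is only reached for documents fetched without HTTP headers.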

Knowing the way that browser vendors think, they may well have started applying heuristics to cope with violations of Appendix C of the XHTML 1.0 specification. Actually parsing the input as XHTML, however, is likely to end up with a large number of rejected documents, as one of the worst things about Appendix C is that people trying to be fashionable have been adding XHTML DOCTYPEs to documents that are not XML.

David Woolley
Emails are not formal business letters, whatever businesses may want.
RFC1855 says there should be an address here, but, in a world of spam,
that is no longer good advice, as archive address hiding may not work.
