Re: lynx-dev A Missing >...
From: Klaus Weide
Subject: Re: lynx-dev A Missing >...
Date: Wed, 2 Aug 2000 16:18:20 -0500 (CDT)
On Mon, 24 Jul 2000, Thomas E. Dickey wrote:
> On Mon, 24 Jul 2000, pAb-032871 wrote:
>
> > If someone forgets to put a ">" after their tag, it'll screw any
> > browser up, and they're likely to fix it as soon as they find
> > out about it.
>
> I'd like to agree, but I tested the page in question with Netscape and
> w3m, and they both rendered it. (It's a recent change to w3m, since
> the first copy I tried, from January, didn't handle the page at all.)
They probably all resync on the next '>', not on the next '<'.
> > In "Re: lynx-dev A Missing >..."
> > [19/Jul/2000 Wed 15:14:21]
> > Thomas Dickey wrote:
> > > shouldn't the tag end when it sees a new "<", unless it's quoted?
> [...]
> no. I meant something like
> <tag value="<something">
So the question is, should something like
<tag value=<something>
be treated as
<tag value=><something>
i.e. should the '<' (as it is not within single or double quotes) close
the first tag.
This can be asked in two ways: (a) is there a correct interpretation
according to SGML rules (and if yes, which one), and (b) what interpretation
makes the most sense in terms of practical error recovery.
As for (a),
as I understand it, something like
<tag value=someword<something>
can actually be correct SGML-wise, and means the same as
<tag value=someword><something>
but only if the SGML declaration that is in effect has the corresponding
FEATURE enabled (SHORTTAG YES or something more specific). Formally,
that is the case for all common "HTML" versions, but not for XML and
hence not for XHTML. For HTML 4.01, see here:
Linkname: Performance, Implementation, and Design Notes
URL: http://www.w3.org/TR/html401/appendix/notes.html#h-B.3.7
"Documents that use them are conforming SGML documents, but are
unlikely to work with many existing HTML tools."
As a practical matter, I think it would make sense to always let a
'<' that is not within quotes end the current tag. It's either
"correct" anyway according to SGML rules or sensible error recovery.
I think lynx currently does it already if the '<' comes directly
after the tag name (if I remember my changes right...).
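To make the two recovery strategies concrete, here is a minimal sketch in
Python. It is not lynx's actual parser code; the function name tokenize and
the flag lt_ends_tag are made up for illustration. With lt_ends_tag=True an
unquoted '<' ends the current tag; with lt_ends_tag=False the scanner
resyncs on the next unquoted '>' instead.

```python
def tokenize(html, lt_ends_tag=True):
    """Split html into ('tag', s) and ('text', s) tokens.

    Hypothetical illustration only, not lynx's real tag-parsing logic.
    """
    tokens = []
    i, n = 0, len(html)
    while i < n:
        if html[i] != '<':
            # Character data runs up to the next '<' (or end of input).
            j = html.find('<', i)
            if j == -1:
                j = n
            tokens.append(('text', html[i:j]))
            i = j
            continue
        # Inside a tag: scan for its end while tracking quote state.
        j = i + 1
        quote = None
        while j < n:
            c = html[j]
            if quote:
                if c == quote:
                    quote = None      # matching quote closes the literal
            elif c in '"\'':
                quote = c             # single or double quote opens a literal
            elif c == '>':
                j += 1                # consume the closing '>'
                break
            elif c == '<' and lt_ends_tag:
                break                 # unquoted '<' ends the tag; leave it for the next token
            j += 1
        tokens.append(('tag', html[i:j]))
        i = j
    return tokens


print(tokenize('<tag value=<something>'))
print(tokenize('<tag value=<something>', lt_ends_tag=False))
print(tokenize('<tag value="<something">'))
```

The first call yields two tags (the second starting at the unquoted '<'),
the second call yields a single tag running to the next '>', and the quoted
form is one tag under either setting.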
All this doesn't have that much to do any more with the specific
problem about an unfinished </SCRIPT...
Klaus