lynx-dev
Re: lynx-dev Lynx character entity references fix


From: Klaus Weide
Subject: Re: lynx-dev Lynx character entity references fix
Date: Sun, 7 Mar 1999 08:22:56 -0600 (CST)

On Fri, 5 Mar 1999, Leonid Pauzner wrote:

> >      * From: Jacob Poon <address@hidden>
> 
> > This patch does the following:
> 
> >         HTML 4.0 compliance:
> >         - Added support for Euro currency symbol.
> >         - Fixed duplicated &loz; definitions.
> 
> >         Fixes:
> >         - Fixed some typos in the old references. (fixed: b.delta)
> Thanks, I'm now working on old-style entities code, will integrate your fix.
> 
> But the starting point is probably wrong:
> the table is much wider than HTML 4.0,
> see Lynx /test/sgml.html (both rendered and as source) -
> it sometimes has up to four synonyms, while HTML 4.0 has a 1:1 mapping.
> A few old references that were added for compatibility with old lynx (2.8 and before)
> are from the HTMLDTD.c entities[] table; there is nothing similar to b.greekSomething
> (neither in HTML 4.0 nor rendered by lynx)...

It seems none of the b.something entities can work, because the dot
terminates entity parsing.  Are these even *meant* to be used in HTML
(of any version)?  Or does Lynx use the wrong syntax for recognizing
character entities, i.e. *are* dots allowed in their names?
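
To illustrate the point about the dot, here is a minimal sketch in Python
of an entity-name scanner.  It is not Lynx's actual parsing code; the
function name scan_entity_name and both character sets are assumptions
made for illustration.  With a letters-and-digits-only name set, the scan
of "&b.delta;" stops at the dot; the more permissive set, which also
admits '.' and '-', is likewise only an assumption here.

```python
import string

# Assumed name-character set: ASCII letters and digits only.
ALNUM = set(string.ascii_letters + string.digits)

# Hypothetical, more permissive set that also admits '.' and '-'.
ALNUM_DOT_HYPHEN = ALNUM | {".", "-"}


def scan_entity_name(text, start, name_chars=ALNUM):
    """Return the candidate entity name that follows the '&' at text[start].

    Scanning stops at the first character that is not in name_chars.
    """
    if text[start] != "&":
        raise ValueError("text[start] must be '&'")
    end = start + 1
    while end < len(text) and text[end] in name_chars:
        end += 1
    return text[start + 1:end]


strict = scan_entity_name("&b.delta;", 0)
permissive = scan_entity_name("&b.delta;", 0, ALNUM_DOT_HYPHEN)
```

Under the strict set the scanner sees only the name "b", so a table entry
called "b.delta" can never be looked up, whatever the table contains.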

> We should probably decide whether we want lynx to act strictly as HTML 4.0
> specifies and reject everything else, or keep as much as possible. Any vote?

No vote, but some points to consider:

  - entities with dots don't work as noted above
  - the more unnecessary names *are* recognized, the higher the
    chance of confusion with "&something" within URLs (although the
    workaround of skipping entity strings if followed by '=' seems
    to work well, it's still a workaround)
  - 6 ways to say "GREEK SMALL LETTER EPSILON" just seems too much;
    apart from that, is it definitively clear that they really *are*
    the same character?  (Are the variations variant glyph shapes of
    "the same" character, or does Unicode have more than one code point
    for them?  The latter is the case, for example, for THETA SYMBOL
    (&thetasym;/&thetav;) vs. CAPITAL/SMALL LETTER THETA.)
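
The '=' workaround from the second point can be sketched roughly as
follows, again in Python.  This is an illustration under stated
assumptions, not the code Lynx uses: the KNOWN table is a tiny
hypothetical subset, and expand_entities is an invented name.  An
unterminated "&name" that is directly followed by '=' is treated as part
of a URL query string and left alone.

```python
import re

# Hypothetical subset of recognized entity names (not Lynx's real table).
KNOWN = {"amp": "&", "lt": "<", "gt": ">", "copy": "\u00a9"}

# '&', an alphanumeric name, and an optional terminating ';'.
_ENTITY = re.compile(r"&([A-Za-z][A-Za-z0-9]*)(;?)")


def expand_entities(text):
    """Expand known entities, skipping '&name' when '=' follows directly."""

    def replace(match):
        name, semicolon = match.group(1), match.group(2)
        if name not in KNOWN:
            return match.group(0)  # unknown name: leave untouched
        following = text[match.end():match.end() + 1]
        if not semicolon and following == "=":
            return match.group(0)  # looks like a URL parameter: skip
        return KNOWN[name]

    return _ENTITY.sub(replace, text)


url = expand_entities("search.cgi?q=1&copy=2")
prose = expand_entities("&copy; 1999, AT&amp;T")
```

The sketch also shows why this remains a workaround: every extra name in
the table is one more URL parameter name that could be misread whenever
the '=' test does not apply.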


   Klaus
