help-gnu-emacs

Re: &nbsp; and nXML mode


From: Yuri Khan
Subject: Re: &nbsp; and nXML mode
Date: Mon, 9 Aug 2021 12:57:50 +0700

On Mon, 9 Aug 2021 at 09:22, Jean-Christophe Helary
<lists@traduction-libre.org> wrote:

> Is there a reason why nXML mode refuses to consider &nbsp; entities as legit 
> in a document that starts with:
>
> <!DOCTYPE html>
> <html xmlns="http://www.w3.org/1999/xhtml">

If you view that as an XML document (which is what nXML deals with),
without any preconceived knowledge of HTML5, there is nothing to
suggest that &nbsp; is legit.

In XML, an entity can be defined inline within the doctype declaration:

    <!DOCTYPE html [
      <!ENTITY nbsp "&#xa0;">
    ]>

or by reference to an external entity definition:

    <!DOCTYPE html
      PUBLIC "-//W3C//DTD XHTML 1.1//EN"
      "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">

(The HTML5 spec calls this an “obsolete permitted DOCTYPE string”. It
is obsolete only from the HTML5 point of view: for an HTML5-aware
parser, <!DOCTYPE html> alone is sufficient to declare an HTML5
document.)

If you fetch that URL, you will see that it references a number of
modules, and if you chase the references far enough, you will get to
http://www.w3.org/MarkUp/DTD/xhtml-lat1.ent which contains this as its
first significant line:

    <!ENTITY nbsp   "&#160;" ><!-- no-break space = non-breaking space, U+00A0 ISOnum -->

and that’s what makes &nbsp; a valid entity reference in an XHTML document.
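To see the difference from a generic XML parser's point of view, here
is a minimal sketch using Python's standard xml.etree.ElementTree
(which is expat-based and knows nothing about HTML5). The sample
document, the variable names and the helper paragraph_text are made up
for illustration; this only shows how a plain XML parser treats the
two doctypes, not how nXML works internally.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
BODY = '<html xmlns="%s"><body><p>a&nbsp;b</p></body></html>' % XHTML_NS

# Bare HTML5-style doctype: no entity declarations anywhere.
bare = "<!DOCTYPE html>\n" + BODY

# Same document, with nbsp declared in the internal subset.
declared = '<!DOCTYPE html [\n  <!ENTITY nbsp "&#xa0;">\n]>\n' + BODY


def paragraph_text(document):
    """Return the text of the first <p>, or the ParseError if parsing fails."""
    try:
        root = ET.fromstring(document)
    except ET.ParseError as err:
        return err
    return root.find(".//{%s}p" % XHTML_NS).text


print(repr(paragraph_text(bare)))      # a ParseError: nbsp is undefined
print(repr(paragraph_text(declared)))  # 'a\xa0b'
```

The first document fails because nothing in it declares nbsp; the
second parses and the reference expands to U+00A0.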

(XML processors normally have some shortcuts, such as DTDs pre-cached
in a so-called XML catalog, so that they don’t have to fetch them from
the network each time. The catalog is keyed by the PUBLIC and/or
SYSTEM identifiers but not by the doctype root element name.)
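As a rough illustration of that last point, here is a toy lookup in
Python. It is not the real catalog resolution algorithm; the function
resolve, the two dictionaries and the local path are all hypothetical.
The only thing it is meant to show is that the root element name plays
no part in the lookup, so a bare <!DOCTYPE html> gives a catalog
nothing to match on.

```python
# Toy catalog: identifiers mapped to a local copy of the DTD.
# The local path is made up for this example.
LOCAL_DTD = "/usr/share/xml/xhtml11/xhtml11.dtd"
PUBLIC = {"-//W3C//DTD XHTML 1.1//EN": LOCAL_DTD}
SYSTEM = {"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd": LOCAL_DTD}


def resolve(root_name, public_id=None, system_id=None):
    """Return a local DTD path, or None if the catalog has no match.

    root_name is accepted only to make the point that it is ignored.
    """
    if public_id in PUBLIC:
        return PUBLIC[public_id]
    if system_id in SYSTEM:
        return SYSTEM[system_id]
    return None


# <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "...">
print(resolve("html", public_id="-//W3C//DTD XHTML 1.1//EN"))

# <!DOCTYPE html> has no identifiers, so there is nothing to look up.
print(resolve("html"))
```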


