Re: [Lynx-dev] lynx misrenders many *IN*valid xhtml5 pages on my site

From: Thorsten Glaser
Subject: Re: [Lynx-dev] lynx misrenders many *IN*valid xhtml5 pages on my site
Date: Mon, 12 Jun 2023 23:22:24 +0000 (UTC)

Hi again!

Lennart, you nerdsniped me.

Dixi quod…
>Lennart Jablonka dixit:
>>> I’m not sure whether it may then also self-close all tags but would
>>> assume so (except I know tech is… tricky).
>> As in an XML document, <asdf/> and <asdf></asdf> are entirely equivalent,
>> yes, the server may then “self-close” all empty elements.
>That’s what made me say I’d assume so, but I know tech, which is
>why I hesitate.

I found hints towards still requiring the empty not-self-closed
tags even in XML but I forgot where during the subsequent hacking
which took m̲u̲c̲h̲ longer than expected.

But here is that hacking’s result. Find attached an LD_PRELOAD library
that makes “xmlstarlet fo”, without -o (because it then uses yet other
libxml2 function calls), output XHTML ☻


$ sudo apt-get install libxml2-dev

Compile and link:

$ gcc -Wdate-time -D_FORTIFY_SOURCE=2 -O2 -fstack-protector-strong \
      -Wformat -Werror=format-security -Wall -Wextra \
      $(xml2-config --cflags) -DPIC -fPIC -shared -o \


$ LD_PRELOAD=$PWD/ xmlstarlet fo [-n] [-e encoding] filename|-

C̲a̲v̲e̲a̲t̲:̲ without -n it breaks up “old browser-safe” framing for CSS and JavaScript:

 <style type="text/css"><!--/*--><![CDATA[/*><!--*/
 <script type="text/javascript"><!--//--><![CDATA[//><!--

This is because in XML, the <!--/*--> or <!--//--> is a
comment node inside the style/script node (as is correct)
and libxml2’s “XHTML” output code writes a newline after
each node if indenting. xhtmlNodeListDumpOutput() is
static, so not up for LD_PRELOAD hacks. But the OP was
not formatting/indenting their XML anyway so this strikes
me as a suitable postprocessing step. I did verify that it
properly adds spaces and not-self-closes elements for one
static XHTML file.
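
For readers wondering what “adds spaces and not-self-closes elements” means in practice, here is a minimal sketch of the rule. It is not the attached forceXHTML.c and does not touch libxml2 at all; the function names and the element list (the HTML void elements) are assumptions made for illustration, and libxml2’s own XHTML serializer keeps its own list.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumption: the HTML void elements, i.e. those that never have content
 * and may therefore be written in the short "<name />" form. */
static const char *const void_elements[] = {
    "area", "base", "br", "col", "embed", "hr", "img", "input",
    "link", "meta", "param", "source", "track", "wbr", NULL
};

/* Return 1 if name is in the list above, 0 otherwise. */
int is_void_element(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; void_elements[i] != NULL; i++)
        if (strcmp(name, void_elements[i]) == 0)
            return 1;
    return 0;
}

/* Serialize an empty, attribute-less element into buf.
 * Void elements get a space before the slash ("<br />"); every other
 * empty element gets an explicit end tag ("<script></script>") so that
 * a text/html parser does not treat it as left open.
 * Returns what snprintf() returns. */
int format_empty_element(const char *name, char *buf, size_t size)
{
    assert(name != NULL && buf != NULL);
    if (is_void_element(name))
        return snprintf(buf, size, "<%s />", name);
    return snprintf(buf, size, "<%s></%s>", name, name);
}
```

Calling format_empty_element("br", buf, sizeof buf) leaves "<br />" in buf, while "script" yields "<script></script>". To a pure XML parser both forms are the same empty element; the distinction only matters to HTML parsers.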

This was initially very loosely based on libxml2 itself,
whose public API sucks badly enough that I had to redraft it
from the beginning. (This is the reason it took so long.)
I publish this under Ⓕ CC0.

PS: Shlomi Fish, when replying to me, please send to the list
    as your provider fails badly enough at SMTP it cannot send
    eMails directly to me :/
FWIW, I'm quite impressed with mksh interactively. I thought it was much
*much* more bare bones. But it turns out it beats the living hell out of
ksh93 in that respect. I'd even consider it for my daily use if I hadn't
wasted half my life on my zsh setup. :-) -- Frank Terbeck in #!/bin/mksh

Attachment: forceXHTML.c
Description: Text document