Re: Parsing entities in HTML input

From: Kacper Gutowski
Subject: Re: Parsing entities in HTML input
Date: Fri, 17 Apr 2020 19:04:35 +0200

On Thu, Apr 16, 2020 at 03:29:08PM +0200, Dr. Jürgen Sauermann wrote:
> fixed in SVN 1262.

Angle brackets are now correctly converted at the end of lines too. 

But parsing numeric entities decimally was actually correct.  Now, 
doing a )DUMP-HTML followed by )COPY or )LOAD changes all ampersands 
into the digit eight.  )DUMP-HTML encodes "&" as "&#38;", which 
is a correct decimal representation of it.  At r1262, this gets 
incorrectly parsed as hexadecimal (0x38 = 56), yielding "8".
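
To illustrate the two readings of the digits "38", here is a small 
sketch in Python (purely illustrative, not the GNU APL code; the 
variable names are made up):

```python
# The reference "&#38;" carries the digits "38".
# Which base they are read in decides the resulting character.
digits = "38"

as_decimal = chr(int(digits, 10))  # code point 38, the intended reading
as_hex = chr(int(digits, 16))      # code point 0x38 == 56, the wrong reading

print(repr(as_decimal))  # '&'
print(repr(as_hex))      # '8'
```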

As far as HTML goes, an ampersand could also be encoded as "&amp;" 
(which is the most common) or hexadecimally as "&#x26;" (note the "x").  
As a side note, numeric references can be of any length, not just 
two digits (it could be "&#0038;" as well), but that doesn't matter 
as long as the subset that )DUMP-HTML produces can be parsed.
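
For what it's worth, the forms mentioned above could be handled along 
these lines.  This is only a rough sketch in Python, not how GNU APL 
does it; decode_entities, NAMED and REFERENCE are hypothetical names, 
and the set of named references is an assumption of mine rather than 
the list )DUMP-HTML actually emits.

```python
import re

# Assumed minimal set of named references; not taken from GNU APL.
NAMED = {"amp": "&", "lt": "<", "gt": ">", "quot": '"'}

# Matches &#DDD; (decimal), &#xHHH; (hexadecimal, marked by "x"), or &name;
REFERENCE = re.compile(
    r"&(?:#([0-9]+)|#[xX]([0-9A-Fa-f]+)|([A-Za-z][A-Za-z0-9]*));"
)

def decode_entities(text):
    """Decode character references in a single left-to-right pass."""
    def replace(match):
        dec, hexa, name = match.groups()
        if dec is not None:
            # No "x" prefix: decimal, with any number of digits.
            # (chr raises ValueError for out-of-range code points.)
            return chr(int(dec, 10))
        if hexa is not None:
            return chr(int(hexa, 16))
        # Unknown names are left untouched.
        return NAMED.get(name, match.group(0))
    return REFERENCE.sub(replace, text)

print(decode_entities("A &#38; B &#x26; C &amp; D &#0038; E"))
```

Because the substitution is done in one pass, already decoded text is 
not decoded again, so "&amp;lt;" comes out as "&lt;" and not as "<".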

> If you like the )DUMP-HTML command then you may like the ]DOXY command as 
> well:
> https://www.gnu.org/software/apl/apl.html#Section-3_002e8

Oh yes, I do.  It's pretty nice, especially for exploring how larger 
workspaces like the Toronto Toolkit work.

I just noticed that ]DOXY gets some dependencies wrong: in the Toolkit, 
there's a function "julian" which is shown as calling "date", but in 
fact it merely names its right argument "date" and never calls the 
function.

The Toronto Toolkit didn't contain any ampersands ;)

