Re: Parsing entities in HTML input

From: Dr. Jürgen Sauermann
Subject: Re: Parsing entities in HTML input
Date: Sat, 18 Apr 2020 12:36:19 +0200
User-agent: Mozilla/5.0 (X11; Linux i686; rv:60.0) Gecko/20100101 Thunderbird/60.6.1

Hi Kacper,

yes, I thought I spotted a fault but in fact I created a new one.
Fixed in SVN 1264.

I will look into the call graph. There are some limits though as to what
]DOXY can detect. In your example it could have detected that date is
a (local) variable, but in the general case (date could have been localized
outside of julian) this most likely touches on decidability.

Best Regards,

On 4/17/20 7:04 PM, Kacper Gutowski wrote:
On Thu, Apr 16, 2020 at 03:29:08PM +0200, Dr. Jürgen Sauermann wrote:
fixed in SVN 1262.
Angle brackets are now correctly converted at the end of lines too. 

But parsing numeric entities decimally was actually correct. Now
doing a )DUMP-HTML followed by )COPY or )LOAD changes all ampersands
into a digit eight.  The )DUMP-HTML encodes "&" as "&#38;" which
is a correct, decimal representation of it.  At r1262, this gets
incorrectly parsed as hexadecimal yielding "8".

As far as the HTML goes, ampersand could also be encoded as "&amp;"
(which is the most common) or hexadecimally "&#x26;" (note the "x").
As a side note, numeric references could be of any length, not just
two digits (it could be "&#0038;" as well), but that doesn't matter
as long as the subset that )DUMP-HTML produces can be parsed.
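
The three encodings discussed above can be illustrated with a small Python sketch. It is not the GNU APL parser; the function names `decode_refs` and `decode_refs_buggy` and the tiny named-entity table are assumptions made for the example. The second function only reproduces the symptom reported for r1262, where the decimal digits are read as hexadecimal.

```python
import re

# Hypothetical, deliberately tiny named-entity table; a full HTML parser
# would need the complete list of named character references.
_NAMED = {"amp": "&", "lt": "<", "gt": ">"}

# Matches &#xHH...; (hexadecimal), &#DD...; (decimal) or &name; (named).
# Numeric references may have any number of digits.
_REF = re.compile(r"&(#[xX][0-9a-fA-F]+|#[0-9]+|[A-Za-z][A-Za-z0-9]*);")


def decode_refs(text: str) -> str:
    """Replace character references in text with the characters they denote."""
    def repl(match: re.Match) -> str:
        body = match.group(1)
        if body[:2] in ("#x", "#X"):
            return chr(int(body[2:], 16))   # hexadecimal only with the "x"
        if body[0] == "#":
            return chr(int(body[1:], 10))   # plain digits are decimal
        # Unknown names are left untouched rather than guessed at.
        return _NAMED.get(body, match.group(0))
    return _REF.sub(repl, text)


def decode_refs_buggy(text: str) -> str:
    """Reproduce the reported symptom: decimal digits parsed as hexadecimal."""
    return re.sub(r"&#([0-9]+);", lambda m: chr(int(m.group(1), 16)), text)


if __name__ == "__main__":
    for sample in ("&#38;", "&amp;", "&#x26;", "&#0038;"):
        print(sample, "->", decode_refs(sample))
    print("buggy:", decode_refs_buggy("&#38;"))
```

With the decimal reading, "&#38;" is code point 38, the ampersand; read as hexadecimal, the same digits give 0x38, which is the digit "8".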

If you like the )DUMP-HTML command then you may like the ]DOXY command as well:

Oh yes, I do.  It's pretty nice, especially for exploring how larger 
workspaces like the Toronto Toolkit work.

I just noticed that ]DOXY gets some dependencies wrong: in the Toolkit, 
there's a function "julian" which is shown to be calling "date", but in 
fact it named its right argument "date" and doesn't call the function.
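
The misattribution described above can be sketched in Python. This is not how ]DOXY is implemented; `header_names`, `callees`, the crude tokenizer and the one-line function body are all hypothetical. The idea is only that names declared in the function header (result, arguments, locals after ";") shadow workspace functions of the same name and should not be reported as calls.

```python
import re

# Crude name tokenizer (assumption): ignores APL comments, strings and
# the finer points of APL name syntax.
NAME = re.compile(r"[A-Za-z_∆⍙][A-Za-z0-9_∆⍙]*")


def header_names(header: str):
    """Split a header such as 'z←a f b;tmp' into (function name, local names)."""
    signature, *local_decls = header.split(";")
    result = None
    if "←" in signature:
        result, signature = signature.split("←", 1)
    parts = signature.split()
    if len(parts) == 3:          # dyadic:  a f b
        fn, args = parts[1], {parts[0], parts[2]}
    elif len(parts) == 2:        # monadic: f b
        fn, args = parts[0], {parts[1]}
    else:                        # niladic: f
        fn, args = parts[0], set()
    shadowed = args | {n.strip() for n in local_decls if n.strip()}
    if result is not None:
        shadowed.add(result.strip())
    return fn, shadowed


def callees(header: str, body_lines, known_functions):
    """Names used in the body that are workspace functions and not shadowed."""
    _, shadowed = header_names(header)
    used = set()
    for line in body_lines:
        used.update(NAME.findall(line))
    return (used & set(known_functions)) - shadowed


if __name__ == "__main__":
    workspace = {"julian", "date"}
    body = ["z←date[1]"]         # made-up body, not the Toolkit's code
    print(callees("z←julian date", body, workspace))   # no call to date
```

A check like this only covers names localized in the function's own header. Because APL names are dynamically scoped, a name localized by some caller further up cannot be resolved this way, which is the general case mentioned in the reply.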

The Toronto Toolkit didn't contain any ampersands ;)

