Re: [Lynx-dev] Extract links from html with application/ld+json script
From: David Woolley
Subject: Re: [Lynx-dev] Extract links from html with application/ld+json script
Date: Sun, 17 Dec 2023 20:44:20 +0000
User-agent: Mozilla Thunderbird
On 17/12/2023 19:31, Super Bonaci via Lynx-dev wrote:
> Lynx is not able to extract most html links inside the html file.
There are no HTML links in 9ed7a8bb! There are no anchor elements, and
all occurrences of href but one are in link elements, which don't
generate visible inline hyperlinks; the remaining one is in JavaScript
code. I think this is a JavaScript application program, not an HTML
document. Lynx doesn't have a JavaScript interpreter and doesn't parse
HTML in a way that creates a document object model in a format that
would allow such an interpreter to do anything non-trivial.
Any links are created by manipulating the document in the browser, which
Lynx can't do.
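The check described above can be repeated on any page with a short script. What follows is only a minimal sketch using Python's standard html.parser; the names HrefTally, tally_hrefs and SAMPLE are made up for illustration, and SAMPLE is an invented page, not the 9ed7a8bb file. It counts href attributes by the element that carries them, so a page with no "a" entries has no static hyperlinks for Lynx to show.

```python
from collections import Counter
from html.parser import HTMLParser


class HrefTally(HTMLParser):
    """Count href attributes by the element that carries them."""

    def __init__(self):
        super().__init__()
        self.by_tag = Counter()

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs with lower-cased names.
        if any(name == "href" for name, _ in attrs):
            self.by_tag[tag] += 1


def tally_hrefs(html_text):
    """Return a dict mapping tag name to the number of href attributes."""
    parser = HrefTally()
    parser.feed(html_text)
    parser.close()
    return dict(parser.by_tag)


# Invented sample: two link elements and one href that only exists as
# script text, so it never becomes an element the parser can see.
SAMPLE = """<html><head>
<link rel="stylesheet" href="/site.css">
<link rel="icon" href="/favicon.ico">
<script>var a = document.createElement('a'); a.href = '/made-by-script';</script>
</head><body><div id="root"></div></body></html>"""

print(tally_hrefs(SAMPLE))  # {'link': 2}
```

The href inside the script element is not counted because script content is treated as opaque text, which is roughly the position Lynx is in.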
Supporting javascript applications would require a complete rewrite from
first principles. The result would not be Lynx.
I suspect the same is true of the other document.
> Since the Lynx version is from 2018
I don't think there have been major changes in HTML in the last five
years that would break a real HTML document on Lynx. The problem with
web applications is over a decade old. It goes back to the original
Netscape, but was solidified when the Web Hypertext Application
Technology Working Group (WHATWG) effectively took over control of HTML
from W3C, leading to the creation of HTML5. Although HTML5 can be used
for pure documents, the name of the working group clearly indicates that
the intention was otherwise. That happened about 19 years ago.
Commercial artists and marketing managers don't buy into the TBL (Tim
Berners-Lee) notion of HTML and want programs that can be run on the
advertising consumer's machine. Whilst there are some cases where this
is valid, for technical or privacy reasons, most such applications are
written for marketing reasons.
Some text mode browsers handle some javascript uses, but I'm pretty sure
they would not cope with your examples.
The only certain way of finding the links in JavaScript code is to run
the program.
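Data that is merely embedded in the page, such as an application/ld+json block like the one named in the subject line, is a different case from code: it is JSON, so URL strings in it can be read without executing anything. The following is a rough sketch only, again with Python's standard library; JsonLdCollector, iter_strings, jsonld_urls and SAMPLE are hypothetical names, and the sample data is invented. It will not find links that a script builds at run time, and it also reports identifiers such as the "@context" value, which are URLs but not pages to visit.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collect the text of every <script type="application/ld+json">."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buf = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buf = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buf))
            self._in_jsonld = False


def iter_strings(node):
    """Yield every string value found anywhere in parsed JSON."""
    if isinstance(node, str):
        yield node
    elif isinstance(node, dict):
        for value in node.values():
            yield from iter_strings(value)
    elif isinstance(node, list):
        for value in node:
            yield from iter_strings(value)


def jsonld_urls(html_text):
    """Return http(s) strings from JSON-LD blocks, in document order."""
    collector = JsonLdCollector()
    collector.feed(html_text)
    collector.close()
    urls = []
    for block in collector.blocks:
        try:
            data = json.loads(block)
        except json.JSONDecodeError:
            continue  # skip blocks that are not valid JSON
        urls.extend(s for s in iter_strings(data)
                    if s.startswith(("http://", "https://")))
    return urls


# Invented sample data for illustration.
SAMPLE = """<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Article",
 "url": "https://example.org/a", "image": ["https://example.org/i.png"]}
</script>"""

print(jsonld_urls(SAMPLE))
```

Anything found this way is a guess at what the page would link to; whether the application ever turns those strings into hyperlinks can still only be settled by running it.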