lynx-dev
Re: [Lynx-dev] Extract links from html with application/ld+json script


From: David Woolley
Subject: Re: [Lynx-dev] Extract links from html with application/ld+json script
Date: Sun, 17 Dec 2023 20:44:20 +0000
User-agent: Mozilla Thunderbird

On 17/12/2023 19:31, Super Bonaci via Lynx-dev wrote:
> Lynx is not able to extract most html links inside the html file.


There are no HTML links in 9ed7a8bb! It has no anchor elements, and every occurrence of href except one is in a link element, which doesn't generate a visible inline hyperlink; the one exception is in javascript code. I think this is a javascript application program, not an HTML document. Lynx doesn't have a javascript interpreter, and it doesn't parse HTML into a document object model in a form that would allow such an interpreter to do anything non-trivial.

Any links are created by manipulating the document in the browser, which Lynx can't do.
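The check described above can be repeated on any saved page. What follows is only a rough sketch using Python's standard library, not anything Lynx does: it counts which elements carry an href attribute and, since the subject line mentions application/ld+json, collects any http(s) strings found in JSON-LD script blocks. The names HrefAudit, urls_in and audit are made up for illustration, and the JSON-LD part assumes the blocks hold valid JSON. It also returns vocabulary URLs such as an "@context" value, which are not navigation links.

```python
import json
from collections import Counter
from html.parser import HTMLParser


class HrefAudit(HTMLParser):
    """Tally href attributes per element and collect JSON-LD script bodies."""

    def __init__(self):
        super().__init__()
        self.href_by_tag = Counter()   # e.g. {"link": 12, "a": 0 (absent)}
        self.jsonld_blocks = []        # raw text of each ld+json script
        self._in_jsonld = False
        self._buf = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "href" in attrs:
            self.href_by_tag[tag] += 1
        script_type = (attrs.get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buf = []

    def handle_data(self, data):
        # Script content may arrive in several chunks, so buffer it.
        if self._in_jsonld:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.jsonld_blocks.append("".join(self._buf))
            self._in_jsonld = False


def urls_in(value):
    """Yield every string in parsed JSON that starts with http:// or https://."""
    if isinstance(value, dict):
        for item in value.values():
            yield from urls_in(item)
    elif isinstance(value, list):
        for item in value:
            yield from urls_in(item)
    elif isinstance(value, str) and value.startswith(("http://", "https://")):
        yield value


def audit(html_text):
    """Return (href count per element name, URLs found in JSON-LD blocks)."""
    parser = HrefAudit()
    parser.feed(html_text)
    parser.close()
    urls = []
    for block in parser.jsonld_blocks:
        try:
            urls.extend(urls_in(json.loads(block)))
        except json.JSONDecodeError:
            pass  # not valid JSON; skip rather than guess
    return dict(parser.href_by_tag), urls
```

If the first result has no "a" key, the file contains no anchor elements, so there is nothing for Lynx to list as a link. Links that scripts build at run time will not show up in either result.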

Supporting javascript applications would require a complete rewrite from first principles. The result would not be Lynx.

I suspect the same is true of the other document.

> Since the Lynx version is from 2018

I don't think there have been major changes in HTML in the last five years that would break a real HTML document on Lynx. The problem with web applications is over a decade old. It goes back to the original Netscape, but was solidified when the Web Hypertext Application Technology Working Group (WHATWG) effectively took over control of HTML from the W3C, leading to the creation of HTML5. Although HTML5 can be used for pure documents, the name of the working group clearly indicates that the intention was otherwise. That happened about 19 years ago.

Commercial artists and marketing managers don't buy into the TBL (Tim Berners-Lee) notion of HTML and want programs that can be run on the advertising consumer's machine. Whilst there are some cases where this is valid, for technical or privacy reasons, most such applications are written for marketing reasons.

Some text mode browsers handle some javascript uses, but I'm pretty sure they would not cope with your examples.

The only certain way of finding the links in javascript code is to run the program.


