Re: lynx-dev Using other HTML parsers in Lynx
From: pAb-032871
Subject: Re: lynx-dev Using other HTML parsers in Lynx
Date: Wed, 26 Jul 2000 01:04:34 -0700
In "lynx-dev Using other HTML parsers in Lynx"
[25/Jul/2000 Tue 19:30:40]
Mooneer Salem wrote:
I don't know much about programming, but some of this is of interest
to me.
> For fun, I decided to write a library that parses HTML. :)
>
> After writing the library I decided to write a demo application using it
> (which prints an English representation of the HTML passed into it and
> which also acts as a benchmark app.) According to the demonstration
> program, it parsed the PostgreSQL-HOWTO (300kb in HTML format) in
> 0.16 seconds, which is pretty impressive for a Celeron 500 with 192MB RAM
> which also acts as a pretty busy DNS, Web and database server.
>
> Here's the question: how hard would it be to implement this parser into
> Lynx?
Depends how integrated you want it to be. Could be as easy as
defining a new DOWNLOADER [your app] in lynx.cfg.
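For illustration only, a DOWNLOADER entry might look roughly like the
fragment below. The field layout (name, command with %s standing in for
the downloaded file, then TRUE/FALSE for whether anonymous users may use
it) follows the comments in the stock lynx.cfg; the menu label and the
path /usr/local/bin/htmlparse-demo are made-up placeholders, not anything
shipped with the library.

```
# Hypothetical entry: label and program path are placeholders.
# %s is replaced by the name of the file Lynx has just downloaded.
DOWNLOADER:Describe HTML with parser demo:/usr/local/bin/htmlparse-demo %s:TRUE
```

This only hands the downloaded file to an external program from the
download menu; it does not replace the parser Lynx uses for rendering.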
What really interests me is: who feels like teaching it to parse
JavaScript as well [if you're releasing the source and leaving
it open to changes]?
People seem to think adding support for JavaScript isn't practical,
and maybe impossible, in Lynx because of its one-pass HTML parsing.
A separate application that *could* make sense of JavaScript,
then pass the results to Lynx, popped into my head some time ago
but I don't know how to do it.
Here's an old message about it:
http://www.flora.org/lynx-dev/html/month072000/msg00034.html
The follow-ups are probably more interesting than my input.
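A rough sketch of the "separate application, then pass the results to
Lynx" idea, in Python, purely as an assumption about how the plumbing
could look. The preprocess() step here is a hypothetical stand-in: it
does not interpret JavaScript at all, it just drops <script> elements
with a regular expression. A real helper would have to replace that step
with something that actually runs the scripts. The only Lynx features
relied on are the -stdin and -dump command-line options.

```python
import re
import subprocess

# Hypothetical placeholder for the external helper. It does NOT execute
# JavaScript; it only removes <script>...</script> blocks so the remaining
# markup reaches Lynx unchanged.
SCRIPT_RE = re.compile(r"<script\b.*?</script\s*>", re.IGNORECASE | re.DOTALL)


def preprocess(html: str) -> str:
    """Return the HTML with every <script> element removed."""
    return SCRIPT_RE.sub("", html)


def render_with_lynx(html: str) -> str:
    """Pipe the preprocessed HTML into Lynx and return its text rendering.

    Requires a lynx binary on PATH. -stdin makes Lynx read the document
    from standard input; -dump prints the formatted page and exits.
    """
    result = subprocess.run(
        ["lynx", "-stdin", "-dump"],
        input=preprocess(html),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Calling render_with_lynx() on a page's source would give the text Lynx
renders once the helper has had its turn at the markup.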
> The parser can be found at
> http://devel.usnuk.net/libhtmlparse-0.1-alpha1.tar.gz,
> and you can see the uncompressed archive at
> http://devel.usnuk.net/libhtmlparse-0.1-alpha1/.
> demo/test.c is the benchmark application I ran, while demo/prettyHTML is
> another demo app I wrote for the library (it makes HTML more clear and
> readable by adding tabs)
>
> --
> Mooneer Salem
> Sysadmin, Ultraspeed UK (http://www.ultraspeed.co.uk/)
> GPLTrans (http://www.translator.cx/)
> Personal Home Page (http://msalem.translator.cx/)
Patrick