lynx-dev

Re: lynx-dev Using other HTML parsers in Lynx


From: Mooneer Salem
Subject: Re: lynx-dev Using other HTML parsers in Lynx
Date: Wed, 26 Jul 2000 10:35:28 -0700

The main library itself is just a function with a bunch of callbacks:

declCallBack   <--- for <!doctype>-style declarations
startCallBack  <--- for start tags
endCallBack    <--- for end tags
textStartCallBack
textCallBack
textEndCallBack
commentStartCallBack
commentCallBack
commentEndCallBack
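
To make the shape of such an interface concrete, here is a rough sketch in C.
It is not the libhtmlparse API: the callback names come from the list above,
but the argument lists, the SketchCallbacks struct, sketch_parse() and the
Counts helper are all assumptions made up for illustration. It does one pass
over a NUL-terminated buffer and skips attributes instead of parsing them.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical signatures: the post names the callbacks, not their arguments. */
typedef void (*SpanFn)(const char *s, size_t len, void *user);
typedef void (*MarkFn)(void *user);

typedef struct {
    SpanFn declCallBack;          /* <!doctype ...> style declarations */
    SpanFn startCallBack;         /* start tag name */
    SpanFn endCallBack;           /* end tag name */
    MarkFn textStartCallBack;
    SpanFn textCallBack;
    MarkFn textEndCallBack;
    MarkFn commentStartCallBack;
    SpanFn commentCallBack;
    MarkFn commentEndCallBack;
} SketchCallbacks;

/* Unset (NULL) callbacks are simply skipped. */
static void span(SpanFn f, const char *s, size_t n, void *u) { if (f) f(s, n, u); }
static void mark(MarkFn f, void *u) { if (f) f(u); }

/* Single pass over the input; nothing is buffered or revisited. */
void sketch_parse(const char *html, const SketchCallbacks *cb, void *user)
{
    const char *p = html;
    assert(html != NULL && cb != NULL);
    while (*p) {
        if (*p != '<') {                          /* text run up to the next '<' */
            size_t n = strcspn(p, "<");
            mark(cb->textStartCallBack, user);
            span(cb->textCallBack, p, n, user);
            mark(cb->textEndCallBack, user);
            p += n;
        } else if (strncmp(p, "<!--", 4) == 0) {  /* comment */
            const char *body = p + 4;
            const char *end = strstr(body, "-->");
            size_t n = end ? (size_t)(end - body) : strlen(body);
            mark(cb->commentStartCallBack, user);
            span(cb->commentCallBack, body, n, user);
            mark(cb->commentEndCallBack, user);
            p = end ? end + 3 : body + n;
        } else {                                  /* declaration, end or start tag */
            const char *body = p + 1;
            size_t n = strcspn(body, ">");        /* everything inside <...> */
            if (*body == '!')
                span(cb->declCallBack, body + 1, n - 1, user);
            else if (*body == '/')
                span(cb->endCallBack, body + 1, strcspn(body + 1, " \t\r\n>"), user);
            else
                span(cb->startCallBack, body, strcspn(body, " \t\r\n/>"), user);
            p = body + n;
            if (*p == '>') p++;
        }
    }
}

/* Example client: count how often each kind of event fires. */
typedef struct { int decls, starts, ends, texts, comments; } Counts;

static void on_decl(const char *s, size_t n, void *u)    { (void)s; (void)n; ((Counts *)u)->decls++; }
static void on_start(const char *s, size_t n, void *u)   { (void)s; (void)n; ((Counts *)u)->starts++; }
static void on_end(const char *s, size_t n, void *u)     { (void)s; (void)n; ((Counts *)u)->ends++; }
static void on_text(const char *s, size_t n, void *u)    { (void)s; (void)n; ((Counts *)u)->texts++; }
static void on_comment(const char *s, size_t n, void *u) { (void)s; (void)n; ((Counts *)u)->comments++; }

Counts sketch_count(const char *html)
{
    Counts c = {0, 0, 0, 0, 0};
    SketchCallbacks cb = {0};     /* start/end marker callbacks stay NULL */
    cb.declCallBack = on_decl;
    cb.startCallBack = on_start;
    cb.endCallBack = on_end;
    cb.textCallBack = on_text;
    cb.commentCallBack = on_comment;
    sketch_parse(html, &cb, &c);
    return c;
}
```

A client fills in only the callbacks it cares about, and the void *user
pointer carries its state, so the parser itself needs no globals.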

I was thinking of adding a second interface, alongside this generic one,
in which specific functions are called depending on the HTML tag.
In either case it should be easy to add support for JavaScript.
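
A per-tag interface of that kind could look roughly like the sketch below.
Everything in it is hypothetical (TagRoute, dispatch_start_tag(), the demo
handlers); it only illustrates the idea of a lookup table of tag-specific
functions with the generic callback as a fallback. Whether the generic
callback should also run when a specific one matches is a design choice;
this sketch assumes it should not.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

typedef void (*TagFn)(const char *name, size_t len, void *user);

typedef struct {
    const char *tag;    /* lowercase tag name, e.g. "script" */
    TagFn handler;      /* function to call for that tag */
} TagRoute;

/* Case-insensitive match of a length-delimited name against a C string. */
static int name_equals(const char *name, size_t len, const char *tag)
{
    size_t i;
    for (i = 0; i < len; i++) {
        if (tag[i] == '\0' || tolower((unsigned char)name[i]) != tag[i])
            return 0;
    }
    return tag[len] == '\0';
}

/* Returns 1 if a tag-specific handler ran, 0 if the generic one (or none) did. */
int dispatch_start_tag(const TagRoute *routes, size_t nroutes, TagFn generic,
                       const char *name, size_t len, void *user)
{
    size_t i;
    assert(name != NULL);
    for (i = 0; i < nroutes; i++) {
        if (name_equals(name, len, routes[i].tag)) {
            routes[i].handler(name, len, user);
            return 1;
        }
    }
    if (generic)
        generic(name, len, user);
    return 0;
}

/* Demo: route <script> to its own handler, everything else to the generic one. */
typedef struct { int script, generic; } RouteCounts;

static void on_script(const char *s, size_t n, void *u)  { (void)s; (void)n; ((RouteCounts *)u)->script++; }
static void on_generic(const char *s, size_t n, void *u) { (void)s; (void)n; ((RouteCounts *)u)->generic++; }

RouteCounts demo_route(const char *const *names, size_t count)
{
    static const TagRoute routes[] = { { "script", on_script } };
    RouteCounts c = {0, 0};
    size_t i;
    for (i = 0; i < count; i++)
        dispatch_start_tag(routes, 1, on_generic, names[i], strlen(names[i]), &c);
    return c;
}
```

A JavaScript hook would then be one more row in the table, leaving the
generic callbacks untouched.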

----- Original Message -----
From: "pAb-032871" <address@hidden>
To: <address@hidden>
Sent: Wednesday, July 26, 2000 1:04 AM
Subject: Re: lynx-dev Using other HTML parsers in Lynx


> In "lynx-dev Using other HTML parsers in Lynx"
> [25/Jul/2000 Tue 19:30:40]
> Mooneer Salem wrote:
>
> I don't know much about programming, but some of this is of interest
> to me.
>
> > For fun, I decided to write a library that parses HTML. :)
> >
> > After writing the library I decided to write a demo application using it
> > (which prints an English representation of the HTML passed into it and
> > which also acts as a benchmark app.) According to the demonstration
> > program, it parsed the PostgreSQL-HOWTO (300kb in HTML format) in
> > 0.16 seconds, which is pretty impressive for a Celeron 500 with 192MB RAM
> > which also acts as a pretty busy DNS, Web and database server.
> >
> > Here's the question: how hard would it be to implement this parser into
> > Lynx?
>
> Depends how integrated you want it to be.  Could be as easy as
> defining a new DOWNLOADER [your app] in lynx.cfg
>
> What really interests me is: who feels like teaching it to parse
> JavaScript as well [if you're releasing the source and leaving
> it open to changes]?
>
> People seem to think adding support for JavaScript isn't practical,
> and maybe impossible, in Lynx because of its one-pass HTML parsing.
> A separate application that *could* make sense of JavaScript,
> then pass the results to Lynx, popped into my head some time ago
> but I don't know how to do it.
>
> Here's an old message about it:
> http://www.flora.org/lynx-dev/html/month072000/msg00034.html
>
> The follow-ups are probably more interesting than my input.
>
>
> > The parser can be found at
> > http://devel.usnuk.net/libhtmlparse-0.1-alpha1.tar.gz,
> > and you can see the uncompressed archive at
> > http://devel.usnuk.net/libhtmlparse-0.1-alpha1/.
> > demo/test.c is the benchmark application I ran, while demo/prettyHTML is
> > another demo app I wrote for the library (it makes HTML more clear and
> > readable by adding tabs)
> >
> > --
> > Mooneer Salem
> > Sysadmin, Ultraspeed UK (http://www.ultraspeed.co.uk/)
> > GPLTrans (http://www.translator.cx/)
> > Personal Home Page (http://msalem.translator.cx/)
>
>
>
>           Patrick
> <mailto:address@hidden>
>
>
> ; To UNSUBSCRIBE: Send "unsubscribe lynx-dev" to address@hidden
>


