From: Rogelio Serrano
Subject: Re: GNUstep Web browser (was Re: WebKit Bounty)
Date: Mon, 5 Mar 2007 21:40:40 +0800
On 5 Mar 2007 02:53:09 -0800, hns@computer.org <hns@computer.org> wrote:
> or pass html through html tidy first. It appears unnecessary to me to go that way because it first parses HTML into a tree, then fixes some things and writes out HTML just to parse it again...
I thought that was a good trade-off...
> I have read through the rules html tidy uses and in most cases the following rules will have the same or a quite similar result (ok it needs more testing with badly designed pages):
>
> * if the closing tag does not match the opening tag, search outwards until you find one (if you don't find, ignore)
> * be lazy with missing quotes in tag attributes
> * convert all tag names and attribute names to upper case
> * ignore <html>, <head>, <body> (except for attributes)
> * some tags always go to the HEAD section (e.g. <title>, <meta>) wherever they appear
> * ignore unknown tags
>
> As soon as I have new more or less stable code, I will upload a snapshot and you can look into it.
>
> -- hns
That's a good start.
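For illustration, the six rules quoted above could be sketched roughly as follows. This is not hns's code (no snapshot has been posted yet); it is a minimal Python sketch, and the names `parse`, `dump`, `Node`, and the tag sets `KNOWN`, `HEAD_ONLY` and `VOID` are all hypothetical, chosen only to make the example self-contained. A real browser would need a far larger tag table.

```python
import re

# Hypothetical tag tables, only large enough for the example.
KNOWN = {"P", "B", "I", "A", "DIV", "SPAN", "BR", "IMG", "TITLE", "META"}
HEAD_ONLY = {"TITLE", "META"}   # always moved to the HEAD section
VOID = {"BR", "IMG", "META"}    # never have content or a closing tag

# A token is either a tag (<name ...> or </name>) or a run of text.
TOKEN = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)([^>]*)>|([^<]+)")
# Lazy attribute syntax: quoted, quoted without closing quote, or unquoted.
ATTR = re.compile(
    r"""([A-Za-z_:][-A-Za-z0-9_:.]*)"""
    r"""(?:\s*=\s*(?:"([^"]*)"?|'([^']*)'?|([^\s"'>]+)))?"""
)

class Node:
    def __init__(self, tag, attrs=None):
        self.tag = tag
        self.attrs = dict(attrs or {})
        self.children = []      # Node objects and plain strings

def parse_attrs(raw):
    attrs = {}
    for m in ATTR.finditer(raw):
        value = next((g for g in m.group(2, 3, 4) if g is not None), "")
        attrs[m.group(1).upper()] = value   # attribute names in upper case
    return attrs

def parse(html):
    root, head, body = Node("HTML"), Node("HEAD"), Node("BODY")
    root.children = [head, body]
    structural = {"HTML": root, "HEAD": head, "BODY": body}
    stack = [body]              # currently open elements, innermost last
    for m in TOKEN.finditer(html):
        closing, name, raw, text = m.groups()
        if text is not None:
            if text.strip():
                stack[-1].children.append(text)
            continue
        tag = name.upper()      # tag names in upper case
        if tag in structural:   # ignore <html>/<head>/<body>, keep attributes
            if not closing:
                structural[tag].attrs.update(parse_attrs(raw))
            continue
        if tag not in KNOWN:    # ignore unknown tags
            continue
        if closing:
            # Search outwards for a matching open tag; ignore if none.
            for i in range(len(stack) - 1, 0, -1):
                if stack[i].tag == tag:
                    del stack[i:]
                    break
            continue
        node = Node(tag, parse_attrs(raw))
        parent = head if tag in HEAD_ONLY else stack[-1]
        parent.children.append(node)
        if tag not in VOID:
            stack.append(node)
    return root

def dump(node):
    """Serialize the tree back to markup, for inspection only."""
    if isinstance(node, str):
        return node
    attrs = "".join(f' {k}="{v}"' for k, v in node.attrs.items())
    if node.tag in VOID:
        return f"<{node.tag}{attrs}>"
    inner = "".join(dump(c) for c in node.children)
    return f"<{node.tag}{attrs}>{inner}</{node.tag}>"
```

With this sketch, `dump(parse("<p>a</div>b</p>"))` keeps both letters inside the paragraph, because the stray `</div>` finds no open DIV and is dropped. Building the tree directly this way avoids the extra serialize-and-reparse round trip that running html tidy first would add.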
-- the thing i like with my linux pc is that i can sum up my complaints in 5 items