LYNX-DEV Re: new Lynx SGML.c parser

From:	Klaus Weide
Subject:	LYNX-DEV Re: new Lynx SGML.c parser
Date:	Wed, 23 Apr 1997 23:37:24 -0500 (CDT)

On Tue, 22 Apr 1997, Foteos Macrides wrote:

>       I had left P the way it was, going to HTML_start_element() for
> the end tag, instead of changing it as I did for FONT together with
> the mods for FORM, A, and the emphasis elements, because it should
> do the same things as for a P start tag, and count on the GridText.c
> paragraphing code, or my LYEnsure...Space() functions in LYCharUtils.c,
> to enforce the correct number of blank lines between that paragraph and
> whatever follows it, and case P: has a great deal of code.  You can
> simply add a case P: to HTML_end_element(), or rather, remove my
> #ifdef NOTDEFINED for it there, and simply copy all the case P: code
> from HTML_start_element().

OK, I copied some of the stuff from HTML_start_element() and modified
it a bit.  There is a problem, though: when a paragraph ends, we don't
know yet how much spaceBefore the next one wants (since it may not be
a P).  And (according to DefaultStyle.c) the empty line between
paragraphs is normally a property of the next paragraph, not the 
preceding one.

The lynx.browser.org page is an example: it has <P>...</P><UL>...
The empty line between the paragraph and the list disappears when
one removes the </P>.  I believe this is actually wrong and whether
the </P> is explicit or implicit shouldn't make any difference,
so I didn't try to reproduce this.  (Probably the code for UL in
HTML_start_element() should take care of adding space.)
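
To illustrate the point about spacing: a minimal sketch, assuming the gap between two blocks is decided only when the next block starts, from that block's own spaceBefore. The end-tag handler then adds no space, so an explicit </P> and an implied one give the same result. ToyStyle, ToyLayout, toy_end_block() and toy_start_block() are hypothetical names for this sketch; they are not the structures from DefaultStyle.c or the code in HTML.c.

```c
#include <stddef.h>

/* Hypothetical, simplified style record; not the one in DefaultStyle.c. */
typedef struct {
    int spaceBefore;            /* blank lines wanted before the block starts */
} ToyStyle;

/* Hypothetical layout state, for this sketch only. */
typedef struct {
    const ToyStyle *open;       /* block currently open, or NULL */
    int started;                /* number of blocks started so far */
    int blankLines;             /* blank lines emitted so far */
} ToyLayout;

/* End tag: close the block but emit no spacing, because the needs of
 * the next block are not known yet. Safe to call when nothing is open. */
static void toy_end_block(ToyLayout *l)
{
    l->open = NULL;
}

/* Start tag: imply the end of any open block, then emit the gap the
 * new block asks for. Returns the number of blank lines emitted. */
static int toy_start_block(ToyLayout *l, const ToyStyle *next)
{
    int gap;

    toy_end_block(l);           /* implied end tag, no-op if already closed */
    gap = (l->started > 0) ? next->spaceBefore : 0;
    l->blankLines += gap;
    l->started++;
    l->open = next;
    return gap;
}
```

With a list style whose spaceBefore is 1 (an assumed value), starting a P, optionally calling toy_end_block(), and then starting the list yields one blank line either way.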

> It would be better, though, to make it a
> function in LYCharUtils.c, so HTML.c doesn't creep back up to a size
> which makes some compilers choke on it.

It's not that much compared to some other cases...

>       You've created the situation where sp[0].tag_number becomes
> HTML_OPTION instead of staying HTML_SELECT, so I assume all you
> have to do in HTML_put_character() for the switch() on that stack
> element is add case HTML_OPTION: together with case HTML_SELECT:
> where it appends the character to the me->option chunk.  

Ahh yes, thanks for the tip.  Obvious now that you say it :)
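
For the archive, a rough sketch of the kind of change meant here, assuming a character handler that switches on the tag at the top of the element stack. ToyTag, ToyState and toy_put_character() are simplified stand-ins invented for this sketch, not the real HTML_put_character() or its structure; top_tag plays the role of sp[0].tag_number and option plays the role of the me->option chunk.

```c
#include <stddef.h>

/* Hypothetical tag numbers, standing in for HTML_SELECT, HTML_OPTION, etc. */
enum ToyTag { TOY_BODY, TOY_SELECT, TOY_OPTION };

/* Hypothetical state, for this sketch only. */
typedef struct {
    enum ToyTag top_tag;        /* stands in for sp[0].tag_number */
    char option[64];            /* stands in for the me->option chunk */
    size_t option_len;
    char body[64];              /* ordinary text outside SELECT */
    size_t body_len;
} ToyState;

/* Append c to a fixed buffer, keeping it NUL-terminated; drop on overflow. */
static void toy_append(char *buf, size_t cap, size_t *len, char c)
{
    if (*len + 1 < cap) {
        buf[(*len)++] = c;
        buf[*len] = '\0';
    }
}

static void toy_put_character(ToyState *me, char c)
{
    switch (me->top_tag) {
    case TOY_SELECT:
    case TOY_OPTION:            /* added case: OPTION is now on top of the stack */
        toy_append(me->option, sizeof me->option, &me->option_len, c);
        break;
    default:
        toy_append(me->body, sizeof me->body, &me->body_len, c);
        break;
    }
}
```

The only point is the shared case label: once OPTION can be on top of the stack, it has to collect characters the same way SELECT did.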

> I assume
> you're calling HTML_end_element() and popping the stack for implied
> or overt OPTION end tags, so the case HTML_SELECT: code should still
> work as intended.

Yes.

Here is a test that makes your code crash, but not regular 2.7.1:

<HTML><HEAD><TITLE>BLAH</HEAD>
<BODY>
aaa bbb ccc
<FORM METHOD=MAILTO ACTION="mailto:kweide";>
xxx yyy zzz
<FORM METHOD=MAILTO ACTION="mailto:kweide";>
select:
<SELECT>
garbage
<OPTION>option 1
<OPTION>option 2
</FORM>
Garbage after /FORM
</BODY>
Garbage after /BODY
</HTML>
Garbage after /HTML

Since FORM is now handled asynchronously with SELECT, I think
you should put a test for (me->inFORM) in the HTML_end_element()
handling for SELECT (I've done that), and maybe other places.
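
A minimal sketch of the guard, assuming a flag like me->inFORM that is cleared when the FORM ends. ToyForm, toy_end_form() and toy_end_select() are hypothetical names for this sketch and do not reproduce the actual HTML_end_element() code; the idea is just that the SELECT end handling must not touch form data once the enclosing FORM is gone.

```c
/* Hypothetical state, for this sketch only. */
typedef struct {
    int inFORM;                 /* plays the role of me->inFORM */
    int inSELECT;               /* nonzero while a SELECT is open */
    int selects_finished;       /* SELECTs successfully attached to a form */
} ToyForm;

/* </FORM>: the form is over, even if a SELECT inside it is still open. */
static void toy_end_form(ToyForm *me)
{
    me->inFORM = 0;
}

/* </SELECT>, explicit or implied.
 * Returns 1 if the SELECT was attached to the current form,
 * 0 if it was dropped because no SELECT or no FORM is open. */
static int toy_end_select(ToyForm *me)
{
    if (!me->inSELECT)
        return 0;               /* stray end tag: nothing to finish */
    me->inSELECT = 0;

    if (!me->inFORM)
        return 0;               /* FORM already ended: do not touch form data */

    me->selects_finished++;
    return 1;
}
```

In the test document above, </FORM> arrives while the SELECT is still open, so the later implied end of SELECT takes the early return instead of working on a form that no longer exists.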

  Klaus
