From: Dave Trombley
Subject: Parsing input from a stream... [Was: Re: Parsing input from a string...]
Date: Thu, 31 Jan 2002 14:47:30 -0500
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:0.9.2) Gecko/20010628

John W. Millaway wrote:
> In the next release, you will be able to change the initial buffer size. Currently, people do this with sed/perl by redefining YY_BUF_SIZE, which is by default 16k.

I've downloaded the developer's pre-release from your website, and I've been playing with that.

> What do you want to know about the buffering? By default, flex requests as many bytes "up front" as it can get, or as much as it needs to match something. Obviously, you can change this behavior by returning a different # of bytes from YY_INPUT than flex requested.

I suppose what I'm interested in is understanding how the yy_*buffer* functions work. Correct me if I'm mistaken, but they seem to assume to a large degree that files will be the underlying data source for the buffers (although there are functions specifically for copying strings into a new buffer), and more broadly, that all of the data will be available by the time the lexer entry point is reached. What I'd really like is a parser/lexer pair which is fully reentrant, with the lexer draining a stream until either the parse terminates or the stream is empty. In the latter case, I'd like the parser/lexer to block on stream input until more is available. It seems I could implement this in 2.5.6, especially given that you can pass extra data along in a reentrant lexer, but I'm having trouble because I don't know the exact contract for the buffer functions. (For example, should I assume that YY_INPUT will only ever be called from a single place? How can I access the extra data from that place? Is that data placed into a flex buffer? Is there a more low-level way of getting input to the lexer, since my MT buffers will be around anyway?)

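To make the partial-read idea concrete, here is a minimal C sketch of a read helper that hands back fewer bytes than the caller asked for, which is the behavior a custom YY_INPUT would rely on. The names `stream_source` and `stream_read` are hypothetical and not part of flex. The wiring shown in the trailing comment (`yyextra` used inside YY_INPUT, `%option extra-type`) reflects how current flex releases document reentrant scanners; whether 2.5.6 behaves the same way is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stream source: stands in for a socket, pipe, or MT queue. */
struct stream_source {
    const char *data;   /* bytes that have arrived so far */
    size_t      len;    /* number of valid bytes in data */
    size_t      pos;    /* index of the next unread byte */
    size_t      chunk;  /* most bytes handed out per call (simulates partial reads) */
};

/* Copy up to max_size bytes into buf and return the count; 0 means end of
 * input. A real version would block here (for example on read() or a
 * condition variable) rather than return 0 while the stream is still open. */
static size_t stream_read(struct stream_source *src, char *buf, size_t max_size)
{
    assert(src != NULL && buf != NULL);

    size_t avail = src->len - src->pos;
    size_t n = avail < max_size ? avail : max_size;
    if (n > src->chunk)
        n = src->chunk;          /* deliberately return less than requested */

    memcpy(buf, src->data + src->pos, n);
    src->pos += n;
    return n;
}

/* Assumed wiring in a reentrant scanner's definitions section:
 *
 *   %option reentrant
 *   %option extra-type="struct stream_source *"
 *   %{
 *   #define YY_INPUT(buf, result, max_size) \
 *       ((result) = stream_read(yyextra, (buf), (max_size)))
 *   %}
 */
```

Because the helper returns whatever is available, the scanner only sees as much input as the stream has produced; making it block instead of returning 0 is what would let the lexer wait for more data.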
Are there any plans/thoughts about making the buffer system extensible and abstract? Do you think it would be possible/desirable for me to attempt this, and could it be done without sacrificing performance?
Cheers, -dj