Re: better error reporting with bison and flex
From: Guillaume Rousse
Subject: Re: better error reporting with bison and flex
Date: Tue, 27 Jun 2006 10:07:28 +0200
User-agent: Thunderbird 1.5.0.4 (X11/20060618)
Tim Van Holder wrote:
> Guillaume Rousse wrote:
>> Which is really not meaningful for the user :/ I'd prefer to at least
>> display the current line, and avoid referring to internal grammar symbols.
>
> Part 2 - not referring to internal grammar symbols: use string aliases
> for your tokens, like so:
>
> %token <sometype> MY_IDENTIFIER "identifier"
> %token <sometype> MY_FUNC_ABS "abs()"
>
> The verbose error reporting will use those friendlier names, and you can
> use them in your grammar too, if you like:
>
> delete_statement : "DELETE" "identifier" "FROM" table_name "." ;
OK, thanks for the tip. It helped me realize I could also use literal
character tokens in the grammar.
> Part 1: showing the current line
>
> There's two ways, really:
> - if you always read from a file, and know the file name, you could
> open the file, skip (yylineno - 1) lines, read a line and show it.
> This is suboptimal and only possible in rare cases (usually you don't
> know what you're reading from), but doesn't require any scanner
> changes.
Unfortunately, messages are read from streams, not files.
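For a parser that does read from a named file, the first option might look roughly like the C sketch below. The helper name `read_source_line` is hypothetical (it does not come from Tim's scanner), and the sketch assumes the input is a regular file that can be reopened by name, which is exactly what is not available with streams.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: return a malloc'd copy of line `lineno` (1-based)
 * of the file `path`, without its line terminator, or NULL if the file
 * cannot be opened or has fewer lines. The caller frees the result. */
char *read_source_line(const char *path, int lineno)
{
    FILE *fp;
    char *line;
    size_t len = 0, cap = 64;
    int c = 0, current = 1;

    if (path == NULL || lineno < 1 || (fp = fopen(path, "r")) == NULL)
        return NULL;

    /* Skip the first (lineno - 1) lines. */
    while (current < lineno && (c = getc(fp)) != EOF)
        if (c == '\n')
            ++current;

    if (current < lineno || (line = malloc(cap)) == NULL) {
        fclose(fp);
        return NULL;
    }

    /* Copy characters up to the next newline or end of file. */
    while ((c = getc(fp)) != EOF && c != '\n') {
        if (len + 1 >= cap) {
            char *tmp = realloc(line, cap * 2);
            if (tmp == NULL) {
                free(line);
                fclose(fp);
                return NULL;
            }
            line = tmp;
            cap *= 2;
        }
        line[len++] = (char)c;
    }
    fclose(fp);

    /* Nothing read and end of file reached: the line does not exist. */
    if (len == 0 && c == EOF) {
        free(line);
        return NULL;
    }

    /* Drop a trailing carriage return left over from CRLF input. */
    if (len > 0 && line[len - 1] == '\r')
        --len;
    line[len] = '\0';
    return line;
}
```

An error routine could then call something like `read_source_line(filename, yylineno)` and print the result when it is not NULL; `filename` here stands for whatever the application knows about its input.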
> - start flex in a special state (say, BUFFER_LINE), in which an entire
> line is matched. Save yytext in your parser context. Then put the
> entire line back and switch to the INITIAL state. All that then
> remains is to switch to the BUFFER_LINE state when you scan a newline.
> (Even cleaner would be to enable a state stack in flex, and push the
> BUFFER_LINE at the start of a file and when a newline is seen; that
> way it doesn't interfere with any other states you might be using).
>
> Example (consider this pseudo-code; it's a heavily stripped and
> C++-to-C-ified version of one of my own scanners, and I may have
> stripped too much):
>
> %% // start of flex rules
>
>   {
>     /* Indented so that flex copies it to the top of the scanning routine. */
>     if (context->line == 0) { // Fresh file - set up the initial state
>       context->line = context->column = 1;
>       BEGIN(INITIAL);
>       yy_push_state(BUFFER_LINE);
>     }
>   }
>
> <BUFFER_LINE>^[^\r\n]*/{EOL}? { // Get & save entire line
>     free (context->current_line); // assumes current_line starts out NULL
>     context->current_line = strdup (yytext);
>     yy_pop_state ();
>     yyless (0); // put the whole line back for normal scanning
>     yy_set_bol (1);
>   }
>
> <*>{EOL} {
>     context->column = 1;
>     ++context->line;
>     yy_push_state(BUFFER_LINE);
>   }
>
> %%
>
> void
> yyerror(parser_context_t* context, const char* message)
> {
>   ++context->parse_errors;
>   if (yytext[0] == '\0')
>     complain(context->line, context->column - yyleng,
>              format(_("%s at empty token"), message));
>   else
>     complain(context->line, context->column - yyleng,
>              format(_("%s at token `%s'"), message,
>                     ((yytext[0] == '\n') ? _("[end-of-line]") :
>                      yytext)));
>   maybe_show_context_line(context);
> }
>
> Note that context is a structure that's set up as a lex & parse
> parameter (to avoid global variables).
> maybe_show_context_line() only prints something if current_line is set,
> and adds markers based on the column number and yyleng.
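Tim's maybe_show_context_line() itself is not shown, so the following is only a guess at what such a routine could look like in plain C. The names `build_marker` and `show_context_line` are hypothetical, and the sketch assumes a 1-based start column (context->column - yyleng in the code above) and a single-line token.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: fill `out` with padding up to `start_col` (1-based)
 * followed by `token_len` carets. Tabs in `line` are copied into the
 * padding so the carets stay aligned under the token. Returns the number
 * of characters written, not counting the terminating NUL. */
size_t build_marker(char *out, size_t outsz, const char *line,
                    int start_col, int token_len)
{
    size_t i, n = 0;
    size_t line_len = line ? strlen(line) : 0;

    if (outsz == 0)
        return 0;
    if (start_col < 1)
        start_col = 1;
    if (token_len < 1)
        token_len = 1;          /* always mark at least one column */

    for (i = 0; i + 1 < (size_t)start_col && n + 1 < outsz; ++i)
        out[n++] = (i < line_len && line[i] == '\t') ? '\t' : ' ';
    for (i = 0; i < (size_t)token_len && n + 1 < outsz; ++i)
        out[n++] = '^';
    out[n] = '\0';
    return n;
}

/* Print the buffered line and a marker line under it. Prints nothing
 * when no line has been buffered, as described above. */
void show_context_line(FILE *stream, const char *line,
                       int start_col, int token_len)
{
    char marker[256];

    if (line == NULL)
        return;
    build_marker(marker, sizeof marker, line, start_col, token_len);
    fprintf(stream, "%s\n%s\n", line, marker);
}
```

From yyerror() this might be called as show_context_line(stderr, context->current_line, context->column - yyleng, yyleng), assuming the column counter is advanced by the scanner for every token.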
I understand the idea. Unfortunately, line-based scanning is made
impossible by the fact that I have binary blobs embedded in text
streams, which may interfere with line delimiters :/ I'll have to stick
with standard bison error messages here.
--
Guillaume Rousse
Projet Estime, INRIA
Domaine de Voluceau
Rocquencourt - B.P. 105
78153 Le Chesnay Cedex - France