FYI: alias undeclared or declared after use
From: Joel E. Denny
Subject: FYI: alias undeclared or declared after use
Date: Fri, 18 Aug 2006 06:26:14 -0400 (EDT)
Bison incorrectly assumed that a string literal would never be declared as
an alias after its first use in the grammar. A grammar that did so caused an
assertion failure. Below is a committed patch that tests for and fixes this
bug.

As part of the fix, Bison now reports an error for any string literal that
is never declared as an alias. I don't see the need for an undeclared string
literal in a real parser: there's no token number to return from the
scanner. I figure allowing it was just an accident, although a few test
cases exploited it. If necessary, I believe that making undeclared string
literals possible again would require only a little more work.
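For readers who want to see the new numbering rule in isolation, here is a minimal standalone sketch in C. It is not Bison's code: `toy_symbol`, `toy_use_token`, `toy_make_alias`, `toy_check` and `toy_demo` are hypothetical names, and token numbering starts at 0 here for simplicity. It only mimics the idea in the patch below: a string literal gets no number of its own, it inherits the number of the token it aliases whenever the declaration appears, and a string literal left without an alias is reported.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define TOY_NUMBER_UNDEFINED (-1)

/* Hypothetical, heavily simplified stand-in for a grammar symbol. */
typedef struct toy_symbol
{
  const char *tag;            /* e.g. "'a'" or "\"A\"" */
  int number;                 /* token number, or TOY_NUMBER_UNDEFINED */
  struct toy_symbol *alias;   /* paired symbol, or NULL */
} toy_symbol;

static int toy_ntokens;
static int toy_complaints;

/* A tag that starts with a double quote is a string literal. */
static int
toy_is_string (const toy_symbol *s)
{
  return s->tag[0] == '"';
}

/* Seeing a token: only non-string tokens are numbered here. */
static void
toy_use_token (toy_symbol *s)
{
  if (s->number == TOY_NUMBER_UNDEFINED && !toy_is_string (s))
    s->number = toy_ntokens++;
}

/* Alias declaration: the string literal inherits the token's number. */
static void
toy_make_alias (toy_symbol *sym, toy_symbol *str)
{
  toy_use_token (sym);
  sym->alias = str;
  str->alias = sym;
  str->number = sym->number;
}

/* Complain about a string literal that was never declared as an alias. */
static void
toy_check (const toy_symbol *s)
{
  if (toy_is_string (s) && s->alias == NULL)
    {
      fprintf (stderr, "%s undeclared\n", s->tag);
      toy_complaints++;
    }
}

/* Models `start: 'a' "A" 'b';` followed by `%token 'a' "A";`.
   If include_undeclared is nonzero, "abc" is also used in the rule.
   Returns the number of complaints.  */
int
toy_demo (int include_undeclared)
{
  toy_symbol a = { "'a'", TOY_NUMBER_UNDEFINED, NULL };
  toy_symbol A = { "\"A\"", TOY_NUMBER_UNDEFINED, NULL };
  toy_symbol b = { "'b'", TOY_NUMBER_UNDEFINED, NULL };
  toy_symbol abc = { "\"abc\"", TOY_NUMBER_UNDEFINED, NULL };
  toy_symbol *all[] = { &a, &A, &b, &abc };
  size_t n = include_undeclared ? 4 : 3;
  size_t i;

  toy_ntokens = 0;
  toy_complaints = 0;

  for (i = 0; i < n; i++)       /* uses in the rule, in order */
    toy_use_token (all[i]);
  toy_make_alias (&a, &A);      /* declaration arrives after the uses */
  for (i = 0; i < n; i++)
    toy_check (all[i]);

  /* Only "pack" when nothing was reported, as the patch does.  */
  if (toy_complaints == 0)
    {
      for (i = 0; i < n; i++)
        assert (all[i]->number != TOY_NUMBER_UNDEFINED);
      assert (A.number == a.number);
    }
  return toy_complaints;
}
```

Calling `toy_demo (0)` from a `main` models the "String alias declared after use" test and reports nothing; `toy_demo (1)` models the "Undeclared string literal" test and reports `"abc"`. Skipping the packing step after a complaint matters because the undeclared literal never receives a number.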
Joel
Index: ChangeLog
===================================================================
RCS file: /sources/bison/bison/ChangeLog,v
retrieving revision 1.1550
diff -p -u -r1.1550 ChangeLog
--- ChangeLog 14 Aug 2006 22:40:33 -0000 1.1550
+++ ChangeLog 18 Aug 2006 10:05:17 -0000
@@ -1,3 +1,28 @@
+2006-08-18 Joel E. Denny <address@hidden>
+
+ Don't allow an undeclared string literal, but allow a string literal to
+ be used before its declaration.
+ * src/reader.c (check_and_convert_grammar): Don't invoke packgram if
+ symbols_pack complained.
+ * src/symtab.c (symbol_new): Don't count a string literal as a new
+ symbol.
+ (symbol_class_set): Don't count a string literal as a new token, and
+ don't assign it a symbol number since symbol_make_alias does that.
+ (symbol_make_alias): It's not necessary to decrement the symbol and
+ token counts anymore. Don't assume that an alias declaration occurs
+ before any uses of the identifier or string, and thus don't assert that
+ one of them has the highest symbol number so far.
+ (symbol_check_alias_consistency): Complain if there's a string literal
+ that wasn't declared as an alias.
+ (symbols_pack): Bail if symbol_check_alias_consistency failed since
+ symbol_pack asserts that every token has been assigned a symbol number
+ although undeclared string literals have not.
+ * tests/regression.at (String alias declared after use, Undeclared
+ string literal): New test case.
+ (Characters Escapes, Web2c Actions): Declare string literals as
+ aliases.
+ * tests/sets.at (Firsts): Likewise.
+
2006-08-14 Joel E. Denny <address@hidden>
In the grammar scanner, STRING_FINISH unclosed constructs and return
Index: src/reader.c
===================================================================
RCS file: /sources/bison/bison/src/reader.c,v
retrieving revision 1.267
diff -p -u -r1.267 reader.c
--- src/reader.c 29 Jul 2006 05:53:41 -0000 1.267
+++ src/reader.c 18 Aug 2006 10:05:17 -0000
@@ -630,7 +630,8 @@ check_and_convert_grammar (void)
symbols_pack ();
/* Convert the grammar into the format described in gram.h. */
- packgram ();
+ if (!complaint_issued)
+ packgram ();
/* The grammar as a symbol_list is no longer needed. */
LIST_FREE (symbol_list, grammar);
Index: src/symtab.c
===================================================================
RCS file: /sources/bison/bison/src/symtab.c,v
retrieving revision 1.75
diff -p -u -r1.75 symtab.c
--- src/symtab.c 14 Aug 2006 00:34:17 -0000 1.75
+++ src/symtab.c 18 Aug 2006 10:05:17 -0000
@@ -79,7 +79,8 @@ symbol_new (uniqstr tag, location loc)
if (nsyms == SYMBOL_NUMBER_MAXIMUM)
fatal (_("too many symbols in input grammar (limit is %d)"),
SYMBOL_NUMBER_MAXIMUM);
- nsyms++;
+ if (tag[0] != '"')
+ nsyms++;
return res;
}
@@ -266,7 +267,8 @@ symbol_class_set (symbol *sym, symbol_cl
if (class == nterm_sym && sym->class != nterm_sym)
sym->number = nvars++;
- else if (class == token_sym && sym->number == NUMBER_UNDEFINED)
+ else if (class == token_sym && sym->number == NUMBER_UNDEFINED
+ && sym->tag[0] != '"')
sym->number = ntokens++;
sym->class = class;
@@ -361,12 +363,7 @@ symbol_make_alias (symbol *sym, symbol *
sym->user_token_number = USER_NUMBER_ALIAS;
symval->alias = sym;
sym->alias = symval;
- /* sym and symval combined are only one symbol. */
- nsyms--;
- ntokens--;
- assert (ntokens == sym->number || ntokens == symval->number);
- sym->number = symval->number =
- (symval->number < sym->number) ? symval->number : sym->number;
+ symval->number = sym->number;
symbol_type_set (symval, sym->type_name, loc);
}
}
@@ -383,6 +380,9 @@ symbol_check_alias_consistency (symbol *
symbol *alias = this;
symbol *orig = this->alias;
+ if (this->tag[0] == '"' && !this->alias)
+ complain_at (this->location, _("%s undeclared"), this->tag);
+
/* Check only those that _are_ the aliases. */
if (!(this->alias && this->user_token_number == USER_NUMBER_ALIAS))
return;
@@ -723,6 +723,8 @@ symbols_pack (void)
symbols = xcalloc (nsyms, sizeof *symbols);
symbols_do (symbol_check_alias_consistency_processor, NULL);
+ if (complaint_issued)
+ return;
symbols_do (symbol_pack_processor, NULL);
symbols_token_translations_init ();
Index: tests/regression.at
===================================================================
RCS file: /sources/bison/bison/tests/regression.at,v
retrieving revision 1.105
diff -p -u -r1.105 regression.at
--- tests/regression.at 14 Aug 2006 20:51:33 -0000 1.105
+++ tests/regression.at 18 Aug 2006 10:05:18 -0000
@@ -489,7 +489,9 @@ AT_DATA_GRAMMAR([input.y],
void yyerror (const char *s);
int yylex (void);
%}
-[%%
+[%token QUOTES "\""
+%token TICK "'"
+%%
exp:
'\'' "\'"
| '\"' "\""
@@ -700,6 +702,10 @@ statement: struct_stat;
struct_stat: /* empty. */ | if else;
if: "if" "const" "then" statement;
else: "else" statement;
+%token IF "if";
+%token CONST "const";
+%token THEN "then";
+%token ELSE "else";
%%
]])
@@ -1108,3 +1114,48 @@ Stack now 0
]])
AT_CLEANUP
+
+
+
+## --------------------------------- ##
+## String alias declared after use. ##
+## --------------------------------- ##
+
+AT_SETUP([String alias declared after use])
+
+# Bison once incorrectly asserted that the symbol number for either a token or
+# its alias was the highest symbol number so far at the point of the alias
+# declaration. That was true unless the declaration appeared after their first
+# uses.
+
+AT_DATA([input.y],
+[[%%
+start: 'a' "A" 'b';
+%token 'a' "A";
+]])
+
+AT_CHECK([bison -t -o input.c input.y])
+
+AT_CLEANUP
+
+
+
+## --------------------------- ##
+## Undeclared string literal. ##
+## --------------------------- ##
+
+AT_SETUP([Undeclared string literal])
+
+# Bison once allowed a string literal to be used in the grammar without any
+# declaration assigning it as an alias of another token.
+
+AT_DATA([input.y],
+[[%%
+start: "abc";
+]])
+
+AT_CHECK([bison -t -o input.c input.y], [1], [],
+[[input.y:2.8-12: "abc" undeclared
+]])
+
+AT_CLEANUP
Index: tests/sets.at
===================================================================
RCS file: /sources/bison/bison/tests/sets.at,v
retrieving revision 1.21
diff -p -u -r1.21 sets.at
--- tests/sets.at 10 Dec 2005 00:25:27 -0000 1.21
+++ tests/sets.at 18 Aug 2006 10:05:18 -0000
@@ -196,6 +196,7 @@ AT_DATA([input.y],
[[%nonassoc '<' '>'
%left '+' '-'
%right '^' '='
+%token EXP "exp"
%%
exp:
exp '<' exp