From: Joel E. Denny
Subject: Re: [PATCH] Do not allow identifiers that start with a negative number.
Date: Sat, 29 Jan 2011 15:04:43 -0500 (EST)
User-agent: Alpine 2.00 (DEB 1167 2008-08-23)
On Thu, 27 Jan 2011, Joel E. Denny wrote:
> > 3.
> > > An identifier can be any sequence of letters, underscores, periods,
> > > dashes, and digits that does not start with a digit or dash.
> Based on yesterday's replies from Alex and Akim, it sounds like there's
> strong agreement to revert Paul Eggert's patch (bf3e44) rather than trying
> to apply additional patches on top of it.
I pushed that revert.
> writing a patch to implement #3, documenting the change, and updating
> and extending the test suite should be straight-forward. Unless someone
> else desires to do it (if so, let us know), I'll propose a patch,
> perhaps this weekend.
A patch against master is below, but it also belongs on branch-2.5. I'll
wait for feedback before pushing to either.
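For anyone reviewing the scanner change without building Bison, here is a rough Python sketch of the `id` pattern before and after the patch. It is an illustration only: Python's `re` is not flex, the names `OLD_ID`, `NEW_ID`, and `is_id` are made up for this example, and whole-string matching stands in for flex's tokenization.

```python
import re

# Same character class as "letter" in scan-gram.l and scan-code.l.
LETTER = r"[.A-Za-z_]"

# "id" as removed by this patch: leading dashes allowed, and a lone dash matches.
OLD_ID = re.compile(rf"-*(?:-|{LETTER}(?:{LETTER}|[-0-9])*)")

# "id" as added by this patch: must start with a letter, underscore, or period.
NEW_ID = re.compile(rf"{LETTER}(?:{LETTER}|[-0-9])*")


def is_id(pattern, text):
    """Return True if the whole string matches the given id pattern."""
    return pattern.fullmatch(text) is not None


for name in ["-", "-GOOD", ".GOOD", "push-pull", "1NV4L1D"]:
    print(f"{name!r}: old={is_id(OLD_ID, name)} new={is_id(NEW_ID, name)}")
```

Under these assumptions, `-` and `-GOOD` match only the old pattern, which lines up with the `invalid character: `-'` diagnostics that the updated tests/input.at expects.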
From 82f3355eaf8d5988391021262dc9acfa6485c098 Mon Sep 17 00:00:00 2001
From: Joel E. Denny <address@hidden>
Date: Sat, 29 Jan 2011 12:54:28 -0500
Subject: [PATCH] Do not allow identifiers that start with a dash.
This cleans up our previous fixes for a bug whereby Bison
discarded `.field' in `$-1.field'. The previous fixes were less
restrictive about where a dash could appear in an identifier, but
the restrictions were hard to explain. That bug was reported and
this final fix was originally suggested by Paul Hilfinger. This
also fixes a remaining bug reported by Paul Eggert whereby Bison
parses `%token ID -123' as `%token ID - 123' and handles `-' as an
identifier. Now, `-' cannot be an identifier. Discussed in
threads beginning at
<http://lists.gnu.org/archive/html/bug-bison/2011-01/msg00000.html>,
<http://lists.gnu.org/archive/html/bug-bison/2011-01/msg00004.html>.
* NEWS (2.5): Update entry describing the dash extension to
grammar symbol names. Also, move that entry before the named
references entry because the latter mentions the former.
* doc/bison.texinfo (Symbol): Update documentation for symbol
names. As suggested by Paul Eggert, mention the effect of periods
and dashes on named references.
(Decl Summary): Update documentation for unquoted %define values,
which, as a side effect, can no longer start with dashes either.
* src/scan-code.l (id): Implement.
* src/scan-gram.l (id): Implement.
* tests/actions.at (Exotic Dollars): Extend test group to exercise
bug reported by Paul Hilfinger.
* tests/input.at (Symbols): Update test group, and extend to
exercise bug reported by Paul Eggert.
* tests/named-refs.at (Stray symbols in brackets): Update test
group.
($ or @ followed by . or -): Likewise.
* tests/regression.at (Invalid inputs): Likewise.
---
ChangeLog | 33 +++++++++++++++++++++++++++++++++
NEWS | 16 ++++++++--------
doc/bison.texinfo | 18 +++++++++---------
src/scan-code.l | 2 +-
src/scan-gram.l | 2 +-
tests/actions.at | 46 ++++++++++++++++++++++++++++++++++++++++++++++
tests/input.at | 11 +++++++----
tests/named-refs.at | 27 ++++++++++++++++-----------
tests/regression.at | 3 ++-
9 files changed, 123 insertions(+), 35 deletions(-)
diff --git a/ChangeLog b/ChangeLog
index 96f75a1..3249635 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,36 @@
+2011-01-29 Joel E. Denny <address@hidden>
+
+ Do not allow identifiers that start with a dash.
+ This cleans up our previous fixes for a bug whereby Bison
+ discarded `.field' in `$-1.field'. The previous fixes were less
+ restrictive about where a dash could appear in an identifier, but
+ the restrictions were hard to explain. That bug was reported and
+ this final fix was originally suggested by Paul Hilfinger. This
+ also fixes a remaining bug reported by Paul Eggert whereby Bison
+ parses `%token ID -123' as `%token ID - 123' and handles `-' as an
+ identifier. Now, `-' cannot be an identifier. Discussed in
+ threads beginning at
+ <http://lists.gnu.org/archive/html/bug-bison/2011-01/msg00000.html>,
+ <http://lists.gnu.org/archive/html/bug-bison/2011-01/msg00004.html>.
+ * NEWS (2.5): Update entry describing the dash extension to
+ grammar symbol names. Also, move that entry before the named
+ references entry because the latter mentions the former.
+ * doc/bison.texinfo (Symbol): Update documentation for symbol
+ names. As suggested by Paul Eggert, mention the effect of periods
+ and dashes on named references.
+ (Decl Summary): Update documentation for unquoted %define values,
+ which, as a side effect, can no longer start with dashes either.
+ * src/scan-code.l (id): Implement.
+ * src/scan-gram.l (id): Implement.
+ * tests/actions.at (Exotic Dollars): Extend test group to exercise
+ bug reported by Paul Hilfinger.
+ * tests/input.at (Symbols): Update test group, and extend to
+ exercise bug reported by Paul Eggert.
+ * tests/named-refs.at (Stray symbols in brackets): Update test
+ group.
+ ($ or @ followed by . or -): Likewise.
+ * tests/regression.at (Invalid inputs): Likewise.
+
2011-01-24 Joel E. Denny <address@hidden>
* data/yacc.c: Fix last apostrophe warning from xgettext.
diff --git a/NEWS b/NEWS
index 593807c..0610449 100644
--- a/NEWS
+++ b/NEWS
@@ -62,6 +62,14 @@ Bison News
* Changes in version 2.5 (????-??-??):
+** Grammar symbol names can now contain non-initial dashes:
+
+ Consistently with directives (such as %error-verbose) and with
+ %define variables (e.g. push-pull), grammar symbol names may contain
+ dashes in any position except the beginning. This is a GNU
+ extension over POSIX Yacc. Thus, use of this extension is reported
+ by -Wyacc and rejected in Yacc mode (--yacc).
+
** Named references:
Historically, Yacc and Bison have supported positional references
@@ -157,14 +165,6 @@ Bison News
LAC is an experimental feature. More user feedback will help to
stabilize it.
-** Grammar symbol names can now contain dashes:
-
- Consistently with directives (such as %error-verbose) and variables
- (e.g. push-pull), grammar symbol names may include dashes in any
- position, similarly to periods and underscores. This is GNU
- extension over POSIX Yacc whose use is reported by -Wyacc, and
- rejected in Yacc mode (--yacc).
-
** %define improvements:
*** Can now be invoked via the command line:
diff --git a/doc/bison.texinfo b/doc/bison.texinfo
index cc1e064..8b96ad9 100644
--- a/doc/bison.texinfo
+++ b/doc/bison.texinfo
@@ -3123,12 +3123,13 @@ A @dfn{nonterminal symbol} stands for a class of syntactically
equivalent groupings. The symbol name is used in writing grammar rules.
By convention, it should be all lower case.
-Symbol names can contain letters, underscores, periods, dashes, and (not
-at the beginning) digits. Dashes in symbol names are a GNU
-extension, incompatible with POSIX Yacc. Terminal symbols
-that contain periods or dashes make little sense: since they are not
-valid symbols (in most programming languages) they are not exported as
-token names.
+Symbol names can contain letters, underscores, periods, and non-initial
+digits and dashes. Dashes in symbol names are a GNU extension, incompatible
+with POSIX Yacc. Periods and dashes make symbol names less convenient to
+use with named references, which require brackets around such names
+(@pxref{Named References}). Terminal symbols that contain periods or dashes
+make little sense: since they are not valid symbols (in most programming
+languages) they are not exported as token names.
There are three ways of writing terminal symbols in the grammar:
@@ -5039,9 +5040,8 @@ Define a variable to adjust Bison's behavior.
It is an error if a @var{variable} is defined by @code{%define} multiple
times, but see @ref{Bison Options,,-D @address@hidden
-@var{value} must be placed in quotation marks if it contains any
-character other than a letter, underscore, period, dash, or non-initial
-digit.
+@var{value} must be placed in quotation marks if it contains any character
+other than a letter, underscore, period, or non-initial dash or digit.
Omitting @code{"@var{value}"} entirely is always equivalent to specifying
@code{""}.
diff --git a/src/scan-code.l b/src/scan-code.l
index 3dd1044..6675719 100644
--- a/src/scan-code.l
+++ b/src/scan-code.l
@@ -86,7 +86,7 @@ splice (\\[ \f\t\v]*\n)*
named symbol references. Shall be kept synchronized with
scan-gram.l "letter" and "id". */
letter [.abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ_]
-id -*(-|{letter}({letter}|[-0-9])*)
+id {letter}({letter}|[-0-9])*
ref -?[0-9]+|{id}|"["{id}"]"|"$"
%%
diff --git a/src/scan-gram.l b/src/scan-gram.l
index 4129181..83d7650 100644
--- a/src/scan-gram.l
+++ b/src/scan-gram.l
@@ -119,7 +119,7 @@ static void unexpected_newline (boundary, char const *);
%x SC_BRACKETED_ID SC_RETURN_BRACKETED_ID
letter [.abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ_]
-id -*(-|{letter}({letter}|[-0-9])*)
+id {letter}({letter}|[-0-9])*
directive %{id}
int [0-9]+
diff --git a/tests/actions.at b/tests/actions.at
index 6f267af..24c6ac8 100644
--- a/tests/actions.at
+++ b/tests/actions.at
@@ -158,6 +158,52 @@ AT_PARSER_CHECK([./input], 0,
[[15
]])
+# Make sure that fields after $n or $-n are parsed correctly. At one
+# point while implementing dashes in symbol names, we were dropping
+# fields after $-n.
+AT_DATA_GRAMMAR([[input.y]],
+[[
+%{
+# include <stdio.h>
+ static int yylex (void);
+ static void yyerror (char const *msg);
+ typedef struct { int val; } stype;
+# define YYSTYPE stype
+%}
+
+%%
+start: one two { $$.val = $1.val + $2.val; } sum ;
+one: { $$.val = 1; } ;
+two: { $$.val = 2; } ;
+sum: { printf ("%d\n", $0.val + $-1.val + $-2.val); } ;
+
+%%
+
+static int
+yylex (void)
+{
+ return 0;
+}
+
+static void
+yyerror (char const *msg)
+{
+ fprintf (stderr, "%s\n", msg);
+}
+
+int
+main (void)
+{
+ return yyparse ();
+}
+]])
+
+AT_BISON_CHECK([[-o input.c input.y]])
+AT_COMPILE([[input]])
+AT_PARSER_CHECK([[./input]], [[0]],
+[[6
+]])
+
AT_CLEANUP
diff --git a/tests/input.at b/tests/input.at
index f223a33..8a71ff6 100644
--- a/tests/input.at
+++ b/tests/input.at
@@ -653,17 +653,20 @@ AT_BISON_CHECK([-o input.c input.y])
AT_COMPILE([input.o], [-c input.c])
-# Periods and dashes are genuine letters, they can start identifiers.
-# Digits cannot.
+# Periods are genuine letters, they can start identifiers.
+# Digits and dashes cannot.
AT_DATA_GRAMMAR([input.y],
[[%token .GOOD
-GOOD
1NV4L1D
+ -123
%%
-start: .GOOD -GOOD
+start: .GOOD GOOD
]])
AT_BISON_CHECK([-o input.c input.y], [1], [],
-[[input.y:11.10-16: invalid identifier: `1NV4L1D'
+[[input.y:10.10: invalid character: `-'
+input.y:11.10-16: invalid identifier: `1NV4L1D'
+input.y:12.10: invalid character: `-'
]])
AT_CLEANUP
diff --git a/tests/named-refs.at b/tests/named-refs.at
index 74549c6..3c7b072 100644
--- a/tests/named-refs.at
+++ b/tests/named-refs.at
@@ -446,13 +446,14 @@ AT_SETUP([Stray symbols in brackets])
AT_DATA_GRAMMAR([test.y],
[[
%%
-start: foo[ /* aaa */ *&-+ ] bar
+start: foo[ /* aaa */ *&-.+ ] bar
{ s = $foo; }
]])
AT_BISON_CHECK([-o test.c test.y], 1, [],
[[test.y:11.23: invalid character in bracketed name: `*'
test.y:11.24: invalid character in bracketed name: `&'
-test.y:11.26: invalid character in bracketed name: `+'
+test.y:11.25: invalid character in bracketed name: `-'
+test.y:11.27: invalid character in bracketed name: `+'
]])
AT_CLEANUP
@@ -570,23 +571,27 @@ AT_DATA([[test.y]],
%%
start:
.field { $.field; }
-| -field { @-field; }
| 'a' { @.field; }
-| 'a' { $-field; }
;
.field: ;
--field: ;
]])
AT_BISON_CHECK([[test.y]], [[1]], [],
[[test.y:4.12-18: invalid reference: `$.field'
test.y:4.13: syntax error after `$', expecting integer, letter, `_', `@<:@', or `$'
test.y:4.3-8: possibly meant: $[.field] at $1
-test.y:5.12-18: invalid reference: `@-field'
+test.y:5.12-18: invalid reference: `@.field'
test.y:5.13: syntax error after `@', expecting integer, letter, `_', `@<:@', or `$'
-test.y:5.3-8: possibly meant: @[-field] at $1
-test.y:6.12-18: invalid reference: `@.field'
-test.y:6.13: syntax error after `@', expecting integer, letter, `_', `@<:@', or `$'
-test.y:7.12-18: invalid reference: `$-field'
-test.y:7.13: syntax error after `$', expecting integer, letter, `_', `@<:@', or `$'
+]])
+AT_DATA([[test.y]],
+[[
+%%
+start:
+ 'a' { $-field; }
+| 'b' { @-field; }
+;
+]])
+AT_BISON_CHECK([[test.y]], [[0]], [],
+[[test.y:4.9: warning: stray `$'
+test.y:5.9: warning: stray `@'
]])
AT_CLEANUP
diff --git a/tests/regression.at b/tests/regression.at
index a49d1df..5e0ead0 100644
--- a/tests/regression.at
+++ b/tests/regression.at
@@ -392,7 +392,8 @@ input.y:3.14: invalid character: `}'
input.y:4.1: invalid character: `%'
input.y:4.2: invalid character: `&'
input.y:5.1-17: invalid directive: `%a-does-not-exist'
-input.y:6.1-2: invalid directive: `%-'
+input.y:6.1: invalid character: `%'
+input.y:6.2: invalid character: `-'
input.y:7.1-8.0: missing `%}' at end of file
input.y:7.1-8.0: syntax error, unexpected %{...%}
]])
--
1.7.0.4
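
As a reading aid for the tests/actions.at and tests/named-refs.at changes, the following Python sketch models how the `ref` pattern in scan-code.l might consume the text that follows a `$` or `@`. It is a simplification under stated assumptions: flex's longest-match rule is emulated by trying each alternative and keeping the longest, the helper `longest_ref` is hypothetical, and the surrounding scanner rules (type tags, error reporting) are not modeled.

```python
import re

# "letter" and the new "id" from the patch, rewritten for Python's re.
LETTER = r"[.A-Za-z_]"
ID = rf"{LETTER}(?:{LETTER}|[-0-9])*"

# Alternatives of: ref  -?[0-9]+|{id}|"["{id}"]"|"$"
REF_ALTERNATIVES = [r"-?[0-9]+", ID, rf"\[{ID}\]", r"\$"]


def longest_ref(text):
    """Return the longest prefix of text matched by any ref alternative.

    An empty string means nothing matched, so the preceding `$' or `@'
    would be left stray.
    """
    best = ""
    for alternative in REF_ALTERNATIVES:
        match = re.match(alternative, text)
        if match and len(match.group(0)) > len(best):
            best = match.group(0)
    return best


for tail in ["-1.val", "0.val", "-field", "[a-b]"]:
    print(f"{tail!r} -> {longest_ref(tail)!r}")
```

In this model, `-1.val` yields the reference `-1` and leaves `.val` for the C code, while `-field` yields no reference at all, consistent with the `stray `$'` and `stray `@'` warnings in the new tests/named-refs.at case.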