help-bison
Re: bison for nlp


From: Akim Demaille
Subject: Re: bison for nlp
Date: Tue, 20 Nov 2018 20:19:43 +0100

Hi!

> On 19 Nov 2018, at 16:09, r0ller <address@hidden> wrote:
> 
> Hi Akim,
> 
> I managed to take the first step and get it running but it wasn’t as easy as 
> I thought.

Sorry about that :(

> First, I wanted to take the approach that the 'Simple C++ Example' 
> demonstrates in the bison manual. However, I could not figure out what my 
> yylex() should return when defining api.value.type variant.

What exactly was unclear here?  I recommend using api.token.constructor,
but if you don’t, see this bit of the documentation:

https://www.gnu.org/software/bison/manual/bison.html#Split-Symbols

The example is:

[0-9]+   {
           yylval->emplace<int> (text_to_int (yytext));
           return yy::parser::token::INTEGER;
         }
[a-z]+   {
           yylval->emplace<std::string> (yytext);
           return yy::parser::token::IDENTIFIER;
         }
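
To make the split-symbol contract concrete outside of Flex, here is a minimal
sketch.  Everything in namespace sketch is a stand-in I made up for
illustration: std::variant plays the role of the generated
yy::parser::semantic_type, and the input/pos parameters replace yytext (a real
yylex would take only the yylval pointer).  The point is only that the token
kind travels in the return value while the value is emplaced with an explicit
type.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <variant>

namespace sketch {

// Stand-in for the Bison-generated yy::parser (an assumption for this
// sketch; the real class comes from the generated header).
struct parser {
  using semantic_type = std::variant<std::monostate, int, std::string>;
  struct token {
    enum yytokentype { END = 0, INTEGER = 258, IDENTIFIER = 259 };
  };
};

bool is_space(char c) { return c == ' ' || c == '\t' || c == '\n'; }
bool is_digit(char c) { return c >= '0' && c <= '9'; }
bool is_lower(char c) { return c >= 'a' && c <= 'z'; }

// Advance pos while the predicate holds; returns the new position.
template <typename Pred>
std::size_t skip_while(const std::string& s, std::size_t pos, Pred pred) {
  while (pos < s.size() && pred(s[pos])) ++pos;
  return pos;
}

// Split-symbol scanner: the token kind is the return value, the semantic
// value is stored through yylval.  input/pos are hypothetical parameters
// that replace Flex's yytext in this sketch.
int yylex(parser::semantic_type* yylval, const std::string& input,
          std::size_t& pos) {
  assert(yylval != nullptr);
  pos = skip_while(input, pos, is_space);
  if (pos == input.size()) return parser::token::END;

  const std::size_t start = pos;
  if (is_digit(input[pos])) {
    pos = skip_while(input, pos, is_digit);
    yylval->emplace<int>(std::stoi(input.substr(start, pos - start)));
    return parser::token::INTEGER;
  }
  if (is_lower(input[pos])) {
    pos = skip_while(input, pos, is_lower);
    yylval->emplace<std::string>(input.substr(start, pos - start));
    return parser::token::IDENTIFIER;
  }
  // Any other character is returned as its own (valueless) token.
  ++pos;
  return static_cast<unsigned char>(input[start]);
}

}  // namespace sketch
```

The explicit template argument matters: the type to construct cannot be
deduced from the arguments alone.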


> The question is, in case of having many tokens like I do, how do I decide 
> which shall be returned, as bison now generates a symbol_type make_TOKEN() 
> for each token which I shall be able to return in yylex().

I’m not sure I understand the question.  The signature of yylex is completely
different with and without api.token.constructor, so it’s all or nothing: you
can’t migrate incrementally on this aspect.

I certainly agree this is a painful migration, but I do believe you’ll find it 
is worth it.

> Though, I'd rather not put a huge switch() in yylex(). Is there any other 
> solution like defining a "dummy" token like
> 
> %token <int> INT;
> 
> whose constructor make_INT(const int&) would simply return the int passed to 
> it? Or shall I simply try to cast the integers of my tokens to symbol_type?

Once you have decided to move to api.token.constructor, use make_FOO and only
make_FOO.
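
If the worry is a huge switch, one possible way around it (a sketch, not
something the manual prescribes) is a lookup table of maker functions.  The
types and functions below are hypothetical stand-ins: the real symbol_type and
make_FOO functions are generated inside yy::parser, and their signatures depend
on each token's declared type and on whether locations are enabled.  The sketch
assumes every token carries an int, so that all makers share one signature.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

namespace sketch {

// Stand-ins for what Bison generates under api.token.constructor
// (assumptions: the real symbol_type and make_FOO live in yy::parser).
enum class kind { NOUN, VERB, UNKNOWN };
struct symbol_type {
  kind k;
  int value;
};
symbol_type make_NOUN(int v) { return {kind::NOUN, v}; }
symbol_type make_VERB(int v) { return {kind::VERB, v}; }
symbol_type make_UNKNOWN(int v) { return {kind::UNKNOWN, v}; }

// Every maker shares one signature here, so a table can replace a switch.
using maker = symbol_type (*)(int);

// Pick the maker by a tag computed by the scanner; fall back to UNKNOWN.
symbol_type make_symbol(const std::string& tag, int value) {
  static const std::unordered_map<std::string, maker> makers = {
      {"NOUN", &make_NOUN},
      {"VERB", &make_VERB},
  };
  const auto it = makers.find(tag);
  return it == makers.end() ? make_UNKNOWN(value) : it->second(value);
}

// Usage: what a yylex() body might do once it has classified a word.
void usage_example() {
  const symbol_type s = make_symbol("VERB", 3);
  assert(s.k == kind::VERB && s.value == 3);
}

}  // namespace sketch
```

Either way, the scanner returns only what a make_FOO function produced; it
never casts a raw integer to symbol_type.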



> The other problem I ran into was related to the non-terminals: wherever I 
> wanted to read the value of a symbol in an action via e.g. $1, I got an error 
> about type conversion as it could not be converted any more to an integer as 
> in the C parser.

Sorry, I don’t understand what you mean.  If you declared the nterm to be an 
int, then $1 is an int.
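
For instance, a minimal grammar fragment along these lines (INTEGER and expr
are made-up names for illustration, not taken from your grammar) gives typed
access to $1 and $3 without any cast:

```yacc
%require "3.2"
%language "c++"
%define api.value.type variant

%token <int> INTEGER   /* terminal carrying an int */
%type  <int> expr      /* nonterminal carrying an int */

%%

expr:
  INTEGER           { $$ = $1; }        /* $1 is an int */
| expr '+' INTEGER  { $$ = $1 + $3; }   /* $1 and $3 are ints */
;
```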


> For this I have only one guess namely, that each non-terminal needs a %type 
> declaration like
> 
> %type <int> ENG_Con;
> 
> and even the = operator needs to be defined for it, right? So here I got 
> stuck at least with regards to api.value.type variant.

I would need more details (read: code) to understand your problem.



> Then I decided to take a step back and not to use complete symbols but split 
> symbols for a first try. This I managed to figure out and make it work but 
> with a small hack as I declared yylex as:
> 
> int yylex(int* yylval);
> 
> If I did it like:
> 
> int yylex(semantic_type* yylval);
> 
> the compiler kept complaining about not knowing semantic_type (nor 
> parser::semantic_type, nor yy::parser::semantic_type). So I took clang’s hint 
> when it said semantic_type* is aka int* and it worked.

Woot?  I doubt that ints are the only semantic values you use?

If the compiler did not know about yy::parser::semantic_type, maybe it’s 
because you used %{…%} instead of %code requires {…}.  Please, be careful when 
imitating simple.yy and the others.
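
A sketch of the layout I have in mind, modelled on simple.yy but adapted to
split symbols (the yylex declaration here is my assumption about what you
need, not a copy of your code):

```yacc
%require "3.2"
%language "c++"
%define api.value.type variant

%code requires {
  /* Dependencies of the semantic value types. */
  #include <string>
}

%code {
  /* Emitted where yy::parser is already declared. */
  namespace yy {
    int yylex (parser::semantic_type* yylval);
  }
}
```

As I understand it, code in %{…%} lands before the parser class is declared,
which would explain why the compiler does not know semantic_type there.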


> In the end, to make my hack a bit nicer, I added %define api.value.type 
> {int}. 

So you really just have ints?  Then, yes, variants are overkill.  Unless you 
already know that at some point you will have more than ints.



