help-bison

Re: bison for nlp


From: r0ller
Subject: Re: bison for nlp
Date: Wed, 21 Nov 2018 11:07:46 +0100 (CET)

Hi Akim,

Thanks again! After your answer I felt pretty dumb, as I had to realize that 
the problems I mentioned were most probably due to my carelessness. For 
example, when I followed the Simple C++ Example I defined only api.value.type 
variant and not api.token.constructor, and on top of that I did not notice 
that the code in the prologue has to be put in %code {...} sections. As you 
pointed out, the compiler did not recognize yy::parser::semantic_type because 
I had my code simply in the prologue, i.e. %{…%}. So that is most probably why 
I got all those errors. This is actually one thing that could be pointed out 
in the Simple C++ Example.

The other thing that could be added to the C++ part is a Simplest C++ 
(migration) Example for those (like me) who migrate from a C parser using 
ints, just to show that there's almost nothing to do to get a C++ parser. The 
Simplest C++ Example is fine for those who have a C parser with unions, as far 
as I can see.

To answer your question concerning using ints: it has historical reasons. When 
I started the project (with yacc) I was not familiar with parser generators 
(and am still not as knowledgeable as I'd like to be), so, not knowing what is 
possible and what is not, I created a simple parser design and coded the rest 
in C (later C++) to get the functionality I needed. But the variants will be a 
great help once I manage to fulfill all the prerequisites, as the migration 
path for me has just begun :) I don't want to turn everything upside down in 
one step. So I first need to get rid of numbering the tokens, and then I'll 
see what comes afterwards. Anyway, I'm already happy that I now have a C++ 
parser and 5 fewer warnings when compiling the project. I'm pretty sure that 
I'll come back with questions during the migration.

By the way, I've recently added English support to the Android app, so if you 
check out the project page and install it (for free) via the link pointing to 
the Play Store, you'll be able to search for contacts and make phone calls by 
voice control, even offline, with a bison parser running under the hood :)

Best regards,
r0ller
 
-------- Original message --------
From: Akim Demaille <address@hidden>
Date: 20 November 2018, 20:19:50
Subject: Re: bison for nlp
To: r0ller <address@hidden>
 
Hi!
> On 19 Nov 2018, at 16:09, r0ller <address@hidden> wrote:
>
> Hi Akim,
>
> I managed to take the first step and get it running but it wasn’t as easy as 
> I thought.
Sorry about that :(
> First, I wanted to take the approach that the 'Simple C++ Example' 
> demonstrates in the bison manual. However, I could not figure out what my 
> yylex() should return when defining api.value.type variant.
What exactly was not clear on this? I don’t recommend not using 
api.token.constructor, but if you don’t, see this bit of the documentation:
https://www.gnu.org/software/bison/manual/bison.html#Split-Symbols
The example is:
[0-9]+ {
  yylval->emplace<int> (text_to_int (yytext));
  return yy::parser::token::INTEGER;
}
[a-z]+ {
  yylval->emplace<std::string> (yytext);
  return yy::parser::token::IDENTIFIER;
}
> The question is, in case of having many tokens like I do, how do I decide 
> which shall be returned, as bison now generates a symbol_type make_TOKEN() 
> for each token which I shall be able to return in yylex().
I’m not sure I understand the question. The signature of yylex is completely 
different with and without api.token.constructor. So it’s all or nothing, you 
can’t expect to be incremental on this aspect.
I certainly agree this is a painful migration, but I do believe you’ll find it 
is worth it.
> Though, I'd rather not put a huge switch() in yylex(). Is there any other 
> solution like defining a "dummy" token like
>
> %token <int> INT;
>
> whose constructor make_INT(const int&) would simply return the int passed to 
> it? Or shall I simply try to cast the integers of my tokens to symbol_type?
Once you decided to move to api.token.constructor, use make_FOO and only 
make_FOO.
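
One way to follow this advice without a huge switch() is a lookup table that maps each word to the make_FOO call that builds its symbol. The following is only a minimal sketch, not code from this thread: symbol_type, the token kinds and the make_* functions are hypothetical stand-ins for what Bison generates as members of yy::parser when api.token.constructor is defined, and the three-word lexicon is invented for illustration.

```cpp
#include <functional>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for the generated yy::parser::symbol_type.
struct symbol_type {
  int kind;   // which token this is
  int value;  // semantic value (a plain int in this sketch)
};

// Hypothetical token kinds; a real parser takes these from the grammar.
enum token_kind { END = 0, VERB = 1, NOUN = 2, UNKNOWN = 3 };

// Hypothetical stand-ins for the generated make_FOO functions.
symbol_type make_END() { return {END, 0}; }
symbol_type make_VERB(int id) { return {VERB, id}; }
symbol_type make_NOUN(int id) { return {NOUN, id}; }
symbol_type make_UNKNOWN() { return {UNKNOWN, 0}; }

using factory = std::function<symbol_type()>;

// Invented lexicon: each word is bound to the factory call for its token.
const std::unordered_map<std::string, factory>& lexicon() {
  static const std::unordered_map<std::string, factory> table = {
      {"call", [] { return make_VERB(1); }},
      {"find", [] { return make_VERB(2); }},
      {"contact", [] { return make_NOUN(1); }},
  };
  return table;
}

// Looks the word up and lets the stored factory build the symbol, so the
// scanner needs no switch over token numbers.
symbol_type lex_word(const std::string& word) {
  if (word.empty()) return make_END();
  auto it = lexicon().find(word);
  return it != lexicon().end() ? it->second() : make_UNKNOWN();
}
```

With a real generated parser, the table values would call yy::parser::make_VERB and friends, and yylex would return the resulting yy::parser::symbol_type directly.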
> The other problem I ran into was related to the non-terminals: wherever I 
> wanted to read the value of a symbol in an action via e.g. $1, I got an error 
> about type conversion as it could not be converted any more to an integer as 
> in the C parser.
Sorry, I don’t understand what you mean. If you declared the nterm to be an 
int, then $1 is an int.
> For this I have only one guess namely, that each non-terminal needs a %type 
> declaration like
>
> %type <int> ENG_Con;
>
> and even the = operator needs to be defined for it, right? So here I got 
> stuck at least with regards to api.value.type variant.
I would need more details (read: code) to understand your problem.
> Then I decided to take a step back and not to use complete symbols but split 
> symbols for a first try. This I managed to figure out and make it work but 
> with a small hack as I declared yylex as:
>
> int yylex(int* yylval);
>
> If I did it like:
>
> int yylex(semantic_type* yylval);
>
> the compiler kept complaining about not knowing semantic_type (nor 
> parser::semantic_type, nor yy::parser::semantic_type). So I took clang’s hint 
> when it said semantic_type* is aka int* and it worked.
Woot? I doubt that you are only using ints as semantic value???
If the compiler did not know about yy::parser::semantic_type, maybe it’s 
because you used %{…%} instead of %code requires {…}. Please, be careful when 
imitating simple.yy and the others.
> In the end, to make my hack a bit nicer, I added %define api.value.type 
> {int}.
So you really just have ints? Then, yes, variants are overkill. Unless you 
already know that at some point you will have more than ints.
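
To make the int-only case concrete, here is a minimal sketch of a split-symbol style yylex, assuming %define api.value.type {int} so that the semantic type is just an alias for int (which matches clang's "aka int*" hint quoted above). The alias, the token numbers, the input buffer and the cursor are all hypothetical; a real scanner would take the token values and the type from the header that Bison generates.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Assumption: mirrors yy::parser::semantic_type under api.value.type {int}.
using semantic_type = int;

// Hypothetical token numbers; the real ones come from the generated header.
enum token_kind { END_OF_INPUT = 0, INTEGER = 258 };

// Hypothetical input buffer and cursor, for illustration only.
static std::string input;
static std::size_t pos = 0;

// Split-symbol interface: the token kind is the return value and the
// semantic value is stored through the yylval pointer.
int yylex(semantic_type* yylval) {
  // Skip whitespace between tokens.
  while (pos < input.size() &&
         std::isspace(static_cast<unsigned char>(input[pos])))
    ++pos;
  if (pos >= input.size()) return END_OF_INPUT;

  // A run of digits becomes one INTEGER token carrying its numeric value.
  if (std::isdigit(static_cast<unsigned char>(input[pos]))) {
    int value = 0;
    while (pos < input.size() &&
           std::isdigit(static_cast<unsigned char>(input[pos])))
      value = value * 10 + (input[pos++] - '0');
    *yylval = value;
    return INTEGER;
  }

  // Any other character is returned as its own character code.
  return static_cast<unsigned char>(input[pos++]);
}
```

Once the semantic values stop being plain ints, this pointer-based interface is the part that would have to change, which is where variants and the make_FOO functions come in.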