bug-bison

Re: Enhancement request: enabling Variant in C parsers


From: Akim Demaille
Subject: Re: Enhancement request: enabling Variant in C parsers
Date: Sun, 19 Aug 2018 09:27:41 +0200

Hi Victor,

Please, keep the CC to bug-bison.

> On 18 Aug 2018, at 22:52, Victor Khomenko <address@hidden> wrote:
> 
>>> %code requires
>>> {
>>>     struct my_value {
>>>             enum{...} kind;
>>>             union{...} u;
>>>     };
>>> }
>>> %define api.value.type {struct my_value}
>>> %token <u.ival> INT "integer"
>>> %token <u.sval> STR "string"
>> 
>> I’m not sure I understand how this will really help you put objects on the
>> stack: (non-trivial) objects cannot go into a union.  I must be missing
>> your point.
> 
> Since C++11 one can put non-POD types into the union, see 
> https://en.cppreference.com/w/cpp/language/union

It does simplify the implementation, you are right, compared to what we did for 
C++98.
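
For readers who have not used this C++11 feature, here is a minimal sketch of a tagged union with a non-trivial member.  The names my_value, kind_type, ival and sval are made up for the illustration (they only echo the snippet quoted above), and copying is deliberately disabled to keep it short, so this is not a drop-in value type for any skeleton.  The point is that the language now accepts the std::string member, but the enclosing type still has to construct and destroy it by hand.

```cpp
#include <string>
#include <utility>

// Hypothetical semantic value: a tagged union of an int and a std::string.
struct my_value {
  enum kind_type { INT, STR } kind;
  union {
    int ival;
    std::string sval;  // legal since C++11, but never managed automatically
  };

  // Each constructor initializes exactly one member of the anonymous union.
  explicit my_value(int i) : kind(INT), ival(i) {}
  explicit my_value(std::string s) : kind(STR), sval(std::move(s)) {}

  // Copying would need a switch on `kind`; disabled here for brevity.
  my_value(const my_value&) = delete;
  my_value& operator=(const my_value&) = delete;

  // The union does not know which member is alive, so we destroy it ourselves.
  ~my_value() {
    if (kind == STR)
      sval.~basic_string();
  }
};
```

All the bookkeeping on `kind` is exactly what a variant type does for you.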

> However, Variants are much better. My plan was to make something like:
> 
> %define api.value.type {my_variant<int,string>}
> %token <get<int>()> INT "integer"
> %token <get<string>()> STR "string"
> 
> Here my_variant is a type based on std::variant, see
> https://en.cppreference.com/w/cpp/utility/variant
> The tricky bit is that the values returned by get<T>() can be on both the
> l.h.s. and the r.h.s. of an assignment. This can be handled by returning a
> proxy object with an overloaded operator=.
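
A rough sketch of what such a proxy might look like, assuming C++17 and std::variant.  Everything here (my_variant, proxy, the std::monostate "empty" state) is hypothetical and only illustrates the idea quoted above; it is not something Bison provides.

```cpp
#include <string>
#include <utility>
#include <variant>

// Hypothetical wrapper around std::variant; monostate models "no value yet".
template <typename... Ts>
class my_variant {
  using storage_type = std::variant<std::monostate, Ts...>;

public:
  // Returned by get<T>(): usable on both sides of an assignment.
  template <typename T>
  class proxy {
  public:
    explicit proxy(storage_type& s) : storage_(s) {}

    // l.h.s. use: make T the active alternative and store the value.
    proxy& operator=(T value) {
      storage_.template emplace<T>(std::move(value));
      return *this;
    }

    // proxy = proxy: read the source first, then store into the target.
    proxy& operator=(const proxy& other) {
      T copy = other;
      return *this = std::move(copy);
    }

    // r.h.s. use: throws std::bad_variant_access if T is not the active type.
    operator T&() const { return std::get<T>(storage_); }

  private:
    storage_type& storage_;
  };

  template <typename T>
  proxy<T> get() { return proxy<T>(storage_); }

private:
  storage_type storage_;
};
```

With this, `a.get<int>() = 7;`, `int x = a.get<int>();` and `b.get<int>() = a.get<int>();` all compile, which is the l.h.s./r.h.s. symmetry described in the quoted paragraph.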



>> And really, I would like to understand what makes you think it is
>> advantageous to develop complex storage types for C++ with the C
>> skeletons, rather than using the C++ skeleton.
> 
> 1. I think enabling variant option for C parsers would make it really simple 
> for users (not sure about how much development effort is required - hopefully 
> not much given that bison already has variants). 

Bison’s variants are for C++; I don’t think porting this to C would be 
trivial.  They are made to live in a C++ container, not a C array.  Existing 
variants show a path though, granted.

Please, do note that I do not plan to accept variants for the C skeletons in 
Bison per se (unless it turns out to be a straightforward implementation 
reusing most of what is already in there).  As a maintainer, I’m also in charge 
of making sure we can keep the boat floating for the years to come.  We already 
have quite a high matrix of possibilities to check in the test suite.  We 
already have to maintain features that were submitted by contributors who are 
no longer here today for the maintenance.


> 2. I think the cleanest interface for simple parsers is a function call (I 
> use re-entrant C parsers). Then one does not have to fight with mutual 
> #includes, and can simply declare the parser function without any #includes. 
> I realise this is a matter of preference, though.

It seems to me that you are comparing

    yyparse();

with

    yy::parser p;
    p.parse();


I agree some of the choices made for the header file in the C++ skeletons were 
not ideal, but we’re improving, and feedback will help us improve further.
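
If the function-call interface is what matters, one possible arrangement is to hide the parser class behind a plain function, so that callers only need a one-line declaration.  The sketch below uses a tiny stand-in for yy::parser so that it compiles on its own; in a real project the class would come from the header generated by the C++ skeleton, and its constructor arguments would be whatever %parse-param declares.  The name parse_input and the stand-in's behavior are assumptions made up for the illustration.

```cpp
// --- What callers see: a single declaration, no parser header needed ------
int parse_input(const char* text);  // hypothetical entry point

// --- Inside the parser's own translation unit ------------------------------
// Stand-in for the generated class (assumption: a real one would be included
// from the generated header instead of being written by hand).
namespace yy {
class parser {
public:
  explicit parser(const char* text) : text_(text) {}

  // Like the generated parse(): 0 on success, 1 on failure.
  // The stand-in merely "accepts" any non-empty input.
  int parse() { return (text_ && *text_) ? 0 : 1; }

private:
  const char* text_;
};
}  // namespace yy

// The wrapper is the only place that knows about the parser class.
int parse_input(const char* text) {
  yy::parser p(text);
  return p.parse();
}
```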

> 3. I have several old C-style parsers I have to maintain. Could be rewritten 
> as C++, but that would require some effort.

I agree this is an issue.  You are the one who best knows whether, in the 
medium to long term, it is better to invest time in migrating to the C++ 
skeletons and just enjoy the ride (variants, etc.), or to follow the other path.

I would strongly suggest that you look at the examples/ in Bison, the C++ 
calculator and the variant.yy example (attached below), and see whether it’s 
really that hard to migrate from the C to the C++ skeletons.  I did have to 
migrate a parser of mine, written in C++, from yacc.c to lalr1.cc.  It was 
non-trivial, but that was right at the birth of lalr1.cc; I think we are more 
mature on this now, and we can guide you.


>>> * there is a risk that errors will not be reported correctly, or will be
>>> reported in a wrong order; e.g. the type error could be due to some
>>> missing ")", so the parser will happily plough through it and report a
>>> syntax error 50 lines later in some innocent fragment of code, and the
>>> AST is never completed.
>> 
>> I do not understand what you mean here.  If there’s a parse error and you
>> invested in error recovery, the parser _must_ build a valid AST.  So the
>> error recovery rules (using the error token) must create a node.  Then you
>> have the choice of using an existing node from your AST, say a dummy int,
>> which can later produce spurious typing errors, granted.  But in this
>> regard, I don’t see how typing-as-I-parse helps.  Also, a better strategy
>> is to introduce a new AST node to represent precisely that there was an
>> error here, so that the type checker keeps silent when checking it.
> 
> Ok, this should work. However, my error recovery is rather basic (report the 
> first error and stop). Also, if I implement all this, I doubt I will shorten 
> the code, as I'll need another routine to traverse the AST and check types. 
> So it sounds like a strategy for elaborated parsers with advanced error 
> recovery.

Again, you are the one who can evaluate that balance; I just meant to report 
what your code reminded me of.  And note that ‘less code’ is definitely a 
relevant metric, but simplicity is too.
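
To make the error-node strategy quoted above concrete, here is a small sketch of a type checker that stays silent on a dedicated error node.  The AST shape, the names Node, Type, make and check, and the diagnostic text are all invented for the example; a real AST would of course be richer.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

enum class Type { Int, String, Error };

// Hypothetical AST: Kind::Error is what an error-recovery rule would build.
struct Node {
  enum class Kind { IntLit, StrLit, Add, Error } kind;
  std::unique_ptr<Node> lhs, rhs;
};

std::unique_ptr<Node> make(Node::Kind k,
                           std::unique_ptr<Node> l = nullptr,
                           std::unique_ptr<Node> r = nullptr) {
  auto n = std::make_unique<Node>();
  n->kind = k;
  n->lhs = std::move(l);
  n->rhs = std::move(r);
  return n;
}

// Returns the type of `n`, appending type errors to `diags`.
Type check(const Node& n, std::vector<std::string>& diags) {
  switch (n.kind) {
    case Node::Kind::IntLit: return Type::Int;
    case Node::Kind::StrLit: return Type::String;
    case Node::Kind::Error:  return Type::Error;  // already reported by the parser
    case Node::Kind::Add: {
      Type l = check(*n.lhs, diags);
      Type r = check(*n.rhs, diags);
      // An operand is already in error: propagate it, report nothing new.
      if (l == Type::Error || r == Type::Error) return Type::Error;
      if (l != Type::Int || r != Type::Int) {
        diags.push_back("operands of + must be int");
        return Type::Error;
      }
      return Type::Int;
    }
  }
  return Type::Error;
}
```

An `Add` whose operand is an error node yields `Type::Error` without adding a diagnostic, whereas adding an int to a string reports exactly one error.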


>> Still, type checking typically requires resolving names, binding from uses to
>> definitions, i.e., deal with context sensitive matters.
> 
> In my case, I use a usual variable table and push/pop_context when processing 
> a scope, this is sufficient as I have basic error handling and don't use glr. 
> Generally, I'm happy with the parser code, with the exception of a large 
> number of "delete"s. This can be fixed by moving to C++ parser or supplying a 
> custom my_variant type, but I think enabling variant for C parsers would be 
> welcomed by many bison users - I believe it’s quite common nowadays to use C++ 
> compilers for C programmes.

I tend to think it is, on the contrary, less common today than it was before.
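
For reference, the kind of variable table with push/pop_context mentioned above might look like the sketch below when written with standard containers, which own their contents and so need no explicit "delete".  The class name var_table, its member functions, and the choice of storing a type name as a string are assumptions made for the illustration, not a description of the parser under discussion.

```cpp
#include <optional>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical scoped variable table: one map per open scope.
class var_table {
public:
  var_table() { push_context(); }  // the global scope is always present

  void push_context() { scopes_.emplace_back(); }

  // Leaving a scope drops its variables; the global scope is never popped.
  void pop_context() {
    if (scopes_.size() > 1)
      scopes_.pop_back();
  }

  // Returns false if `name` is already declared in the innermost scope.
  bool declare(const std::string& name, std::string type) {
    return scopes_.back().emplace(name, std::move(type)).second;
  }

  // Searches from the innermost scope outwards.
  std::optional<std::string> lookup(const std::string& name) const {
    for (auto it = scopes_.rbegin(); it != scopes_.rend(); ++it) {
      auto found = it->find(name);
      if (found != it->end())
        return found->second;
    }
    return std::nullopt;
  }

private:
  std::vector<std::unordered_map<std::string, std::string>> scopes_;
};
```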

> Regards,
> Victor.

Cheers!

        Akim

Attachment: variant.yy
Description: Binary data

