help-bison

Re: bison for nlp


From: Hans Åberg
Subject: Re: bison for nlp
Date: Fri, 9 Nov 2018 14:45:15 +0100

> On 9 Nov 2018, at 12:11, Akim Demaille <address@hidden> wrote:
> 
>> On 9 Nov 2018, at 09:58, Hans Åberg <address@hidden> wrote:
>> 
>> 
>>> On 9 Nov 2018, at 05:59, Akim Demaille <address@hidden> wrote:
>>> 
>>>> By the way, I’ll still get the error message as a string I guess, right?
>>> 
>>> Yes.  Some day we will work on improving error message generation,
>>> there is much demand.
>> 
>> One thing I’d like to have is, if there is an error with, say, an identifier,
>> also writing out its name.
> 
> Yes, that’s a common desire.  However, I don’t think it’s really
> what people need, because the way you print the semantic value
> might differ from what you actually wrote.  For instance, if I have
> a syntax error involving an integer literal written in binary,
> say 0b101010, then I will be surprised to read that I have an error
> involving 42.
> 
> So you would need to carry the exact string from the scanner to the
> parser, and I think that’s too much to ask for.

That is what I do. So I merely want an extra argument in the error reporting 
function where it can be put.
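
A minimal sketch of that idea in C++, under stated assumptions: the names IntToken, scan_binary_literal and format_syntax_error are hypothetical and not part of Bison's API. The scanner keeps the exact spelling next to the semantic value, and the error formatter takes that spelling as its extra argument.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical token payload: the semantic value plus the exact source text.
struct IntToken {
    long value;          // what the parser computes with, e.g. 42
    std::string lexeme;  // what the user actually wrote, e.g. "0b101010"
};

// Hypothetical scanner helper: converts a binary literal such as "0b101010"
// and keeps the original spelling alongside the value.
IntToken scan_binary_literal(const std::string& text) {
    assert(text.size() > 2 && text[0] == '0' && (text[1] == 'b' || text[1] == 'B'));
    long value = 0;
    for (std::size_t i = 2; i < text.size(); ++i) {
        assert(text[i] == '0' || text[i] == '1');
        value = value * 2 + (text[i] - '0');
    }
    return IntToken{value, text};
}

// Hypothetical error formatter with the "extra argument": the lexeme is
// appended as written, so the user never sees a converted value.
std::string format_syntax_error(const std::string& message,
                                const std::string& lexeme) {
    return message + " '" + lexeme + "'";
}
```

With this, an error on 0b101010 would mention 0b101010 rather than 42, because the message is built from the stored spelling and not from the value.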

> Not to mention the
> case of super-long tokens, say a large string, or an ugly regex,
> cluttering the error message.

Have you ever seen a C++ error message? :-)

> I believe that the right approach is rather the one we have in compilers
> and in bison: caret errors.
> 
> $ cat /tmp/foo.y
> %token FOO 0xff 0xff
> %%
> exp:;
> $ LC_ALL=C bison /tmp/foo.y
> /tmp/foo.y:1.17-20: error: syntax error, unexpected integer
> %token FOO 0xff 0xff
>                 ^^^^
> 
> I would have been bothered by "unexpected 255".

Currently, that only works for those still using plain ASCII. I am using Unicode 
characters and LC_CTYPE=UTF-8, so it will not display properly. In fact, I even 
need special code to write out Unicode characters in the error strings, since 
Bison assumes all strings are ASCII and translates the bytes with the high bit 
set into escape sequences.

Maybe the byte counts could be usable if there were some tool to display them.
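
As a rough illustration of what such a tool might do, here is a C++ sketch that turns a byte-based column range into a caret line by counting UTF-8 code points. The function names are hypothetical, and it assumes valid UTF-8, byte columns that are 1-based and inclusive, and one display column per code point (so wide characters and combining marks are not handled).

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Count the UTF-8 code points that start within the first `byte_count`
// bytes of `line`. Continuation bytes (10xxxxxx) are skipped.
std::size_t code_points_in_prefix(const std::string& line,
                                  std::size_t byte_count) {
    assert(byte_count <= line.size());
    std::size_t count = 0;
    for (std::size_t i = 0; i < byte_count; ++i) {
        unsigned char byte = static_cast<unsigned char>(line[i]);
        if ((byte & 0xC0) != 0x80) ++count;
    }
    return count;
}

// Build the caret line for the 1-based, inclusive byte columns
// [first, last], assuming both ends fall on character boundaries.
std::string caret_line(const std::string& line,
                       std::size_t first, std::size_t last) {
    assert(first >= 1 && first <= last && last <= line.size());
    std::size_t indent = code_points_in_prefix(line, first - 1);
    std::size_t width = code_points_in_prefix(line, last) - indent;
    return std::string(indent, ' ') + std::string(width, '^');
}
```

For a pure ASCII line the result matches the byte columns directly; for a line beginning with a two-byte character such as Å, a token at byte column 4 gets its caret in display column 3.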




