Re: Bison lexer
From: Hans Åberg
Subject: Re: Bison lexer
Date: Fri, 31 Aug 2018 23:39:37 +0200
> On 31 Aug 2018, at 22:26, Frank Heckenbach <address@hidden> wrote:
>
> Hans Åberg wrote:
>
>>> For a start, I didn't have very good experience communicating with
>>> Flex maintainer(s?) who seemed rather nonchalant WRT gcc warnings
>>> etc. in the generated code, so over the years I'd been adjusting
>>> various warning-suppression gcc options or doing dirty #define
>>> tricks to avoid warnings, or sometimes even post-processing the
>>> generated lexer with sed.
>>
>> GCC 8.2 uses C17 as default.
>
> I haven't used gcc-8 yet, but how is this relevant? If anything, I
> expect newer gcc versions to produce more warnings (usually useful)
> which flex might also suffer from.
Maybe the Flex lexer's errors are due to compiling it as C89, or something like that.
>>> But the final straw was when, after changing to C++ Bison, I wanted
>>> to switch to C++ Flex too and found this beautiful comment:
>>>
>>> /* The c++ scanner is a mess. The FlexLexer.h header file relies on the
>>>  * following macro. This is required in order to pass the
>>>  * c++-multiple-scanners test in the regression suite. We get reports
>>>  * that it breaks inheritance. We will address this in a future release
>>>  * of flex, or omit the C++ scanner altogether. */
>>
>> It has been like that since the 1990s, I believe.
>
> Even better! :(
>
> Especially since C++ in the 1990s was totally different from modern
> C++, so I have no idea if anything of this comment is still
> relevant, or maybe even more relevant, today compared to then.
Indeed, very old.
> Lesson (as if anyone was listening): Always put a date on such
> messages.
Probably just a hack, never actually developed.
>>> So I wrote a small library that builds that massive RE out of single
>>> rules and maps subexpressions back to rules (even in the case that
>>> rules contain subexpressions of their own), and that works for me.
>>
>> I did that, too: I wrote some DFA/NFA code, and incidentally found
>> the most efficient method to make action matches via a reverse NFA
>> lookup, cf. [1-3]. Also, I have made UTF-8/32 to octet character
>> class translations.
>>
>> 1. https://gcc.gnu.org/ml/libstdc++/2018-04/msg00032.html
>> 2. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85472
>> 3. https://gcc.gnu.org/ml/libstdc++/2018-05/msg00015.html
>
> Interesting, thanks. Fortunately, my REs are not so complex, so the
> bug you reported won't affect me and lexing speed is not so
> important for me, so (at least for now) I can just use the library
> as is. But if I ever need something more sophisticated, I'll keep
> this in mind.
If that is what you are using, note that it is recursive, so the function stack
might overflow. But perhaps they will rewrite it someday.
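For reference, the technique Frank describes above (building one large RE out of single rules and mapping subexpressions back to rules, even when rules contain subexpressions of their own) might be sketched roughly as follows. This is not his library, whose code does not appear in this thread. It is only a minimal illustration that assumes std::regex with the default ECMAScript grammar, and the names CombinedRules, match_prefix and outer_group_ are made up for the example. It also assumes that rules contain no backreferences, since those would be renumbered inside the combined RE.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper for illustration only: combines single rules into one
// alternation and remembers which capture group wraps each rule.
class CombinedRules {
public:
    explicit CombinedRules(const std::vector<std::string>& rules) {
        std::string pattern;
        std::size_t next_group = 1;  // group 0 is the whole match
        for (const std::string& rule : rules) {
            if (!pattern.empty()) pattern += '|';
            pattern += '(' + rule + ')';
            outer_group_.push_back(next_group);
            // Skip the wrapper group plus the rule's own subexpressions.
            next_group += 1 + std::regex(rule).mark_count();
        }
        combined_ = std::regex(pattern);
    }

    // Returns the index of the rule that matches at the start of `text`,
    // or -1 if none does. On success `length` receives the match length.
    int match_prefix(const std::string& text, std::size_t& length) const {
        std::smatch m;
        if (!std::regex_search(text, m, combined_,
                               std::regex_constants::match_continuous))
            return -1;
        for (std::size_t i = 0; i < outer_group_.size(); ++i) {
            if (m[outer_group_[i]].matched) {
                length = static_cast<std::size_t>(m.length(0));
                return static_cast<int>(i);
            }
        }
        return -1;
    }

private:
    std::regex combined_;
    std::vector<std::size_t> outer_group_;  // wrapper group index per rule
};
```

With the rules "[0-9]+(\\.[0-9]+)?", "[a-z]+" and "\\s+", the wrapper groups are 1, 3 and 4, because the first rule has one subexpression of its own. Note that ECMAScript alternation prefers the leftmost alternative that matches rather than the longest match as Flex does, so the order of the rules matters in this sketch.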