bug-bison

Re: Flex size_t sizes


From: Kaz Kylheku
Subject: Re: Flex size_t sizes
Date: Fri, 12 Nov 2021 14:41:40 -0800
User-agent: Roundcube Webmail/0.9.2

On 2021-10-14 06:23, Hans Åberg wrote:
> Hi Akim,
>
> Saw you have edited Flex, so I take it up here, even though not
> strictly a Bison topic:
>
> The Apple flex version has been edited to admit size_t sizes, 64-bit
> on the platform, and perhaps it might be a good idea for regular flex,
> which uses int, only 32-bit there. If using '%option c++' and mixing
> the versions, then the FlexLexer.h header is incompatible with the C++
> source code, generating a compile error.
>
> On MacOS with MacPorts and Xcode, the files are in
> /opt/local/include/FlexLexer.h
> /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/include/FlexLexer.h

Consider shipping the output of Flex as part of your project source code,
so that downstream users do not have to have Flex installed.

I was not previously aware of this FlexLexer.h (TIL -- "today I learned").

This is because I would never think of putting Flex together with C++.

I see that, for instance on an Ubuntu 18 system here, there is a
/usr/include/FlexLexer.h file.

This is an incredibly, incredibly bad idea.

Generated parser or scanner code must be 100% self-contained. The Flex-generated
code cannot be depending on some /usr/include/FlexLexer.h.

Never mind it being the wrong version; what if it doesn't exist?

If there must be a FlexLexer.h, the thing to do is to arrange for the build
process to use a local copy of FlexLexer.h that is in your tree. As part of
generating the scanner, your Makefile (or whatever) steps should hunt down
this header file, stick it into your tree, and make the code refer to that
copy.

Check that copy into version control, and make sure downstream users have it
as part of the distribution, and that they can build the scanner without
having any portion of Flex installed on their system.

--

I just tried compiling a tiny Lex example using the flex --c++ option.

I see that the generated lex.yy.cc file has this line:

  #include <FlexLexer.h>

This is an incredibly poor decision; it is telling the compiler to search in
the system places for the header file. Unless you influence your compiler's
search path, it will not find the local file "FlexLexer.h".

If this were

  #include "FlexLexer.h"

it would still fall back on the <FlexLexer.h> search if "FlexLexer.h" fails,
and it would have the advantage that if you just copy that header file into
the same directory and do nothing else, the problem is almost certainly solved.

This really smells like someone disengaged their full brain when they
designed this C++ mode of Flex.

If you don't want to make your compiler do silly things with header file
searching, that leaves you with editing the output of Flex to change that
#include to use double quotes.
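
Editing the Flex output is mechanical enough to script. Here is a minimal
sketch that rewrites the directive to the quoted form. It assumes the
generated file contains the line exactly as "#include <FlexLexer.h>", with no
extra whitespace; the function name quote_flexlexer_include is made up for
illustration and is not part of Flex.

```cpp
#include <sstream>
#include <string>

// Hypothetical post-processing helper (not part of Flex): take the text of
// a generated lex.yy.cc and turn the system-style include of FlexLexer.h
// into a quoted include, so that a copy sitting next to the scanner is
// found first. All other lines pass through unchanged.
// Note: the result always ends with a newline, even if the input did not.
std::string quote_flexlexer_include(const std::string& generated)
{
  std::istringstream in(generated);
  std::string out, line;
  while (std::getline(in, line)) {
    // Assumption: Flex emits the directive in exactly this form.
    if (line == "#include <FlexLexer.h>")
      line = "#include \"FlexLexer.h\"";
    out += line;
    out += '\n';
  }
  return out;
}
```

The maintainer's build would read lex.yy.cc, pass its contents through this
function, and write the result back before compiling.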

Another mistake here is not putting an include guard! If the generated
scanner did this:

  #ifndef __FLEX_LEXER_H
  #include <FlexLexer.h>
  #endif

that would also likely be okay, because you could then include your copy
first. Ah, but no you can't! Here is another problem: the above #include
is emitted before all your own includes. So if you have

  %{
  #include "FlexLexer.h"
  %}

the generated file does #include <FlexLexer.h> first, long before
your own #include "FlexLexer.h".

However, if the Flex output had the include guard above, then even with
the include being in the wrong place (too early) you could still do:

  # suppress the #include <FlexLexer.h> by defining the guard macro
  $(CC) ... -D__FLEX_LEXER_H ... lex.yy.cc -o ...

and then in the source file:

  %{
  #undef __FLEX_LEXER_H  /* force processing of header */
  #include "FlexLexer.h"
  %}

Ah, but no ... turns out that __FLEX_LEXER_H is not a complete
header file guard, in spite of the _H suffix and what it means
in the C and C++ programmer culture. The #endif for it is
in the middle of the file!

Just, yikes.

Anyway,

I can confirm that the following builds for me on Ubuntu 18, if I
take lex.yy.cc and manually remove the #include <FlexLexer.h>
line. So that is to say, my own #include "FlexLexer.h"
works just fine in the position where it is; all the
references to the FlexLexer C++ class come later:

%{
#include "FlexLexer.h"
%}

%%
username        printf("%s\n", getenv("USER"));
%%

int main()
{
  yyFlexLexer fl;

  fl.yylex();
}

// grr: undefined!
// Return 1 ("no more input") so that the scanner stops at end of file.
int yyFlexLexer::yywrap() { return 1; }


If you don't remove #include <FlexLexer.h> from the output,
of course you get double definitions of some macros.

With this skeleton, you can plant a copy of FlexLexer.h
into your tree.  That could be a manual step, or some
kind of halfway intelligent script which is done only when
the person building the code is a maintainer (because if they
are a downstream user, then they are using the shipped Flex scanner,
and not required to have Flex installed).







