
Linux/flex-generated lex.yy.c compiled on Windows: truncated input


From: Kurt Bischoff
Subject: Linux/flex-generated lex.yy.c compiled on Windows: truncated input
Date: Thu, 09 Mar 2006 20:57:34 -0600
User-agent: Mozilla Thunderbird 1.0.7-1.1.fc4 (X11/20050929)

This is a minimal test case demonstrating an apparent bug in a flex-generated scanner when it is compiled with Visual C++ and run under Windows, and hence an apparent bug in flex.

The scanner is to consume its input six characters at a time until fewer than six characters remain on the input. Then it is to consume the remaining input one character at a time. Its current position in the input is reported after each token is returned and again after all the input has been consumed.

Behavior is as expected when both flex and the flex-generated scanner are run on my Linux Fedora 4 system. The seemingly wrong behavior is observed when a Linux/flex-generated lex.yy.c is compiled and run on Windows XP with Microsoft Platform SDK and Visual C++ 2005 Express Edition.

On Windows, the program sometimes stops before reporting that it has consumed all of its input. In those cases it behaves as if the input were much shorter than it is. I have not observed the problem when the input contains only printable characters.

For test data, you can use any file. You could for instance give the scanner its own source code or its own binary code as input. The easiest way to verify the problem may be to scan a binary file under Linux, then copy the same file onto Windows and scan it using the Windows-compiled scanner.

Please contact me with any questions about this apparent bug in flex. Code follows.
address@hidden

======================================begin flex input================================
%{
#include <cstdlib>
#include <cstdio>
#include <cstring>   // strcmp, used in main()
#include <cassert>
#include <iostream>
#include <limits>
using namespace std;

extern void extrachar(void);
extern void matchChores();
extern void reportPosition(ostream *s);

long scannerLineNumber = 1;
long charCount = 0;
long positionInLine = 1;

#define YY_NEVER_INTERACTIVE 1
%}

%%
(.|\n){6} { matchChores(); return 1;      }
(.|\n)    { matchChores(); extrachar(); }
%%

bool lexerInit(FILE *infile){
  assert(sizeof(yy_size_t) == sizeof(size_t));
  assert(sizeof(int) == sizeof(size_t));
  yyin = infile;
  return true;
}

int yywrap() {
  cerr << "in yywrap" << endl;
  return 1;
}

void extrachar(void){
  fprintf(stderr,"line %ld: extra character:\n",scannerLineNumber);
  // Convert via unsigned char so that bytes >= 0x80 are not sign-extended.
  fprintf(stderr,"hex code: %x\n",(unsigned)(unsigned char)*yytext);
  // fprintf(stderr,"as character: %c\n",*yytext);
}

void matchChores() {
  charCount += yyleng;
  for(long i=0; i<yyleng; i++) {
     if (yytext[i] == '\n') {
        scannerLineNumber++;
        positionInLine = 1;
     } else {
        positionInLine++;
     }
  }
}

void reportPosition(ostream *s) {
  *s << "line number: " << scannerLineNumber << ".  "
     << "position in line: " << positionInLine << ".  "
     << "total characters: " << charCount << ".  "
     << endl;
}

FILE *safeFileOpen(const char *name, const char *mode) {
  FILE *f = fopen(name,mode);
  if (f == (FILE *)NULL){
     fprintf(stderr, "error opening %s: ", name);
     perror("");
     exit(1);
  }
  return f;
}

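int main(int argc, char *argv[]) {
  // Scan the file "t1" unless the first argument is "-", in which case
  // yyin keeps its default (stdin).
  if ((argc < 2) || (strcmp(argv[1],"-") != 0)) {
     // Do not call lexerInit() inside assert(): with NDEBUG defined the
     // call would be compiled out and the file would never be opened.
     if (!lexerInit(safeFileOpen("t1","r"))) {
        return 1;
     }
  }
  int tokencount = 0;
  int val;
  while((val = yylex())) {
     fprintf(stderr,"val == %i,  %i tokens\n", val, tokencount);
     tokencount++;
     reportPosition(&cerr);
  }
  // Final report after all the input has been consumed.
  reportPosition(&cerr);
  return 0;
}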

===============================Begin Makefile=====================================
CPP=g++
CPPOPTS=-Wno-deprecated -Wall
CPPOPTS2=-c
OBJ=o
EXE=
LEX=flex

OBJFILES=lex.yy.$(OBJ)

OBJLIBS=

scantest$(EXE): $(OBJFILES)
       $(CPP) -o scantest$(EXE) $(CPPOPTS) $(OBJFILES) $(OBJLIBS)

lex.yy.cc: scan.l
       $(LEX) -L scan.l;\
       sed -e "/unistd/d" < lex.yy.c >lex.yy.cc; rm lex.yy.c

lex.yy.$(OBJ): lex.yy.cc
       $(CPP) $(CPPOPTS2) lex.yy.cc

clean:
       rm -f *.$(OBJ) lex.yy.cc scantest$(EXE)





