[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
RE: [PATCHES] example calc++
From: |
Jannick |
Subject: |
RE: [PATCHES] example calc++ |
Date: |
Mon, 26 Nov 2018 15:31:41 +0100 |
Hi Akim,
On Wed, 21 Nov 2018 17:57:10 +0100, Akim Demaille wrote:
> Please, stick to our style for the commit messages. Look at a few examples,
> and imitate.
I will try to do my very best next time. :)
> I installed the first as follows.
>
> commit 67d77493fd16de3472c6f09b59f18946b83dfbe0
> Author: Jannick <address@hidden>
> Date: Tue Nov 20 22:00:43 2018 +0100
>
> doc: calc++: ignore \r in the scaner
(minor: a typo.)
> * doc/bison.texi (Calc++ Scanner): Ignore \r.
>
> diff --git a/doc/bison.texi b/doc/bison.texi index 01d7c8dc..371cec84 100644
> --- a/doc/bison.texi
> +++ b/doc/bison.texi
> @@ -11897,7 +11897,7 @@ Abbreviations allow for more readable rules.
> @example
> id [a-zA-Z][a-zA-Z_0-9]*
> int [0-9]+
> -blank [ \t]
> +blank [ \t\r]
> @end example
>
> @noindent
This code snip appears not only in the sample code files, but in bison's
documentation, too. So, for didactic reasons, I would be hesitating to put
'\r' into a regexp named 'blank', albeit ultimately no difference to the
scanner, of course. The most natural way I wanted to handle this without
changing to much would have been to define a regexp dealing with any kind of
line break, however, that collides with the line counting
[\n]+ loc.lines (yyleng);
This was the reason I put '\r' into a separate line to simply ignore them.
Thinking about that a bit more and generalizing this for probably almost all
operating systems out there, I believe /(\r\n|\n\r|\n|\r)/ would be a good
pattern for a line break to start off with and the line counter must be
adjusted to. Suggestion (using flex'es greediness):
((\r\n)+|(\n\r)+) loc.lines (yyleng/2);
(\n+|\r+) loc.lines (yyleng);
This would work even for mixed EOL files (with both '\n' and '\r\n') on my
Windows machine, where I have never seen a file with both '\n' and '\r' as
single EOL character. Not tested.
> I think the third one is wrong. yylex_destroy is for reentrant scanners, it’s
> releasing what yylex_init acquired.
'yylex_destroy' exists for both, reentrant and non-reentrant scanners, and
calling it after having used the scanner is required to avoid memory leakage in
both cases. See below.
> We don’t use a reentrant scanner, and we don’t call yylex_init, so I don’t
> think we should call yylex_destroy.=
This is a probably not so well documented issue for __non-reentrant__ flex
scanners: flex allocates memory for the stack object 'yy_buffer_stack' every
buffer struct is pushed onto. For reentrant scanners the stack is a member of
the yyscanner object, for non-reentrant ones it is a global variable.
Regardless of the scanner type 'yyensure_buffer_stack' (re)allocates memory for
the stack object to make sure that it exists and the stack has sufficient size.
This happens, e.g., when 'yylex' is called the first time. If, for a
non-reentrant scanner, we do not call 'yylex_destroy' at the end of the
program, there is memory leakage. Same applies to reentrant scanners which is
more known.
So, still my suggestion for the non-reentrant scanner calc++ to install a patch
making sure that 'yylex_destroy' is called at the end.
BTW: I believe that the same issue applies to bison's three non-reentrant
scanners. If memory leakage is important to bsion, then probably a good idea to
add, say at the end of function 'reader()', something like
gram_lex_destroy();
code_lex_destroy();
and at the end of 'scan_skel()'
skel_lex_destroy();
Thanks,
J.