Re: A beginner's question

Hi Rudolf,

On Fri, 7 Apr 2023 at 23:48, Rudolf Schubert wrote:

> Yes, I'm aware that the 'ISO-parts' in GPM and GM2 are different.
> I could easily handle these differences, so my code can now be compiled on
> both GPM and GM2

Still, I wouldn't rule out that the different behaviour may simply be down to incompatibility.

Note that bugs commonly show up when software is ported, even though it seemed to work error-free on the original target platform. This is one of the reasons why it is a good idea to write portable code and test it on at least two different compilers and/or targets, even if portability is not an actual requirement.

> WriteString(cid, 15C+12C);
>
> With GM2 compilation goes fine but when run I get:
>
> RTExceptions.mod:648:35: In invalidloc
> RTExceptions.mod:648:35:invalid address referenced
>
> I did not dig too deep into this, perhaps the construct '15C+12C' here
> is not quite 'legal'?

ISO permits string constant concatenation, but I am not sure whether character code literals are recognised as strings in that expression. It may well be recognised as an arithmetic expression adding two character code values. I do remember that this was put forward as an argument against using the plus operator for string concatenation at the ISO WG13 meeting in Milton Keynes. The proposed alternative was ampersand, but computer scientists outnumbered mathematicians and so the plus operator was adopted. Unfortunately.

Maybe Gaius (the implementer of GM2) can shed some light on the intended behaviour of GM2 in this case.

> But instead I now use this little helper and
> everything is fine:
>
> PROCEDURE WriteLnCrLf(cid: ChanId);
> VAR
>   schnur2: ARRAY[0..1] OF CHAR;
>
> BEGIN (* PROCEDURE WriteLnCrLf *)
>   schnur2:=' ';
>   schnur2[0]:=15C; (* CR *)
>   schnur2[1]:=12C; (* LF *)
>   WriteString(cid, schnur2);
> END WriteLnCrLf;

I wouldn't hard-code CR and LF like that. It is not portable. Even if you think you don't need portability, as I mentioned above, it is always a good idea to write portable code and test it on different compilers/platforms. That's where you find bugs you won't otherwise uncover.

> You might ask why not using WriteLn from TextIO instead? I found that
> I do need CR and LF in my output and WriteLn produced either only
> CR (on Linux) or CR and LF on Windoze.

It is LF on Unix and Unix-like systems (including Linux), CR on legacy Macs, and CRLF on OpenVMS, DOS and Windows.

> So WriteLnCrLf was my quick
> answer to the small problem.

Here is how I do this portably:

Module Newline defines a mode setting, which can be either CR, LF or CRLF.

https://github.com/m2sf/m2bsk/blob/master/src/lib/IO/Newline.def

Then you write your own WriteLn procedure that uses the mode setting to decide whether to write CR, LF or CRLF:

https://github.com/m2sf/m2pp/blob/master/src/imp/Outfile.mod#L230

Your code only ever calls WriteLn. To change the newline mode of the program, all you need to do is call Newline.setMode().
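
A minimal sketch of the idea, not the code from the modules linked above. The names Mode, SetMode and WriteNewline are made up for illustration, and it assumes an ISO library with STextIO and a text channel that passes control characters through unchanged, which an implementation is not obliged to do:

```modula2
MODULE NewlineDemo;

(* Sketch only: Mode, mode, SetMode and WriteNewline are hypothetical
   names, not the interface of the Newline module linked above. *)

IMPORT STextIO;

TYPE Mode = (CR, LF, CRLF);

VAR mode : Mode;

PROCEDURE SetMode ( newMode : Mode );
BEGIN
  mode := newMode
END SetMode;

(* Writes the line terminator selected by the current mode. *)
PROCEDURE WriteNewline;
BEGIN
  CASE mode OF
    CR   : STextIO.WriteChar(CHR(13))
  | LF   : STextIO.WriteChar(CHR(10))
  | CRLF : STextIO.WriteChar(CHR(13)); STextIO.WriteChar(CHR(10))
  END
END WriteNewline;

BEGIN
  SetMode(CRLF);
  STextIO.WriteString("first line");  WriteNewline;
  STextIO.WriteString("second line"); WriteNewline
END NewlineDemo.
```

The point of the design is that the choice of terminator lives in one place: callers never mention CR or LF themselves.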

> But quickly: a (potentially) big input string 'schnur' should be transformed
> into a linked list of smaller strings of length mittellen. Don't ask why I
> used this method. I had many smaller strings but some very big ones and I
> wanted to store them all in some array but did not want to use the biggest
> possible string even for the small ones...)

You can store strings efficiently (in space and time) by allocating exactly the length needed for the most frequently encountered lengths and, as strings get longer, rounding the allocation up slightly so that strings of similar lengths share one size group (thereby reducing the number of cases).

My interned string library stores strings of lengths 1 to 80 in memory blocks of exactly that length, and strings of lengths 81 to 4096 in nine groups of increasing size. Nevertheless, all these strings are of the same string type in the public interface.

https://github.com/m2sf/m2bsk/blob/master/src/lib/String.def

https://github.com/m2sf/m2bsk/blob/master/src/lib/imp/String.pim.mod

https://github.com/m2sf/m2bsk/blob/master/src/lib/imp/String.iso.mod
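
To illustrate the size grouping, here is a rough sketch of how a block size could be chosen for a given string length. The nine group sizes in the table are invented for the example and are not the boundaries used by the library linked above. BlockSize and classSize are hypothetical names, and the ISO modules STextIO and SWholeIO are assumed for output:

```modula2
MODULE SizeClassDemo;

(* Sketch only: the group sizes below are made up for illustration. *)

IMPORT STextIO, SWholeIO;

CONST
  MaxExact = 80;  (* lengths up to here get an exact-size block *)
  Classes  = 9;   (* number of size groups above MaxExact *)

VAR
  classSize : ARRAY [0..Classes-1] OF CARDINAL;

(* Returns the block size to allocate for a string of the given length,
   or 0 if the length exceeds the largest group. *)
PROCEDURE BlockSize ( length : CARDINAL ) : CARDINAL;
VAR i : CARDINAL;
BEGIN
  IF length <= MaxExact THEN RETURN length END;
  FOR i := 0 TO Classes-1 DO
    IF length <= classSize[i] THEN RETURN classSize[i] END
  END;
  RETURN 0
END BlockSize;

BEGIN
  (* hypothetical group sizes, in increasing order *)
  classSize[0] := 96;   classSize[1] := 128;  classSize[2] := 192;
  classSize[3] := 256;  classSize[4] := 512;  classSize[5] := 1024;
  classSize[6] := 2048; classSize[7] := 3072; classSize[8] := 4096;

  SWholeIO.WriteCard(BlockSize(42), 0);  STextIO.WriteLn;  (* exact: 42 *)
  SWholeIO.WriteCard(BlockSize(100), 0); STextIO.WriteLn   (* rounded up: 128 *)
END SizeClassDemo.
```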

The library stores every string in a hash table, but only once. If a string is already stored, it is not allocated again; the previously stored string is returned instead. As a nice side effect, strings can be compared with the equality operator (str1 = str2), because if two strings are equal, their addresses are the same.
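
The interning idea can be sketched in a few lines. This is a simplified illustration, not the library's implementation: it uses fixed-size text buffers (so longer strings are truncated), a toy hash function and chained buckets. Intern, Hash, Entry and table are hypothetical names, and the ISO modules Strings, Storage and STextIO are assumed:

```modula2
MODULE InternDemo;

(* Sketch only: fixed-size buffers and a toy hash, for illustration. *)

IMPORT STextIO, Strings;
FROM Storage IMPORT ALLOCATE;

CONST
  Buckets = 64;
  MaxLen  = 31;   (* longer strings are truncated in this sketch *)

TYPE
  String = POINTER TO Entry;
  Entry  = RECORD
    text : ARRAY [0..MaxLen] OF CHAR;
    next : String   (* next entry in the same bucket *)
  END;

VAR
  table   : ARRAY [0..Buckets-1] OF String;
  i       : CARDINAL;
  a, b, c : String;

PROCEDURE Hash ( s : ARRAY OF CHAR ) : CARDINAL;
VAR i, h : CARDINAL;
BEGIN
  h := 0; i := 0;
  WHILE (i <= HIGH(s)) AND (s[i] # 0C) DO
    h := (h * 31 + ORD(s[i])) MOD Buckets;
    INC(i)
  END;
  RETURN h
END Hash;

(* Returns the single stored copy of s, allocating it on first sight. *)
PROCEDURE Intern ( s : ARRAY OF CHAR ) : String;
VAR h : CARDINAL; p : String;
BEGIN
  h := Hash(s);
  p := table[h];
  WHILE (p # NIL) AND NOT Strings.Equal(p^.text, s) DO
    p := p^.next
  END;
  IF p = NIL THEN
    NEW(p);
    Strings.Assign(s, p^.text);
    p^.next := table[h];
    table[h] := p
  END;
  RETURN p
END Intern;

BEGIN
  FOR i := 0 TO Buckets-1 DO table[i] := NIL END;
  a := Intern("hello"); b := Intern("hello"); c := Intern("world");

  (* equal contents yield the same pointer, so = compares strings *)
  IF a = b THEN STextIO.WriteString("a = b") ELSE STextIO.WriteString("a # b") END;
  STextIO.WriteLn;
  IF a = c THEN STextIO.WriteString("a = c") ELSE STextIO.WriteString("a # c") END;
  STextIO.WriteLn
END InternDemo.
```

Because Intern hands back the existing entry whenever the text is already in the table, comparing two interned strings is a single pointer comparison regardless of their length.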

> 4. Perhaps a problem which might be related to 3. was the following:
>
> Again when reading from my (text) file I used ReadInt from WholeIO.
> But again this showed very strange results. I then replaced ReadInt
> with a sequence of
>
> ReadToken
> StrToInt
>
> and the problems were gone! Doesn't this sound strange?
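
For reference, the ReadToken / StrToInt sequence described above might look roughly like this. It is a sketch, not the original program: it reads from standard input via STextIO, whereas code reading from a file would use the TextIO procedures that take a ChanId. The ISO modules STextIO, SWholeIO, WholeStr and ConvTypes are assumed, and the buffer size is arbitrary:

```modula2
MODULE ReadIntDemo;

(* Sketch only: reads one token from standard input and converts it. *)

IMPORT STextIO, SWholeIO, WholeStr, ConvTypes;

VAR
  token : ARRAY [0..31] OF CHAR;
  value : INTEGER;
  res   : ConvTypes.ConvResults;

BEGIN
  STextIO.ReadToken(token);
  WholeStr.StrToInt(token, value, res);
  IF res = ConvTypes.strAllRight THEN
    SWholeIO.WriteInt(value, 0)
  ELSE
    (* conversion failed: empty, malformed or out of range *)
    STextIO.WriteString("not a valid integer: ");
    STextIO.WriteString(token)
  END;
  STextIO.WriteLn
END ReadIntDemo.
```

Splitting the read and the conversion makes the failure case explicit, because the conversion result can be inspected before the value is used.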

This sounds exactly like the kinds of issues you can expect when porting code. The ISO IO library in particular is a minefield. For our bootstrap compiler we support some 13 or 14 Modula-2 compilers as hosts, most of which are PIM but some are ISO compliant (or partially so, like GPM). Ensuring that all the code compiles with the ISO IO library was a nightmare. The most difficult host was p1 Modula-2 though, not GM2. But still, anything using the ISO IO library was a royal pain in the back. By contrast, our own simpler IO library can be easily hooked into the Unix system calls for IO. Even though that involved foreign function interfacing to C, which is different on every single compiler, it was a walk in the park compared to getting the ISO IO library to work across compilers without having to fiddle with the code.

I can't comment on the other issues you raised, but maybe Gaius will take a look.

regards

benjamin

From:	Benjamin Kowarsch
Subject:	Re: A beginner's question
Date:	Sat, 8 Apr 2023 01:35:57 +0900