gm2

Re: Portability Considerations


From: Benjamin Kowarsch
Subject: Re: Portability Considerations
Date: Mon, 18 Mar 2024 16:30:06 +0900

Hi

On Mon, 18 Mar 2024 at 14:38, Alice Osako wrote:
> Following Benjamin Kowarsch's advice, I am looking into what it would
> take to write a suitable API definition for basic bitwise operations,
> and even in the earliest stages I am already running into a significant
> problem: the lack of defined sizes for primitive values.
>
> While the Modula-2 language design seems predicated on the assumption
> that most operations will be on specified sub-sets of the primitive
> types - a wise but tricky design choice, especially in a systems
> language where the system word size is often of paramount importance -
> this assumption only makes sense if the primitive base type is as wide
> as possible, preferably even multi-word/BigNum widths if the language
> designer could manage it.
>
> This isn't the case with Modula-2, where there is no mandate for even
> minimum primitive widths as far as I can tell. Is it really safe to
> assume that a CARDINAL will be at least 32-bits wide? Perhaps on a
> modern compiler, but is it really 'portable' if legacy systems are
> ignored? This is exactly the sort of thing that breaks portability.

I looked into precisely these and other related issues when I worked on a Modula-2 hosted bootstrap compiler for our Modula-2 revision. I have since moved priority and focus back to the C hosted version, after I managed to work out a straightforward M2-to-C name translation scheme, but the M2 hosted sub-project remains on GitHub and there is plenty of stuff you can just reuse.

As for the bitwidths of CARDINAL, INTEGER and LONGINT, the following always applies:

(1) CARDINAL always has the same bitwidth as INTEGER.
(2) There are exactly five possibilities for the bitwidths of CARDINAL/INTEGER and LONGINT:

(i) 16bit/16bit
(ii) 16bit/32bit
(iii) 32bit/32bit
(iv) 32bit/64bit
(v) 64bit/64bit

These can be easily tested for and a library configured accordingly during the build process.

https://github.com/m2sf/m2pp/blob/master/cfg/config.samples.txt
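
A rough sketch of such a test, which is not the m2pp configuration itself. It assumes the PIM InOut library and 8-bit storage units; the module name, the Report procedure and the BitsPerUnit constant are made up for illustration.

```modula2
MODULE ProbeWidths;
(* Sketch only: prints the bitwidths a build script could use to pick a
   library variant. Assumes PIM-style InOut and 8-bit storage units. *)

FROM SYSTEM IMPORT TSIZE;
FROM InOut IMPORT WriteString, WriteCard, WriteLn;

CONST BitsPerUnit = 8; (* assumption, not guaranteed by PIM or ISO *)

PROCEDURE Report ( name : ARRAY OF CHAR; units : CARDINAL );
BEGIN
  WriteString(name);
  WriteCard(units * BitsPerUnit, 4);
  WriteLn
END Report;

BEGIN
  (* TSIZE yields the size of a type in storage units *)
  Report("CARDINAL:", TSIZE(CARDINAL));
  Report("INTEGER: ", TSIZE(INTEGER));
  Report("LONGINT: ", TSIZE(LONGINT))
END ProbeWidths.
```

The three numbers printed should match one of the five combinations listed above, and a build script could select the library variant from them.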

Practically, you can cover all scenarios for composite types by providing three libraries: a generic one that composes the type from 16-bit cardinals and serves as the fallback case, one that uses 32-bit components, and one that uses 64-bit components. You then choose among them according to the memory model of the compiler or target architecture.
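
To illustrate the fallback idea, here is a minimal sketch of a 96-bit set composed of six 16-bit segments. It is not the TokenSet implementation: for brevity it uses BITSET segments rather than cardinals, touching only bits 0 to 15 of each so that it also fits a 16-bit BITSET, and all names in it are hypothetical.

```modula2
MODULE WideSetDemo;
(* Hypothetical fallback sketch: a 96-bit set built from six segments,
   using only bits 0..15 of each BITSET so it works on 16-bit targets. *)

FROM InOut IMPORT WriteString, WriteLn;

CONST
  SegBits  = 16;
  SegCount = 6;

TYPE WideSet = ARRAY [0..SegCount-1] OF BITSET;

VAR tokens : WideSet;

PROCEDURE Clear ( VAR s : WideSet );
VAR i : CARDINAL;
BEGIN
  FOR i := 0 TO SegCount-1 DO s[i] := BITSET{} END
END Clear;

(* precondition for both procedures: bit < SegBits * SegCount *)
PROCEDURE Include ( VAR s : WideSet; bit : CARDINAL );
BEGIN
  INCL(s[bit DIV SegBits], bit MOD SegBits)
END Include;

PROCEDURE Contains ( s : WideSet; bit : CARDINAL ) : BOOLEAN;
BEGIN
  RETURN (bit MOD SegBits) IN s[bit DIV SegBits]
END Contains;

BEGIN
  Clear(tokens);
  Include(tokens, 70); (* lands in segment 4, bit 6 *)
  IF Contains(tokens, 70) AND NOT Contains(tokens, 71) THEN
    WriteString("bit 70 set, bit 71 clear")
  END;
  WriteLn
END WideSetDemo.
```

A 32-bit or 64-bit variant would only change SegBits and SegCount, which is why the three versions can share most of their code.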

An example of this is my token set library, for sets larger than the 16 or 32 bits supported by the BITSET type.

https://github.com/m2sf/m2bsk/blob/master/src/TokenSet.16bit.def
https://github.com/m2sf/m2bsk/blob/master/src/imp/TokenSet.16bit.mod

https://github.com/m2sf/m2bsk/blob/master/src/TokenSet.32bit.def
https://github.com/m2sf/m2bsk/blob/master/src/imp/TokenSet.32bit.mod

https://github.com/m2sf/m2bsk/blob/master/src/TokenSet.64bit.def
https://github.com/m2sf/m2bsk/blob/master/src/imp/TokenSet.64bit.mod

There are some compromises to be made since neither PIM nor ISO supports variadic arguments as R10 does:

(* universal M2R10 version *)
PROCEDURE newFromRawData ( bitlist : ARGLIST OF TokenSetRange ) : TokenSet;

and therefore each version needs its own interface instead of having a universal interface for all versions.

(* 16 bit PIM/ISO version *)
PROCEDURE NewFromRawData
  ( VAR set : TokenSet;
    segment5, segment4, segment3, segment2, segment1, segment0 : CARDINAL );

(* 32 bit PIM/ISO version *)
PROCEDURE NewFromRawData
  ( VAR set : TokenSet; segment2, segment1, segment0 : CARDINAL );

(* 64 bit PIM/ISO version *)
PROCEDURE NewFromRawData
  ( VAR set : TokenSet; segment1, segment0 : CARDINAL );

But this is the kind of price you pay when using a 45-year-old language that hasn't been updated in 30 years.

> For the library to be portable, it has to either define separate byte,
> 16-bit, 32-bit, and 64-bit procedures for each operation (and hope that
> the last of those is supported at all), or define a separate type
> specifically for bitwise operations with a fixed size across all
> possible platforms (i.e., PACKEDSET/BITSET, but without any native
> support or operators).

When working with bitsets you generally want the benefits of an efficient implementation, and the best performance and storage efficiency are obtained by implementing three different versions. The bitwidth of BITSET is implementation defined, and on some compilers/targets that means 16 bits, which would then be the lowest common denominator and far too limiting for many use cases.

The use of infix operators with sets isn't all that great anyway: the only operators supported (+, -, * and / for union, difference, intersection and symmetric difference) are not intuitive because they do not match the mathematical notation.
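
For reference, a small sketch of what the built-in set operators do on BITSET values. It assumes the PIM InOut library; the module and the Check procedure are only there for illustration.

```modula2
MODULE SetOpsDemo;
(* Sketch: the four built-in set operators applied to BITSET values. *)

FROM InOut IMPORT WriteString, WriteLn;

VAR a, b : BITSET;

PROCEDURE Check ( text : ARRAY OF CHAR; ok : BOOLEAN );
BEGIN
  WriteString(text);
  IF ok THEN WriteString(" : holds") ELSE WriteString(" : fails") END;
  WriteLn
END Check;

BEGIN
  a := BITSET{0, 1, 2};
  b := BITSET{1, 2, 3};
  Check("a + b = {0..3} (union)", a + b = BITSET{0..3});
  Check("a * b = {1, 2} (intersection)", a * b = BITSET{1, 2});
  Check("a - b = {0}    (difference)", a - b = BITSET{0});
  Check("a / b = {0, 3} (symmetric difference)", a / b = BITSET{0, 3})
END SetOpsDemo.
```

All four lines should report "holds". Anyone reading a * b or a / b without knowing the operands are sets will think of arithmetic, which is the readability problem.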

 
> Generics would alleviate this problem, except that generics are
> themselves not available in PIM, and are a non-standard extension even
> in ISO.

Generics can be done with PIM (and ISO) using our M2 template engine.

https://github.com/m2sf/m2pp

It is a simple placeholder expansion tool with a single built-in macro that generates a CAST in either PIM or ISO syntax, depending on the desired target output. Casting is about the only operation that is unavoidable, and its syntax is incompatible between the two dialects.
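
To show the incompatibility, here is a minimal sketch of the same cast in both dialects. It does not show the template syntax of the tool, only the two target forms; the module and variable names are made up, and it assumes CARDINAL and BITSET have the same size on the target.

```modula2
MODULE CastDemo;
(* Sketch: the ISO form is live code, the PIM form is in a comment,
   since a single compilation unit can only follow one dialect. *)

FROM SYSTEM IMPORT CAST;

VAR
  value : CARDINAL;
  bits  : BITSET;

BEGIN
  value := 5;

  (* ISO: an explicit cast function imported from SYSTEM *)
  bits := CAST(BITSET, value)

  (* PIM: the type identifier itself acts as the type transfer function
       bits := BITSET(value)
     and the import of CAST from SYSTEM has to go. *)
END CastDemo.
```

Which bits end up in the set is implementation dependent, so the sketch deliberately prints nothing.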

> I noticed that the GCC base libraries include BitByteOps and BitWordOps
> libraries, but these are not merely non-portable, they are (as far as I
> can tell) undocumented. Even if this weren't the case, I would rather
> not have to imitate this design choice - separate, parallel libraries
> for different word sizes - if I can avoid it.

I strongly recommend against putting too much stuff into a single library module. Within the aforementioned M2 hosted bootstrap compiler sub-project I have separated math and bit operations, and then implemented one set each for CARDINAL, INTEGER and LONGINT.

https://github.com/m2sf/m2bsk/blob/master/src/lib/CardMath.def
https://github.com/m2sf/m2bsk/blob/master/src/lib/CardBitOps.def

https://github.com/m2sf/m2bsk/blob/master/src/lib/IntMath.def
https://github.com/m2sf/m2bsk/blob/master/src/lib/IntBitOps.def

https://github.com/m2sf/m2bsk/blob/master/src/lib/LongIntMath.def
https://github.com/m2sf/m2bsk/blob/master/src/lib/LongIntBitOps.def

You may just want to reuse them. If you feel you need more operations, you can always add some.

> While it is only a bare sketch of an API at this point, what I have for
> now is something along the lines of:
>
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> DEFINITION MODULE Bitwise;
>
> EXPORT QUALIFIED Test, Set,
>                   ShiftLeft, ShiftRight, ArithShiftRight,
>                   RotateLeft, RotateRight,
>                   Not, And, Or, XOr, Impl, NAnd, NOr;

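EXPORT is only needed within local modules, that is, modules that are nested within other modules (ugly sh1t).

At the top level, everything declared within an interface module (.def) is automatically exported. Only the old PIM2 dialect still required an EXPORT QUALIFIED list there; PIM3, PIM4 and ISO do not.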

Also, since you are likely to have expressions with multiple bit operations, you want to keep the names of the functions as short as possible to avoid clutter. Across languages of the Algol/Pascal family, you will most frequently see the following names:

SHL, SHR, ASHR, ROTL, ROTR, BWNOT, BWAND, BWOR etc
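
As a rough sketch, an interface along these lines might look as follows. The module name and the signatures are hypothetical and are not copied from CardBitOps.def.

```modula2
DEFINITION MODULE CardBits; (* hypothetical name and signatures *)

(* No EXPORT list: in PIM3, PIM4 and ISO every identifier declared in a
   definition module is exported automatically. *)

PROCEDURE SHL ( value, shift : CARDINAL ) : CARDINAL;
(* value shifted left by shift bits, zero filled *)

PROCEDURE SHR ( value, shift : CARDINAL ) : CARDINAL;
(* value shifted right by shift bits, zero filled *)

PROCEDURE ROTL ( value, shift : CARDINAL ) : CARDINAL;
(* value rotated left by shift bits *)

PROCEDURE ROTR ( value, shift : CARDINAL ) : CARDINAL;
(* value rotated right by shift bits *)

PROCEDURE BWNOT ( value : CARDINAL ) : CARDINAL;
(* bitwise complement of value *)

PROCEDURE BWAND ( left, right : CARDINAL ) : CARDINAL;
(* bitwise AND of left and right *)

PROCEDURE BWOR ( left, right : CARDINAL ) : CARDINAL;
(* bitwise OR of left and right *)

END CardBits.
```

With short names, a nested expression such as BWAND(SHR(word, 4), 15) stays readable, whereas the same thing written with ShiftRight and And already starts to clutter.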

In our full spec we also defined shifts and rotations through a carry bit. These can often be mapped directly onto machine instructions by a compiler back end, and they are very useful when processing data larger than what fits into a single register:

SHLC, SHRC, ASHRC, ROTLC and ROTRC.
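
To illustrate why the carry matters, here is a minimal sketch that shifts a 32-bit value held in two 16-bit segments left by one bit. The procedure name and signature are hypothetical and do not follow the spec; it assumes the PIM InOut library, and each segment must hold a value below 65536.

```modula2
MODULE CarryShiftDemo;
(* Sketch: one-bit left shift through a carry, chained over two 16-bit
   segments. Safe on 16-bit CARDINAL since no intermediate overflows. *)

FROM InOut IMPORT WriteString, WriteCard, WriteLn;

CONST HighBit = 32768; (* 2^15, the top bit of a 16-bit segment *)

VAR
  lo, hi : CARDINAL;
  carry  : BOOLEAN;

PROCEDURE ShiftLeftWithCarry ( VAR seg : CARDINAL; VAR carry : BOOLEAN );
VAR carryOut : BOOLEAN;
BEGIN
  carryOut := seg >= HighBit;       (* top bit leaves the segment *)
  seg := (seg MOD HighBit) * 2;     (* shift the remaining 15 bits *)
  IF carry THEN INC(seg) END;       (* carry-in becomes bit 0 *)
  carry := carryOut
END ShiftLeftWithCarry;

BEGIN
  hi := 1; lo := 32769;             (* the value 00018001H *)
  carry := FALSE;
  ShiftLeftWithCarry(lo, carry);    (* low segment first *)
  ShiftLeftWithCarry(hi, carry);    (* its carry feeds the high segment *)
  WriteString("hi ="); WriteCard(hi, 2);
  WriteString(", lo ="); WriteCard(lo, 2);
  WriteLn                           (* prints: hi = 3, lo = 2 *)
END CarryShiftDemo.
```

The result 00030002H is the original value shifted left by one. On hardware with a rotate-through-carry instruction, each call could collapse into a single instruction.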

 
> PROCEDURE Test(value: CARDINAL; index: CARDINAL): BOOLEAN;
> (* Test - Test whether a given bit in a value is set. *)

It is generally better design to use specific names, and for functions the name should ideally be a noun:

PROCEDURE BIT ( value : T1; bitIndex : T2 ) : BOOLEAN;
PROCEDURE SETBIT ( VAR target : T1; bitIndex : T2; bit : BOOLEAN );
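
A minimal sketch of how these two could be implemented for CARDINAL with plain arithmetic only, so that it works in both PIM and ISO. T1 and T2 are both taken to be CARDINAL here, Pow2 is a hypothetical helper, and the PIM InOut library is assumed for output.

```modula2
MODULE BitDemo;
(* Sketch: BIT and SETBIT for CARDINAL without casts or sets.
   Precondition throughout: bitIndex < bitwidth of CARDINAL. *)

FROM InOut IMPORT WriteString, WriteCard, WriteLn;

VAR value : CARDINAL;

PROCEDURE Pow2 ( n : CARDINAL ) : CARDINAL;
VAR result : CARDINAL;
BEGIN
  result := 1;
  WHILE n > 0 DO result := result * 2; DEC(n) END;
  RETURN result
END Pow2;

PROCEDURE BIT ( value : CARDINAL; bitIndex : CARDINAL ) : BOOLEAN;
BEGIN
  (* move the wanted bit to position 0, then test it *)
  RETURN ODD(value DIV Pow2(bitIndex))
END BIT;

PROCEDURE SETBIT ( VAR target : CARDINAL; bitIndex : CARDINAL; bit : BOOLEAN );
BEGIN
  IF bit # BIT(target, bitIndex) THEN   (* only act when the bit changes *)
    IF bit THEN
      target := target + Pow2(bitIndex)
    ELSE
      target := target - Pow2(bitIndex)
    END
  END
END SETBIT;

BEGIN
  value := 0;
  SETBIT(value, 3, TRUE);    (* value is now 8 *)
  SETBIT(value, 0, TRUE);    (* value is now 9 *)
  SETBIT(value, 3, FALSE);   (* value is now 1 *)
  WriteString("value ="); WriteCard(value, 2); WriteLn;
  IF BIT(value, 0) THEN WriteString("bit 0 is set"); WriteLn END
END BitDemo.
```

A real library would replace the Pow2 loop with a lookup table or a cast to BITSET, but the arithmetic version has the advantage of needing no dialect-specific code.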

Also, if you stick to the interfaces defined in

https://github.com/m2sf/m2bsk/wiki/Language-Specification-(11)-:-Low-Level-Facilities

your code will later be much easier to port to M2R10 once we have a working compiler.

Anyway, browse the M2BSK and M2PP repos and see if you can reuse what's already there.

hope this helps
regards
benjamin
