Re: Dynamic mutidimensional arrays

Andreas

ETH Zurich

Prof. em. Dr. Andreas Fischlin

IPCC Vice-Chair WGII

Systems Ecology - Institute of Biogeochemistry and Pollutant Dynamics

CHN E 24

Universitaetstrasse 16

8092 Zurich

SWITZERLAND

andreas.fischlin@env.ethz.ch

www.sysecol.ethz.ch/people/andreas.fischlin.html

+41 44 633-6090 phone

+41 44 633-1136 fax

+41 79 595-4050 mobile

Make it as simple as possible, but distrust it!

________________________________________________________________________

On Tue, 04.04.23, at 21:43, Michael Riedl <udo-michael.riedl@t-online.de> wrote:

Benjamin,
actually I do not see your approach to be "easy" at all.
So please can you line out how the short sample I lined out should look like ?
Gruß
Michael

Am 04.04.23 um 18:46 schrieb Benjamin Kowarsch:

On Tue, 4 Apr 2023 at 06:34, Michael Riedl wrote:

in a lot of cases within linear algebra routines one has a need to have intermediate multidimensional arrays (mostly 2-dimensional, but I also had more dimensions) which would only be needed within the scope of the procedure.

A simple sample might be that we need to multiply two matrices and, in the end, one is only interrested in some special aspects of this result (e.g. the lowest eigenvalues).
How would you handle such a case ?
As said, in Oberon it would work as suggested, but also in modern Fortran that is not implement too different (sample code below), gfortran can compile the code without issues.
But I just wanted to see if that would be of principal interrest - if it would reqite too much work, I can live with static allocation for the moment.
Mainly it would be a chance to make code more readable and to avoide wasting too much memory in the case you define the maximun size of a problem the code can handle with static declarations.

In this day and age, the problem is not a shortage of hardware resources, be it CPU or RAM.

And implementation effort may not be the primary issue either.

The problem is that everybody has some pet peeve that they want to have a feature for in their preferred language. And this then leads to feature creep. It is a Pandorra's box. Once you add a specific feature for one thing, why would you then deny another specific feature to be added for some other thing. Before you know it, you have an egg laying wool milk sow.

Since around the time of the ISO C90 standard, C has had the ability to declare a record (called a struct) with an array field whose size is omitted (prior to standardisation) or zero (after standardisation). In other words, the record contains an array field whose size is indeterminate at compile time.
An instance of such a record/struct can then be allocated for any desired size at runtime. This facility can thereby be used to implement dynamically allocatable arrays whose size is only determined at runtime. In C terminology this is also known as "variable length array" or short VLA.

typedef struct vla_t {

int len;

foo buffer[0];

};

Unfortunately, the facility is NOT type safe, but that should not surprise anyone. After all, it is C. Type safety be damned. However, there is no reason why such a facility could not be designed in a type safe manner. It is indeed possible to make this type safe. The C folks simply didn't bother.

For our revision, we have looked into this and designed a type safe variant for Modula-2. We call the resulting facility indeterminate types.

It can be used to implement dynamically allocatable arrays of variable dimension and size.

In the public interface, we define an opaque pointer type:

TYPE Matrix;

Although, our syntax for opaque types is more explicit, we write:

TYPE Matrix = OPAQUE;

In the implementation module, we can then declare the type, like so:

TYPE Matrix = POINTER TO RECORD

rows,

columns : CARDINAL

+ data : ARRAY capacity OF REAL

END; (* Matrix *)

Our grammar permits this syntax only within implementation modules so that indeterminate types must always be opaque pointer types. Therefore, instantiation, reading and writing instances of indeterminate types can only be done through constructors, mutators and accessors provided in the public interface. The compiler knows that the capacity field within the array field declaration holds the size of the array field and it can therefore insert boundary checks into the generated code (unless such is disabled by compiler switch as some compilers permit).

We can then implement a constructor procedure, like so:

PROCEDURE NewMatrix ( VAR m : Matrix; rows, columns : CARDINAL; );

VAR capacity : LONGCARD;

BEGIN

capacity := rows*columns;
Storage.Allocate(m, TSIZE(Matrix, capacity)); m^.capacity := capacity

END NewMatrix;

... a mutator procedure, like so:

PROCEDURE storeValue ( VAR m : Matrix; row, column : CARDINAL; value : REAL );

BEGIN

(* pre-condition checking omitted for brevity *)

m^.data[row*m^.columns+column] := value

END storeValue;

... an accessor, like so:

PROCEDURE value ( VAR m : Matrix; row, column : CARDINAL ) : REAL;

(* pre-condition checking omitted for brevity *)

RETURN m^.data[row*m^.columns+column]

END value;

... and a destructor, like so:

PROCEDURE release ( VAR m : Matrix );

BEGIN

(* pre-condition checking omitted for brevity *)

Storage.Deallocate(m);

m := NIL

END release;

This is a very simple but also very general and flexible facility. It can be used to implement any kind of dynamically allocatable storage container of variable size. As such it is preferable to any specific facility, for every specific facility added to a language invites further specific facilities to be added later, leading to feature creep.

In our revision, we have a general facility called syntax binding that allows the binding of library procedures to built-in syntax. The above facility in combination with syntax binding can then be used to implement dynamic storage containers that look as if they were built-in. For example, the constructor can be bound to the built-in NEW syntax, mutator and accessor to L- and R-value subscript notation and the destructor to the built-in RELEASE syntax. When reference counting is desired, a retain procedure can be added and bound to the built-in RETAIN syntax.

So, yet again, I advise against adding any specialist facilities. If you feel you must add something, make it general, not specific.

As always, feel free to borrow from our work.

https://github.com/m2sf/m2bsk/wiki/Language-Specification-(4)-:-Types#records-with-an-indeterminate-field

regards

benjamin

From:	Fischlin Andreas
Subject:	Re: Dynamic mutidimensional arrays
Date:	Wed, 5 Apr 2023 06:48:27 +0000