gm2

Re: Translating Modula-2 identifiers to C


From: Gaius Mulley
Subject: Re: Translating Modula-2 identifiers to C
Date: Wed, 10 May 2023 17:05:13 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/27.1 (gnu/linux)

Benjamin Kowarsch <trijezdci@gmail.com> writes:

> Except for a few loose ends, I have now completed a library to translate 
> Modula-2 identifiers to C. It consists of two modules representing a user 
> level and a lower level.
>
> The user level module uses the lower level module and its conversion facility 
> to generate C identifiers for import guards, macro constants including 
> qualified enumeration values in MACRO_CASE; and for types, variables and
> functions in snake_case; across three use cases: (1) qualified identifiers 
> with the module as a prefix, (2) local identifiers with a local suffix and 
> (3) top scope static identifiers without module prefix nor local suffix.
>
> https://github.com/m2sf/m2c/blob/main/src/m2c-ident-xlat.h
>
> The lower level module implements a key/value dictionary to store and 
> retrieve unqualified Modula-2 identifiers in any of 
> lower/upper/camel/title/mixed case and their C translations in snake case. 
> Every identifier is
> automatically converted and checked into the dictionary upon the first 
> retrieval request.
>
> https://github.com/m2sf/m2c/blob/main/src/lib/string/snake-case-conv.h
>
> The purpose of this library is to facilitate the generation of authentic 
> human readable C code from Modula-2 source code. I wrote this for my Modula-2 
> to C translator/via-C compiler, but it could also benefit GM2.
>
> The library is written in C99 and is LGPL licensed.
>
> Due to the license it could be incorporated into GM2 and then facilitate the 
> development of libraries to be used from C but written in Modula-2 and built 
> with GM2. For this, the symbols should conform to C conventions,
> which at present isn't the case. Note that any such symbol translation could 
> always be switched off by compiler switch. But if the feature is available, 
> it would make GM2 more useful for cases where somebody needs to
> deliver some C library but would rather write it in Modula-2.
>
> For anybody interested in the actual mapping of identifiers, the 
> specification is here:
>
> https://github.com/m2sf/m2c/wiki/Mapping-Modula-2-Identifiers-to-C
>
> I also intend to add a pragma so that one can supply a custom name.
>
> Gaius, I know you have plenty of work already, but perhaps you want to 
> consider incorporating this facility. If you need any adaptation, just drop 
> me a mail.
>
> regards
> benjamin

Hi Benjamin,

The name mangling you describe above is an interesting idea.

I think the default should be simple, so that a gdb user sees procedure
and identifier names exactly as they appear in the source code.
(Aiming to make it easy for first year undergrads).

The compiler should also allow more complex name mangling for advanced
use.

Currently
=========

In gm2 there are named paths which are prefixed to the gcc generated
symbol name.  Different library sets have different named paths by
default, so for example the m2pim libraries and the m2iso libraries each
have a path name associated with their default location.

For example the m2pim libraries might be installed at:

    $HOME/opt/lib/gcc/x86_64-pc-linux-gnu/13.0.1/m2/m2pim/StrIO.def

and the driver gm2 sets up the named paths, so that a call to
StrIO.WriteString appears as a call to an external function
m2pim_StrIO_WriteString.

This allows the ISO and PIM libraries to coexist even if they have the
same module name and a different interface.  (ISO Storage, PIM Storage,
ISO SYSTEM and PIM SYSTEM for example).

gm2 allows _ in any identifier and so it is possible to choose
identifiers which will clash with the name mangling schema above.
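
To make the clash concrete, here is a minimal sketch in C.  It assumes
the symbol is built by joining named path, module and identifier with a
single lowline, as in the m2pim_StrIO_WriteString example above.  The
function names mangle_current and symbols_clash are hypothetical and are
not part of gm2.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: joins named path, module and identifier with a
   single lowline, mirroring the m2pim_StrIO_WriteString example.
   Returns 1 on success, 0 if the result did not fit into out. */
int mangle_current(char *out, size_t size, const char *namedpath,
                   const char *module, const char *ident)
{
    int n;

    assert(out != NULL && size > 0);
    n = snprintf(out, size, "%s_%s_%s", namedpath, module, ident);
    return n >= 0 && (size_t)n < size;
}

/* Returns 1 when the two (module, identifier) pairs produce the same
   mangled symbol under the same named path, 0 otherwise. */
int symbols_clash(const char *namedpath,
                  const char *mod_a, const char *id_a,
                  const char *mod_b, const char *id_b)
{
    char a[256], b[256];

    if (!mangle_current(a, sizeof a, namedpath, mod_a, id_a)) return 0;
    if (!mangle_current(b, sizeof b, namedpath, mod_b, id_b)) return 0;
    return strcmp(a, b) == 0;
}
```

Under that assumption, StrIO.Write_String and StrIO_Write.String would
both become m2pim_StrIO_Write_String, which is the kind of clash meant
above.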

Proposed change
===============

I wonder if the following algorithm would resolve the above issue:

In order of priority:

   0.  DEFINITION FOR "C".  Turns off default name mangling for the entire
       module.
   1.  <* gcc-name: foo_bar *>   The attribute will override the symbol
       name as given to the GCC backend.
   2.  <* gcc-mangle: (format specifiers to determine style of mangling)
       *>
   3.  Any symbol containing a leading or trailing or consecutive
       occurrences of lowline chars attracts a warning message.
   4.  Non-exported identifiers appear as symbols with no mangling.
   5.  The default namedpath__modulename__procedurename schema
       is applied.
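
As a rough illustration of the ordering, the list could be sketched in C
as below.  This is only a model of the proposal, not gm2 code: the
struct m2_symbol and both function names are hypothetical, rule 2 (the
format driven mangling) is left out, and rule 3 is modelled as a
separate check because it produces a warning rather than a name.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical description of a symbol; it only models the proposal. */
struct m2_symbol {
    const char *namedpath;   /* e.g. "m2pim" */
    const char *module;      /* e.g. "StrIO" */
    const char *ident;       /* e.g. "WriteString" */
    const char *gcc_name;    /* from <* gcc-name: ... *>, or NULL */
    int definition_for_c;    /* module is a DEFINITION FOR "C" */
    int exported;            /* identifier is exported */
};

/* Rule 3: leading, trailing or consecutive lowlines attract a warning. */
int has_suspicious_lowline(const char *s)
{
    size_t len = strlen(s);

    if (len == 0) return 0;
    if (s[0] == '_' || s[len - 1] == '_') return 1;
    return strstr(s, "__") != NULL;
}

/* Chooses the symbol name in the proposed priority order.
   Returns 1 on success, 0 if the name did not fit into out. */
int choose_symbol_name(const struct m2_symbol *sym, char *out, size_t size)
{
    int n;

    assert(sym != NULL && out != NULL && size > 0);
    if (sym->definition_for_c)          /* rule 0: no mangling at all */
        n = snprintf(out, size, "%s", sym->ident);
    else if (sym->gcc_name != NULL)     /* rule 1: explicit override */
        n = snprintf(out, size, "%s", sym->gcc_name);
    else if (!sym->exported)            /* rule 4: local, unmangled */
        n = snprintf(out, size, "%s", sym->ident);
    else                                /* rule 5: default schema */
        n = snprintf(out, size, "%s__%s__%s",
                     sym->namedpath, sym->module, sym->ident);
    return n >= 0 && (size_t)n < size;
}
```

With the double lowline as separator in rule 5, the warning in rule 3 is
what would keep user identifiers from imitating the separator.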

The detail is in [2] above.  [1] and [2] can occur on a scope or per
identifier declaration.  Mangling specifiers were used in p2c iirc.
I had also thought that some of the format ideas from
https://github.com/gcc-mirror/gcc/blob/master/gcc/m2/gm2-compiler/M2MetaError.def
might be useful to drive/implement the format specifier code.  This would
allow users to specify the mangling schema on a per module or per
identifier basis if required.
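
To give a feel for what a format driven schema might look like, here is
a small sketch in C.  The specifier letters are invented purely for
illustration (%p named path, %m module name, %i identifier); they are
not taken from M2MetaError or p2c, and expand_mangle_format is a
hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Expands a mangling format into out.  Hypothetical specifiers:
   %p = named path, %m = module name, %i = identifier.
   Any other character is copied literally.  Returns 1 on success,
   0 on an unknown specifier or if the result does not fit. */
int expand_mangle_format(const char *fmt, const char *namedpath,
                         const char *module, const char *ident,
                         char *out, size_t size)
{
    size_t used = 0;

    assert(fmt != NULL && out != NULL && size > 0);
    for (; *fmt != '\0'; fmt++) {
        char literal[2] = {0, 0};
        const char *piece;
        size_t len;

        if (*fmt == '%') {
            fmt++;
            switch (*fmt) {
            case 'p': piece = namedpath; break;
            case 'm': piece = module;    break;
            case 'i': piece = ident;     break;
            default:  return 0;  /* unknown specifier or trailing '%' */
            }
        } else {
            literal[0] = *fmt;
            piece = literal;
        }
        len = strlen(piece);
        if (used + len >= size) return 0;  /* keep room for the NUL */
        memcpy(out + used, piece, len);
        used += len;
    }
    out[used] = '\0';
    return 1;
}
```

In this sketch the default schema would correspond to the format
"%p__%m__%i", and a module wanting plain C style names could ask for
something like "%m_%i" instead.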

As I understand it an LGPL library can't be used as part of GCC and there
are two legal prerequisites:

   1.  the licence should be GPLv3 for the compiler or GPLv3 with GCC runtime
       exemptions for a runtime library.
   2.  copyright has to be signed over to the FSF.

(but I stand to be corrected :-)

regards,
Gaius


