Re: gm2 internal compiler error report


From: Gaius Mulley
Subject: Re: gm2 internal compiler error report
Date: Sun, 10 Mar 2024 09:03:16 +0000
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/28.2 (gnu/linux)

Alice Osako <alicetrillianosako@gmail.com> writes:

> I was working on my Unicode project, trying to apply various BITSET and
> BitWordOps operations, when I got an 'internal compiler error', something
> I had not encountered before.
>
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> // Target: x86_64-pc-linux-gnu
> // Configured with: /home/schol-r-lea/Deployments/gm2/gcc/configure 
> --host=x86_64-pc-linux-gnu --target=x86_64-pc-linux-gnu
> --prefix=/home/schol-r-lea/opt --bindir=/home/schol-r-lea/opt/bin 
> --libdir=/home/schol-r-lea/opt/lib --libexecdir=/home/schol-r-lea/opt/lib
> --enable-threads=posix --enable-clocale=gnu --disable-multilib 
> --disable-bootstrap --enable-checking --enable-languages=m2 : (reconfigured)
> /home/schol-r-lea/Deployments/gm2/gcc/configure --host=x86_64-pc-linux-gnu 
> --target=x86_64-pc-linux-gnu --prefix=/home/schol-r-lea/opt
> --bindir=/home/schol-r-lea/opt/bin --libdir=/home/schol-r-lea/opt/lib 
> --libexecdir=/home/schol-r-lea/opt/lib --enable-threads=posix 
> --enable-clocale=gnu
> --disable-multilib --disable-bootstrap --enable-checking --enable-languages=m2
> // Thread model: posix
> // Supported LTO compression algorithms: zlib zstd
> // gcc version 13.2.1 20240223 (GCC) 
> // 
> //  _M2_UTF8_init _M2_UTF8_fini
> // -quiet: internal compiler error: not expecting this kind of symbol
> // 0x1f07117 internal_error(char const*, ...)
> //     ???:0
> // 0x8cbbd2 m2linemap_internal_error
> //     ???:0
> // 0x99651b M2Emit_InternalError
> //     ???:0
> // 0x8f5f6e M2Error_InternalError
> //     ???:0
> // 0x957233 SymbolTable_IsValueSolved
> //     ???:0
> // 0x8e1eed M2ALU_TryEvaluateValue
> //     ???:0
> // 0x8fb4d6 M2GCCDeclare_TryDeclareConstructor
> //     ???:0
> // 0x8fb542 M2GCCDeclare_TryDeclareConstant
> //     ???:0
> // 0x906e9e M2GenGCC_ResolveConstantExpressions
> //     ???:0
> // 0x938aec M2Scope_ForeachScopeBlockDo
> //     ???:0
> // 0x8f2242 M2Code_CodeBlock
> //     ???:0
> // 0x8df9aa Lists_ForeachItemInListDo
> //     ???:0
> // 0x8f23b6 M2Code_CodeBlock
> //     ???:0
> // 0x8f275a M2Code_Code
> //     ???:0
> // 0x8f35d1 M2Comp_compile
> //     ???:0
> // Please submit a full bug report, with preprocessed source.
> // Please include the complete backtrace with any bug report.
> // See <https://gcc.gnu.org/bugs/> for instructions.
>
> // /home/schol-r-lea/opt/lib/gcc/x86_64-pc-linux-gnu/13.2.1/cc1gm2 -quiet 
> -dumpdir bin/ -dumpbase UTF8.mod -dumpbase-ext .mod -mtune=generic
> -march=x86-64 -g -fiso -freport-bug -fm2-pathname=- -fm2-pathnameIdefs/ 
> -fm2-pathnameI. -fscaffold-dynamic -flibs=m2iso,m2cor,m2pim,m2log
> -fm2-pathname=- -fm2-pathnameIdefs/ -fm2-pathnameI. impls/UTF8.mod -c -o - 
> -frandom-seed=0 -fdump-noaddr
>
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> The procedure which seems to have triggered this is:
>
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> PROCEDURE Utf8ToUnichar(utf8: UTF8Buffer; VAR ch: UNICHAR);
> (*
>    Utf8ToUnichar - Convert a buffer of UTF-8 characters
>                 to the internal UCS-4 format.
>
>    Following RFC 3629 (https://www.rfc-editor.org/rfc/rfc3629#section-3),
>    the mappings between UTF-8 and UCS-4 are as follows:
>
>    Char. number range  |        UTF-8 octet sequence
>       (hexadecimal)    |              (binary)
>    --------------------+---------------------------------------------
>    0000 0000-0000 007F | 0xxxxxxx
>    0000 0080-0000 07FF | 110xxxxx 10xxxxxx
>    0000 0800-0000 FFFF | 1110xxxx 10xxxxxx 10xxxxxx
>    0001 0000-0010 FFFF | 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
>
>    Any candidate character which does not match these cases should be
>    replaced with the REPLACEMENT CHAR.
> *)
>
> VAR
>    edgeBit: SHORTCARD;
>    subChar: ARRAY [1..3] OF BITSET;    (* holds the sub-components of the character *)
>    i : CARDINAL;
>    octet: BITSET;
>
> BEGIN
>    (* clear the output *)
>    ch := 0;
>
>    octet := 0;
>
>    subChar[0] := 0;
>    FOR i := 1 TO 3 DO
>       octet := utf8[i];
>       subChar[i] := octet - {6..7};
>    END;
>
>    (* Which is the last clear bit in the first byte? *)
>    edgeBit := GetEdgeBit(utf8[0]);
>
>    ch := utf8[0] - {7 .. edgeBit};
>
>    CASE edgeBit OF
>       7:
>          (* A single-byte ASCII char, just use as-is *) |
>       5:
>
>          (* use two bytes for the value *)
>          ch := WordOr(ch, WordShl(subChar[1], 6)); |
>       4:
>          (* use three bytes for the value *)
>          ch := WordOr(ch, WordShl(subChar[1], 6));
>          ch := WordOr(ch, WordShl(subChar[2], 12)); |
>       3:
>          (* use four bytes for the value *)
>          ch := WordOr(ch, WordShl(subChar[1], 6));
>          ch := WordOr(ch, WordShl(subChar[2], 12));
>          ch := WordOr(ch, WordShl(subChar[3], 18));
>    ELSE
>       (* should never happen, return the REPLACEMENT CHAR *)
>       ch := Replacement;
>    END;
>
> END Utf8ToUnichar;
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> The wider context for this can be found at the project repo,
> https://github.com/Schol-R-LEA/UNICODE-For-Modula-2/ , with the erroneous
> file being 'UTF8.mod'.
>
> Despite the fact that it triggered an internal error for the compiler,
> I am certain that the code itself is deeply flawed. I am simply too
> inexperienced with Modula-2 bitwise operations to say what I have done
> wrong here, and unfortunately I lack a proper language reference to
> work from (recommendations would be welcome).

Hi Alice,

The language references I use are:  "Programming in Modula-2" 4th Ed
(and 2nd, 3rd) by Niklaus Wirth.

   https://freepages.modula2.org/report4/modula-2.html

and the ISO m2 standard, which sadly is not available online.
(Also worth exploring: https://freepages.modula2.org/tutor.html).
Maybe someone on the list would know whether there are ISO m2 books
freely available?

Thanks for the bug report - I believe this is fixed in the latest gcc git
repository (gcc-14):

$ gm2 -g -Idefs -fiso -c impls/UTF8.mod 
impls/UTF8.mod:47:13: error: In procedure ‘GetExtChar’: type incompatibility between ‘CARDINAL’ and ‘BYTE’
   47 |    RETURN b - {7, 6};
      |           ~~^~~~~~~~

$ gm2 --version
gm2 (GCC) 14.0.1 20240308 (experimental)

I'll backport the fix to gm2 13.2.
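
On the decoding itself: one way to sidestep mixing BITSET, BYTE and
CARDINAL operands (the kind of incompatibility reported above) is to do
the whole conversion in CARDINAL arithmetic, using MOD to mask bits and
multiplication by 64 to shift. The following is only a minimal sketch
under that assumption, not the code from UTF8.mod: the module name
Utf8Sketch, the type Octets and the procedure Decode are hypothetical,
and each octet is assumed to have already been widened to a CARDINAL in
the range 0..255. It follows the RFC 3629 table quoted above, but it
does not reject overlong encodings or surrogate code points.

```modula2
MODULE Utf8Sketch;

(* Hypothetical stand-alone example; none of these names come from
   UTF8.mod.  Decodes one UTF-8 sequence with CARDINAL arithmetic
   only, so no BITSET/BYTE/CARDINAL mixing is required.  *)

FROM STextIO IMPORT WriteLn;
FROM SWholeIO IMPORT WriteCard;

CONST
   Replacement = 0FFFDH;   (* U+FFFD REPLACEMENT CHARACTER *)

TYPE
   Octets = ARRAY [0..3] OF CARDINAL;   (* each element holds 0..255 *)

PROCEDURE Decode (buf: Octets) : CARDINAL;
VAR
   lead, extra, i, ch: CARDINAL;
BEGIN
   lead := buf[0];
   IF lead < 80H THEN                               (* 0xxxxxxx *)
      RETURN lead
   ELSIF (lead >= 0C0H) AND (lead < 0E0H) THEN      (* 110xxxxx *)
      ch := lead MOD 32; extra := 1
   ELSIF (lead >= 0E0H) AND (lead < 0F0H) THEN      (* 1110xxxx *)
      ch := lead MOD 16; extra := 2
   ELSIF (lead >= 0F0H) AND (lead < 0F8H) THEN      (* 11110xxx *)
      ch := lead MOD 8; extra := 3
   ELSE
      RETURN Replacement      (* stray continuation or invalid lead *)
   END;
   FOR i := 1 TO extra DO
      IF buf[i] DIV 64 # 2 THEN   (* continuation must be 10xxxxxx *)
         RETURN Replacement
      END;
      (* shift the accumulated value left 6 bits, add the low 6 bits *)
      ch := ch * 64 + buf[i] MOD 64
   END;
   RETURN ch
END Decode;

VAR
   euro: Octets;
BEGIN
   (* E2 82 AC is the UTF-8 encoding of U+20AC (EURO SIGN) *)
   euro[0] := 0E2H; euro[1] := 82H; euro[2] := 0ACH; euro[3] := 0;
   WriteCard(Decode(euro), 0);   (* U+20AC is 8364 in decimal *)
   WriteLn
END Utf8Sketch.
```

Note that the lead octet supplies the most significant bits, so the
value is accumulated from the first octet downwards. The sketch uses
the ISO libraries, so it would need to be compiled with -fiso.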

regards,
Gaius


