gm2 internal compiler error report


From: Alice Osako
Subject: gm2 internal compiler error report
Date: Sun, 10 Mar 2024 03:31:18 -0400
User-agent: Mozilla Thunderbird

I was working on my Unicode project, trying to apply various BITSET and BitWordOps operations, when I got an 'internal compiler error', something I had not encountered before.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

// Target: x86_64-pc-linux-gnu
// Configured with: /home/schol-r-lea/Deployments/gm2/gcc/configure --host=x86_64-pc-linux-gnu --target=x86_64-pc-linux-gnu --prefix=/home/schol-r-lea/opt --bindir=/home/schol-r-lea/opt/bin --libdir=/home/schol-r-lea/opt/lib --libexecdir=/home/schol-r-lea/opt/lib --enable-threads=posix --enable-clocale=gnu --disable-multilib --disable-bootstrap --enable-checking --enable-languages=m2 : (reconfigured) /home/schol-r-lea/Deployments/gm2/gcc/configure --host=x86_64-pc-linux-gnu --target=x86_64-pc-linux-gnu --prefix=/home/schol-r-lea/opt --bindir=/home/schol-r-lea/opt/bin --libdir=/home/schol-r-lea/opt/lib --libexecdir=/home/schol-r-lea/opt/lib --enable-threads=posix --enable-clocale=gnu --disable-multilib --disable-bootstrap --enable-checking --enable-languages=m2
// Thread model: posix
// Supported LTO compression algorithms: zlib zstd
// gcc version 13.2.1 20240223 (GCC)
//
//  _M2_UTF8_init _M2_UTF8_fini
// -quiet: internal compiler error: not expecting this kind of symbol
// 0x1f07117 internal_error(char const*, ...)
//     ???:0
// 0x8cbbd2 m2linemap_internal_error
//     ???:0
// 0x99651b M2Emit_InternalError
//     ???:0
// 0x8f5f6e M2Error_InternalError
//     ???:0
// 0x957233 SymbolTable_IsValueSolved
//     ???:0
// 0x8e1eed M2ALU_TryEvaluateValue
//     ???:0
// 0x8fb4d6 M2GCCDeclare_TryDeclareConstructor
//     ???:0
// 0x8fb542 M2GCCDeclare_TryDeclareConstant
//     ???:0
// 0x906e9e M2GenGCC_ResolveConstantExpressions
//     ???:0
// 0x938aec M2Scope_ForeachScopeBlockDo
//     ???:0
// 0x8f2242 M2Code_CodeBlock
//     ???:0
// 0x8df9aa Lists_ForeachItemInListDo
//     ???:0
// 0x8f23b6 M2Code_CodeBlock
//     ???:0
// 0x8f275a M2Code_Code
//     ???:0
// 0x8f35d1 M2Comp_compile
//     ???:0
// Please submit a full bug report, with preprocessed source.
// Please include the complete backtrace with any bug report.
// See <https://gcc.gnu.org/bugs/> for instructions.

// /home/schol-r-lea/opt/lib/gcc/x86_64-pc-linux-gnu/13.2.1/cc1gm2 -quiet -dumpdir bin/ -dumpbase UTF8.mod -dumpbase-ext .mod -mtune=generic -march=x86-64 -g -fiso -freport-bug -fm2-pathname=- -fm2-pathnameIdefs/ -fm2-pathnameI. -fscaffold-dynamic -flibs=m2iso,m2cor,m2pim,m2log -fm2-pathname=- -fm2-pathnameIdefs/ -fm2-pathnameI. impls/UTF8.mod -c -o - -frandom-seed=0 -fdump-noaddr


~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The procedure that seems to have triggered this is:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
PROCEDURE Utf8ToUnichar(utf8: UTF8Buffer; VAR ch: UNICHAR);
(*
   Utf8ToUnichar - Convert a buffer of UTF-8 characters
                to the internal UCS-4 format.

   Following RFC 3629 (https://www.rfc-editor.org/rfc/rfc3629#section-3),
   the mappings between UTF-8 and UCS-4 are as follows:

   Char. number range  |        UTF-8 octet sequence
      (hexadecimal)    |              (binary)
   --------------------+---------------------------------------------
   0000 0000-0000 007F | 0xxxxxxx
   0000 0080-0000 07FF | 110xxxxx 10xxxxxx
   0000 0800-0000 FFFF | 1110xxxx 10xxxxxx 10xxxxxx
   0001 0000-0010 FFFF | 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx

   Any candidate character which does not match these cases should be
   replaced with the REPLACEMENT CHAR.
*)


VAR
   edgeBit: SHORTCARD;
   subChar: ARRAY [1..3] OF BITSET;    (* holds the sub-components of the character *)
   i : CARDINAL;
   octet: BITSET;

BEGIN
   (* clear the output *)
   ch := 0;

   octet := 0;

   subChar[0] := 0;
   FOR i := 1 TO 3 DO
      octet := utf8[i];
      subChar[i] := octet - {6..7};
   END;


   (* Which is the last clear bit in the first byte? *)
   edgeBit := GetEdgeBit(utf8[0]);


   ch := utf8[0] - {7 .. edgeBit};

   CASE edgeBit OF
      7:
         (* A single-byte ASCII char, just use as-is *) |
      5:

         (* use two bytes for the value *)
         ch := WordOr(ch, WordShl(subChar[1], 6)); |
      4:
         (* use three bytes for the value *)
         ch := WordOr(ch, WordShl(subChar[1], 6));
         ch := WordOr(ch, WordShl(subChar[2], 12)); |
      3:
         (* use four bytes for the value *)
         ch := WordOr(ch, WordShl(subChar[1], 6));
         ch := WordOr(ch, WordShl(subChar[2], 12));
         ch := WordOr(ch, WordShl(subChar[3], 18));
   ELSE
      (* should never happen, return the REPLACEMENT CHAR *)
      ch := Replacement;
   END;

END Utf8ToUnichar;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The wider context can be found in the project repo, https://github.com/Schol-R-LEA/UNICODE-For-Modula-2/ ; the file that triggers the error is 'UTF8.mod'.

Although the compiler should not crash regardless of its input, I am certain that the code itself is deeply flawed. I am simply too inexperienced with Modula-2 bitwise operations to say what I have done wrong here, and unfortunately I lack a proper language reference to work from (recommendations would be welcome).
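
For reference, the RFC 3629 mapping quoted in the procedure's comment can also be expressed with plain CARDINAL arithmetic (DIV, MOD and multiplication) instead of set operations. The following is only an illustrative sketch, not the code from the repository: the module name Utf8Sketch, the Octets type and the Decode procedure are hypothetical, and the sketch does not reject overlong encodings or surrogate code points.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
MODULE Utf8Sketch;

(* Illustrative only: decodes one UTF-8 sequence using CARDINAL
   arithmetic rather than BITSET operations.  All names here are
   hypothetical and do not come from the UTF8 module. *)

FROM STextIO IMPORT WriteLn;
FROM SWholeIO IMPORT WriteCard;

CONST
   Replacement = 0FFFDH;   (* U+FFFD REPLACEMENT CHARACTER *)

TYPE
   Octets = ARRAY [0..3] OF CARDINAL;   (* one octet per element *)

PROCEDURE Decode (buf: Octets) : CARDINAL;
VAR
   ch, extra, i: CARDINAL;
BEGIN
   (* Classify the lead octet and keep only its payload bits. *)
   IF buf[0] < 80H THEN               (* 0xxxxxxx *)
      ch := buf[0]; extra := 0
   ELSIF buf[0] DIV 32 = 6 THEN       (* 110xxxxx *)
      ch := buf[0] MOD 32; extra := 1
   ELSIF buf[0] DIV 16 = 14 THEN      (* 1110xxxx *)
      ch := buf[0] MOD 16; extra := 2
   ELSIF buf[0] DIV 8 = 30 THEN       (* 11110xxx *)
      ch := buf[0] MOD 8; extra := 3
   ELSE
      RETURN Replacement
   END;
   (* Each continuation octet contributes its low six bits. *)
   FOR i := 1 TO extra DO
      IF buf[i] DIV 64 # 2 THEN       (* must be 10xxxxxx *)
         RETURN Replacement
      END;
      ch := ch * 64 + buf[i] MOD 64
   END;
   RETURN ch
END Decode;

VAR
   sample: Octets;

BEGIN
   (* U+20AC EURO SIGN is encoded as E2 82 AC. *)
   sample[0] := 0E2H; sample[1] := 82H; sample[2] := 0ACH; sample[3] := 0;
   WriteCard(Decode(sample), 0);      (* the decoded value is 8364 = 20ACH *)
   WriteLn
END Utf8Sketch.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Note that the lead octet's payload ends up in the most significant bits: each step multiplies the accumulated value by 64 before adding the next continuation octet, so the last octet supplies the lowest six bits. The sketch uses only the ISO library modules STextIO and SWholeIO, so it should build with the same -fiso option shown in the command line above.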

