Re: Unicode I/O for GM2


From: Gaius Mulley
Subject: Re: Unicode I/O for GM2
Date: Tue, 26 Mar 2024 00:40:07 +0000
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/28.2 (gnu/linux)

Benjamin Kowarsch <trijezdci@gmail.com> writes:

> On Mon, 25 Mar 2024 at 01:24, Alice Osako wrote:
>
>  While the immediate concern is the test module, which is meant to show that 
> the test characters are correctly manipulated by displaying them to the
>  console, there is a general need for an I/O library for both file and 
> console I/O.
>
>  I've tried to solve this problem a few different ways, first using the ISO 
> RawIO operations Read and Write, then with the GCC Base library operations
>  ReadNBytes and WriteNBytes. While I have not tested how they work for file 
> I/O yet, for console I/O the displayed characters are being truncated to
>  display only the first byte of the wide character:
>  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>   0: 'a' [U+0061] is a valid codepoint; is a printable ASCII character; is in 
> the BMP. -> 97 -> 'a' 
>   1: 'a' [U+0061] is a valid codepoint; is a printable ASCII character; is in 
> the BMP. -> 97 -> 'a' 
>   2: 'a' [U+0061] is a valid codepoint; is a printable ASCII character; is in 
> the BMP. -> 97 -> 'a' 
>   3: ' ' [U+0120] is a valid codepoint; is not a printable ASCII character; 
> is in the BMP. -> 32 -> ' ' 
>   4: '�' [U+00C1] is a valid codepoint; is not a printable ASCII character; 
> is in the BMP. -> 193 -> '�' 
>   5: '�' [U+00C1] is a valid codepoint; is not a printable ASCII character; 
> is in the BMP. -> 193 -> '�' 
>   6: 'A' [U+0141] is a valid codepoint; is not a printable ASCII character; 
> is in the BMP. -> 65 -> 'A' 
>   7: '' [U+FFFD] is a valid codepoint; is not a printable ASCII character; is 
> in the BMP. -> 29 -> '' 
>  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> My hunch is that the console device driver used by GM2's ISO I/O library is 
> designed only for single-byte character output.
>
> If so, then the device driver will either need to be modified or 
> replaced/bypassed.
>
> To verify this will require some digging, though. Maybe Gaius can shed
> some light on it.

Hi Benjamin,

Yes, it is single-byte character I/O. I've just added some test code to
verify this behaviour; it will appear in gcc/testsuite/gm2/pimlib/run/pass.
The test module and an od dump of the file it writes are below.

regards,
Gaius


MODULE testchar ;

FROM FIO IMPORT File, OpenToWrite, OpenToRead,
                Close, WriteChar, ReadChar, IsNoError ;

FROM libc IMPORT printf, exit ;


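(*
   createFile - write every CHAR value, from MIN (CHAR) to MAX (CHAR),
                to test.txt one character at a time.  Exit with
                status 1 on the first write error.
*)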

PROCEDURE createFile ;
VAR
   fo: File ;
   ch: CHAR ;
BEGIN
   fo := OpenToWrite ("test.txt") ;
   FOR ch := MIN (CHAR) TO MAX (CHAR) DO
      WriteChar (fo, ch) ;
      IF NOT IsNoError (fo)
      THEN
         printf ("failure to write: %c\n", ch);
         exit (1)
      END
   END ;
   Close (fo)
END createFile ;


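(*
   readFile - read test.txt back one character at a time and check
              that each character matches the value written by
              createFile.  Exit with status 1 on a read error or
              on the first mismatch.
*)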

PROCEDURE readFile ;
VAR
   fi    : File ;
   ch, in: CHAR ;
BEGIN
   fi := OpenToRead ("test.txt") ;
   FOR ch := MIN (CHAR) TO MAX (CHAR) DO
      in := ReadChar (fi) ;
      IF NOT IsNoError (fi)
      THEN
         printf ("failure to read: %c\n", ch);
         exit (1)
      END ;
      IF ch # in
      THEN
         printf ("failure to verify: %c\n", ch);
         exit (1)
      END
   END ;
   Close (fi)
END readFile ;


(*
   init - create the test file and then verify its contents.
*)

PROCEDURE init ;
BEGIN
   createFile ;
   readFile
END init ;


BEGIN
   init
END testchar.


$ od -x test.txt
0000000 0100 0302 0504 0706 0908 0b0a 0d0c 0f0e
0000020 1110 1312 1514 1716 1918 1b1a 1d1c 1f1e
0000040 2120 2322 2524 2726 2928 2b2a 2d2c 2f2e
0000060 3130 3332 3534 3736 3938 3b3a 3d3c 3f3e
0000100 4140 4342 4544 4746 4948 4b4a 4d4c 4f4e
0000120 5150 5352 5554 5756 5958 5b5a 5d5c 5f5e
0000140 6160 6362 6564 6766 6968 6b6a 6d6c 6f6e
0000160 7170 7372 7574 7776 7978 7b7a 7d7c 7f7e
0000200 8180 8382 8584 8786 8988 8b8a 8d8c 8f8e
0000220 9190 9392 9594 9796 9998 9b9a 9d9c 9f9e
0000240 a1a0 a3a2 a5a4 a7a6 a9a8 abaa adac afae
0000260 b1b0 b3b2 b5b4 b7b6 b9b8 bbba bdbc bfbe
0000300 c1c0 c3c2 c5c4 c7c6 c9c8 cbca cdcc cfce
0000320 d1d0 d3d2 d5d4 d7d6 d9d8 dbda dddc dfde
0000340 e1e0 e3e2 e5e4 e7e6 e9e8 ebea edec efee
0000360 f1f0 f3f2 f5f4 f7f6 f9f8 fbfa fdfc fffe
0000400
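
A note on reading the dump above: od -x prints the file as 16-bit words
in host byte order, so on a little-endian machine each pair of bytes
appears swapped (the first word, 0100, is the bytes 00 01 in the file).
The sketch below, assuming a POSIX shell and od, rebuilds the first four
bytes in a scratch file and dumps them one byte per column so they show
up in file order. The name demo.bin is hypothetical and is not part of
the test.

```shell
# Recreate the first four bytes of test.txt (00 01 02 03) in a scratch file.
# demo.bin is a hypothetical file name used only for this illustration.
printf '\000\001\002\003' > demo.bin

# -An suppresses the offset column; -tx1 prints one byte per column,
# in file order, independent of the host's endianness.
od -An -tx1 demo.bin
```

Running od -An -tx1 on test.txt itself should likewise list the bytes
from 00 up to ff in ascending order, one byte per character written.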


