Re: [Tinycc-devel] How to make TCC support utf8

From: Samir Ribić
Subject: Re: [Tinycc-devel] How to make TCC support utf8
Date: Thu, 9 Jun 2022 09:21:38 +0200

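UTF-8 is compatible with char * because the codes between 0 and 127 are encoded exactly as in ASCII (same value, one byte each), and no byte of a multi-byte sequence is ever zero, so the terminating NUL still works. This is different from 16-bit Unicode (UTF-16). So many functions that are intended for ANSI strings will work with UTF-8 unchanged. What remains is a question of OS support.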

See the following program:

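#include <stdio.h>
#include <windows.h>
int main(void) {
  const char *m = "Конференция u Čačku Οὐχὶ ταὐτὰ παρίσταταί \n";
  // SetConsoleOutputCP(65001);
  printf("%s", m);  /* the print call is missing from the archived message; printf is assumed */
  return 0;
}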

Compile it and run it under Windows (I tried Windows 10). The text is displayed incorrectly. Now uncomment the line SetConsoleOutputCP(65001); (65001 is the UTF-8 code page) and you will see the text displayed correctly.
However, strlen still returns the string length in bytes, not in characters. Likewise, m[5] accesses the byte at index 5, not the character at index 5. So, wherever character counts or positions matter, you need to write your own versions of the string-handling functions.
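
As a rough illustration of what such replacement functions could look like, here is a minimal sketch. The names utf8_strlen and utf8_at are hypothetical (they are not part of TCC or the C library), and the code assumes the input is valid UTF-8. It relies on the fact that every continuation byte in UTF-8 has the bit pattern 10xxxxxx, so counting the bytes that do not match that pattern counts code points. Note that a code point is not always what a reader perceives as one character (combining marks are separate code points).

```c
#include <stddef.h>

/* Hypothetical helper: number of code points in a NUL-terminated
   UTF-8 string. Assumes the input is valid UTF-8. */
size_t utf8_strlen(const char *s) {
    size_t count = 0;
    for (; *s != '\0'; s++) {
        /* Bytes of the form 10xxxxxx are continuation bytes;
           every other byte starts a new code point. */
        if (((unsigned char)*s & 0xC0) != 0x80)
            count++;
    }
    return count;
}

/* Hypothetical helper: pointer to the first byte of the n-th code
   point (0-based), or NULL if the string has n or fewer code points. */
const char *utf8_at(const char *s, size_t n) {
    for (; *s != '\0'; s++) {
        if (((unsigned char)*s & 0xC0) != 0x80) {
            if (n == 0)
                return s;
            n--;
        }
    }
    return NULL;
}
```

For the word "Čačku" from the program above, strlen reports 7 bytes, while utf8_strlen reports 5 code points, and utf8_at(s, 2) points at the byte where "č" begins (byte offset 3). Both helpers scan from the start of the string, so indexing this way is linear in the string length rather than constant time like m[5].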

On Thu, Jun 9, 2022 at 6:34 AM Larry Doolittle via Tinycc-devel <tinycc-devel@nongnu.org> wrote:
lrd -

On Thu, Jun 09, 2022 at 12:01:09PM +0800, lrt via Tinycc-devel wrote:
> Who can tell me how to make TCC support utf8.
> I want to use the Unicode API.

Just .. don't.

‘Trojan Source’ Bug Threatens the Security of All Code
November 1, 2021
(as seen on slashdot)

  - Larry

Tinycc-devel mailing list
