help-make

Re: Gmake UTF-8 Support?


From: Eli Zaretskii
Subject: Re: Gmake UTF-8 Support?
Date: Thu, 30 Sep 2021 10:37:37 +0300

> Date: Wed, 29 Sep 2021 12:43:57 -0700
> From: "Kaz Kylheku (gmake)" <729-670-0061@kylheku.com>
> Cc: nobozo@gmail.com, help-make@gnu.org
> 
> > Yes, but which run-time library?  Cygwin adds a special, very large,
> > run-time library on top of the C runtime that comes with Windows.
> 
> [Hopefully, I have all the straight dope here; bear with me.]

Unfortunately, you haven't...

> Cygwin isn't on top of any other C run-time; it is based on something
> called Newlib: a BSD-licensed library. (Google's Bionic is based on
> Newlib, I think; or so I guessed when looking at some of the sources
> at one point.)

Cygwin also has a lot of stuff that doesn't come from newlib, but is
needed to provide Posix compatibility on top of Windows.

But that again is beside the point I wanted to make.

> No C run time for application use comes with Windows; only operating
> API DLL's like kernel32, user32, ...
> 
> Windows provides no stdio, no malloc, nothing: or none that are
> documented for developers to use.

The so-called "CRT" functions, including malloc and stdio, are
documented on the Microsoft documentation site, and so programs can
use them if they link against MSVCRT.DLL, which is still available in
the most recent versions of Windows.

> >> Every program on Windows must bring its own run-time library;
> >> Windows doesn't provide one.
> > 
> > That's incorrect.  Windows does provide a stock runtime library, it's
> > called MSVCRT.DLL.
> 
> Right, but that is undocumented for application use which makes it
> de facto off limits to anyone who knows what's good for them.

I don't see how that follows.  In particular, MinGW, the "Minimal GNU
on Windows" development environment, uses it as its C runtime.

> It's a C library that some programs that are part of Microsoft Windows
> use.

It includes a full suite of C runtime functions.

> See here: https://devblogs.microsoft.com/oldnewthing/20140411-00/?p=1273

That blog is written from the POV of users of MSVC, Microsoft's
proprietary C/C++ compiler.  That is of no interest for us here, I
hope, as we are talking about using Free Software, which means GCC and
GNU Binutils, to compile Make.  We cannot use the proprietary MSVCRnn
libraries distributed with MSVC, because that would be a violation of
the GPL.

> If you link to this, you're sticking a fork into Windows.

I disagree.  Again, MinGW is doing that all the time, and is a
first-class citizen in today's development of Free Software on
Windows.

> No professional grade application for Windows should be doing this,
> so for most practical purposes, it should be regarded as nonexistent.

Ha-ha, very funny.  I guess there's then a lot of us "unprofessional"
types that do that.  Look around for native Windows ports of GNU
software, and you will find that they all were compiled by MinGW and
link against MSVCRT.DLL.  Starting with Git for Windows, for example.

> > The redistributables only add newer functions above the default, so
> > that your program could run on older versions of Windows and still use
> > functions added in later versions.
> 
> The redistributable run-time from Microsoft Visual C is a complete,
> self-contained library. It's own malloc, stdio, and everything else;
> it does not rely on the MSVCRT.DLL in the system folder.

That is true, but it doesn't in any way contradict what I said.

> The MinGW make and shell themselves link to something called MSYS.DLL.
> 
> What is that? It's a fork of an ancient version of Cygwin!

You are confusing MinGW with MSYS.  MSYS is indeed a fork of Cygwin,
and is an environment used for building MinGW programs (where you need
to run Bash and other Posixish tools).  MinGW programs, by contrast,
are native Windows applications that don't need any support libraries
except MSVCRT.DLL and other Windows system DLLs (such as kernel32.dll
etc.).

> (Therefore, it's possible that UTF-8 might work with the MinGW make,
> but I have no desire to  blow the dust off that obsolete cruft to
> test this. It perhaps depends on how good the UTF-8 support
> was in the old version of Cygwin from which it was forked, and
> whether that support was preserved in the fork.)

UTF-8 will not work with MinGW programs because the Windows C runtime
(no matter if that's MSVCRT.DLL or the newer MSVCRnn.DLL variants) doesn't
support UTF-8 as a first-class encoding for file names.  (Windows 10
made that support somewhat better, but it is still incomplete and
therefore opt-in, disabled by default.)  A native Windows program that
wants to support UTF-8 encoded file names must convert UTF-8 to UTF-16
on the application level, and then use the so-called "wide" or
"Unicode" APIs or C runtime functions (like _wfopen instead of fopen
etc.) that accept strings of wchar_t "wide characters" instead of
'char *' strings.  That's what Emacs on MS-Windows does, for example,
to be able to support file names that cannot be encoded by the system
codepage.  GNU Make has no such code, so it cannot support UTF-8
encoded file names if compiled as a native application, be that with
MSVC using MSVCRnn libraries or with MinGW using MSVCRT.DLL.
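
To make the conversion step described above concrete, here is a minimal
sketch, not code from GNU Make or Emacs.  The names utf8_to_utf16 and
fopen_utf8 are hypothetical, and so are the fixed buffer sizes.  A real
Windows program would normally call MultiByteToWideChar with CP_UTF8
instead of decoding by hand; the hand-written decoder is used here only so
that the sketch also compiles on non-Windows systems, where it simply
falls back to plain fopen.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#ifdef _WIN32
#include <wchar.h>   /* _wfopen */
#endif

/* Convert a NUL-terminated UTF-8 string to NUL-terminated UTF-16.
   Returns the number of UTF-16 code units written (terminator excluded),
   or (size_t)-1 on malformed input or if dst is too small.
   Hypothetical helper, for illustration only.  */
size_t utf8_to_utf16(const char *src, uint16_t *dst, size_t dst_cap)
{
    /* Smallest code point that legitimately needs 1..4 bytes. */
    static const uint32_t min_cp[4] = { 0, 0x80, 0x800, 0x10000 };
    const unsigned char *s = (const unsigned char *)src;
    size_t n = 0;

    assert(src != NULL && dst != NULL);
    if (dst_cap == 0)
        return (size_t)-1;

    while (*s) {
        uint32_t cp;
        int extra, i;

        /* The lead byte tells how many continuation bytes follow. */
        if (*s < 0x80)                { cp = *s;        extra = 0; }
        else if ((*s & 0xE0) == 0xC0) { cp = *s & 0x1F; extra = 1; }
        else if ((*s & 0xF0) == 0xE0) { cp = *s & 0x0F; extra = 2; }
        else if ((*s & 0xF8) == 0xF0) { cp = *s & 0x07; extra = 3; }
        else return (size_t)-1;
        s++;

        for (i = 0; i < extra; i++, s++) {
            if ((*s & 0xC0) != 0x80)      /* also catches a premature NUL */
                return (size_t)-1;
            cp = (cp << 6) | (*s & 0x3F);
        }

        /* Reject overlong forms, surrogates, and values past U+10FFFF. */
        if (cp < min_cp[extra] || cp > 0x10FFFF
            || (cp >= 0xD800 && cp <= 0xDFFF))
            return (size_t)-1;

        if (cp < 0x10000) {
            if (n + 1 >= dst_cap) return (size_t)-1;
            dst[n++] = (uint16_t)cp;
        } else {
            /* Outside the BMP: encode as a surrogate pair. */
            if (n + 2 >= dst_cap) return (size_t)-1;
            cp -= 0x10000;
            dst[n++] = (uint16_t)(0xD800 | (cp >> 10));
            dst[n++] = (uint16_t)(0xDC00 | (cp & 0x3FF));
        }
    }
    dst[n] = 0;
    return n;
}

/* Open a file whose name is UTF-8 encoded.  On Windows the name and mode
   are converted and passed to the wide-character _wfopen; elsewhere the
   bytes go to fopen unchanged.  Hypothetical wrapper.  */
FILE *fopen_utf8(const char *name, const char *mode)
{
#ifdef _WIN32
    /* wchar_t is 16 bits wide on Windows; 260 and 16 are arbitrary
       sizes chosen for this sketch. */
    wchar_t wname[260], wmode[16];

    if (utf8_to_utf16(name, (uint16_t *)wname, 260) == (size_t)-1
        || utf8_to_utf16(mode, (uint16_t *)wmode, 16) == (size_t)-1)
        return NULL;
    return _wfopen(wname, wmode);
#else
    return fopen(name, mode);
#endif
}
```

A program would then have to call fopen_utf8 (and similar wrappers for
stat, opendir, and so on) everywhere it now calls the plain 'char *'
functions, which is the kind of source-level change referred to here.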

When a GNU program such as Make is compiled without source-level
changes of file-access functionalities, it can only support file names
encoded in the current system codepage, and Windows still doesn't
allow setting a UTF-8 codepage as the system one in all contexts, even
on Windows 10 and even after turning on that optional feature.  That
deficiency in native Windows APIs and CRT functions is the root cause
why UTF-8 encoded file names in Makefiles cannot be supported on
MS-Windows.

> (Speaking of Cygwin and MinGW: Cygwin has MinGW compilers as a
> package. If you think building programs in the MinGW way is
> a good idea, you can use Cygwin for that: just install the MinGW
> compiler package(s) and refer to those. You will have a much
> better environment with more up-to-date tools, for which it is
> much easier to build new tools.)

Yes, and the MinGW tools provided by Cygwin produce programs that link
with MSVCRT.DLL, because they are just MinGW cross-tools, nothing
more, nothing less.  Using them is like using a cross-compilation MinGW
environment on GNU/Linux.  The products are MinGW programs, which are
native Windows PE executables that are linked against MSVCRT.DLL.

> > MSVCRT.DLL, which is part of the OS and thus does not cause violation
> > of the GPL.
> 
> While that is true, it causes a violation of your contract with
> MS Windows.

No, it doesn't.  Exactly like linking against kernel32.dll doesn't.
These are system libraries, and the GPL allows linking with them.

> There is no documented requirement anywhere about what is in this
> library, or what will be in it in the next version of Windows, or
> whether it will exist at all, under what name, with what calling
> conventions, etc.

That is simply incorrect.  The CRT functions are fully documented, and
Microsoft at one point even published the source code of MSVCRT.DLL (I
still have it on my machine) as part of their SDK, so this stuff is
well documented and no secret to anyone who wants to know.


