Re: on eshell's encoding

From: Eli Zaretskii
Subject: Re: on eshell's encoding
Date: Wed, 27 Jul 2016 19:22:05 +0300

> From: Yuri Khan <address@hidden>
> Date: Wed, 27 Jul 2016 19:15:45 +0600
> Cc: "address@hidden" <address@hidden>
> On Wed, Jul 27, 2016 at 6:56 PM, Daniel Bastos <address@hidden> wrote:
> > I meant not being messed with.  I don't know anything about MS-Windows.
> > In UNIX the creation of a new process by a shell is likely to call
> > execve, which won't touch the caller strings passed in through the
> > argv-argument.
> Well, Windows is a different beast entirely. The basic premise is the
> same, in that the parent invokes CreateProcessW, passing a
> UTF-16-encoded command line, and the child process invokes
> GetCommandLineW and then optionally CommandLineToArgvW to split the
> command line into arguments.

So it isn't a different beast, really.  Both on Unix and on Windows,
Emacs encodes the command line before passing it to system APIs.  The
details differ, but not the basic idea.
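
To illustrate the shared idea, here is a minimal Python sketch. It is not
Emacs's actual code; the function name `encode_arg` and its parameters are
made up for this example. It only shows that in both cases a string becomes
bytes before the system call, and that only the target encoding differs.

```python
def encode_arg(arg: str, platform: str, locale_codeset: str = "utf-8") -> bytes:
    """Hypothetical helper: encode one argument the way the OS API expects it."""
    if platform == "windows":
        # The wide (W) Win32 APIs take UTF-16 little-endian text.
        return arg.encode("utf-16-le")
    # On Unix, execve receives raw bytes; the caller picks the codeset,
    # normally the one the locale specifies (assumed UTF-8 by default here).
    return arg.encode(locale_codeset)


unix_utf8 = encode_arg("é", "unix")
unix_latin1 = encode_arg("é", "unix", "latin-1")
windows_wide = encode_arg("é", "windows")
print(unix_utf8, unix_latin1, windows_wide)
```

The same one-character argument ends up as three different byte sequences,
which is why the receiving program has to agree with the sender on the
encoding.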

> Problem is, most programs prefer to work internally with 8-bit-based
> encodings, and the Win32 API makes it very easy by providing backward
> compatibility wrapper functions CreateProcessA and GetCommandLineA,
> which unfortunately convert from/to the ANSI or OEM encoding defined
> by the locale.

Nitpicking: those wrappers always convert to the ANSI encoding, never
to the OEM one.
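
The distinction matters because the two code pages assign different bytes
to the same character. A small Python sketch, assuming a Western European
system where the ANSI code page is 1252 and the OEM code page is 437
(Python's codecs only approximate what the Windows conversion routines do):

```python
text = "é"

# Same character, different byte in each code page.
ansi_bytes = text.encode("cp1252")
oem_bytes = text.encode("cp437")

# Bytes produced under the ANSI code page, misread as OEM, come back
# as some other character.
misread = ansi_bytes.decode("cp437")

# Characters outside the code page cannot be represented at all;
# here they are replaced by question marks.
lossy = "Привет".encode("cp1252", errors="replace")

print(ansi_bytes, oem_bytes, misread, lossy)
```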

> And there is no Win32 locale for which UTF-8 is either the ANSI or
> the OEM encoding.

It's actually worse than that: the Windows locale implementation
doesn't support variable-length encodings, so UTF-8 cannot be a
locale's encoding, unless MS change their related runtime libraries in
a radical way.
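
For reference, this is what "variable-length" means for UTF-8: a character
takes between one and four bytes. The sketch below only demonstrates that
property in Python; it says nothing about the internals of the Windows
runtime libraries.

```python
samples = ["a", "é", "€", "😀"]

# Number of bytes each character occupies when encoded as UTF-8.
utf8_lengths = [len(ch.encode("utf-8")) for ch in samples]

# In a single-byte code page such as cp1252, every representable
# character occupies exactly one byte.
cp1252_lengths = [len(ch.encode("cp1252")) for ch in samples[:3]]

print(utf8_lengths, cp1252_lengths)
```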

> This one point makes it very difficult to use Windows in the Unix Way:
> you get to worry about encodings on every process boundary.

Same on Unix, unless you are willing to bet on UTF-8 being the
locale's codeset.
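
A short Python sketch of what that bet looks like when it fails. The helper
`decode_arg` is hypothetical; the point is that bytes produced under a
Latin-1 locale are not valid UTF-8, so a receiver that assumes UTF-8 cannot
decode them.

```python
import locale


def decode_arg(raw: bytes, codeset: str) -> str:
    """Hypothetical helper: decode an argv entry using the given codeset."""
    return raw.decode(codeset)


# Bytes as a process running under a Latin-1 locale would pass them.
raw = "é".encode("latin-1")

try:
    decoded_as_utf8 = decode_arg(raw, "utf-8")
except UnicodeDecodeError:
    decoded_as_utf8 = None  # the UTF-8 assumption did not hold

decoded_as_latin1 = decode_arg(raw, "latin-1")

# What this process's own locale claims; the result depends on the system.
print(locale.getpreferredencoding(False), decoded_as_utf8, decoded_as_latin1)
```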
