bug#23076: 24.5; vc-git: add a new variable for log output coding system

From: Eli Zaretskii
Subject: bug#23076: 24.5; vc-git: add a new variable for log output coding system
Date: Mon, 04 Apr 2016 18:22:30 +0300

> From: Nikolay Kudryavtsev <address@hidden>
> Cc: address@hidden
> Date: Sun, 3 Apr 2016 23:34:13 +0300
> Hello Eli.
> Just to explain the underlying issue.


> With emacs -Q try committing to the same repository by copy-pasting the 
> previous commit message. Then do git log from shell. Your commit message 
> would get broken.
> This happens because git on Windows expects the commit message to be in your 
> Windows "language for non-Unicode programs" encoding. Then it recodes from it 
> to utf-8.

I think this conclusion is wrong.  The real reason for the problem is
that Emacs on Windows invokes subordinate programs in a way that
non-ASCII characters in the command-line arguments can only be encoded
in the system codepage.  And Emacs uses the -m command-line argument
to pass the commit log message to Git.  IOW, the problem is not with
Git, the problem is with how Emacs on Windows invokes it.  (For
complicated reasons I won't go into, this general problem cannot be
easily fixed in Emacs.)

So any non-ASCII text encoded in some encoding other than the current
system codepage will become garbled even before it gets to Git.

> So, to be able to commit in Russian we:
> 1. Change language for non-Unicode programs to Russian.
> 2. (setq vc-git-commits-coding-system 'windows-1251)

This solution doesn't really work for the reasons explained above.

> After doing this, committing in Russian would work. But now our C-x v l is 
> broken.

"C-x v l" is broken because it uses the same value of
vc-git-commits-coding-system to read what Git outputs, whereas Git
outputs in UTF-8.
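
The mismatch can be reproduced without Git at all.  The following is
only a minimal sketch: `my-read-git-output' is a hypothetical helper,
not part of vc-git.el, and the UTF-8 byte string merely stands in for
what Git would print.

```elisp
(defun my-read-git-output (bytes coding-system)
  "Decode BYTES, standing in for Git's output, using CODING-SYSTEM."
  (decode-coding-string bytes coding-system))

;; Simulate Git emitting a Russian log message encoded in UTF-8.
(let ((git-bytes (encode-coding-string "Привет" 'utf-8)))
  (list
   ;; Decoding with the encoding Git actually used round-trips.
   (my-read-git-output git-bytes 'utf-8)
   ;; Decoding with windows-1251 reads each UTF-8 byte as a separate
   ;; character, so the text comes out garbled.
   (my-read-git-output git-bytes 'windows-1251)))
```

The first element of the result is the original string; the second is
twice as long and unreadable, which is what shows up in the log buffer.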

> We can either fix it by setting logoutputencoding in git, but this would 
> break git log outside of emacs, or add a new variable to vc, and that's what 
> I want.

I don't think this is the right solution, see below.

> That's a relatively recent change in git, from 2013 or 2014, so if you're 
> using some really old version, everything might just work out of box. 

I have Git 2.8.0, the latest official release.

Since the problem is (a) specific to MS-Windows, and (b) related to
encoding the command-line arguments, the solution should target the
root cause and nothing else, IMO.  Introducing a separate variable
that users would need to configure therefore doesn't sound like the
best idea.  Moreover, on MS-Windows any value of that additional
variable that is not exactly equal to the current system codepage will
simply fail to work.

So instead, I can suggest one of the following alternatives, to be
done only when invoking Git to commit on MS-Windows:

 1) ignore vc-git-commits-coding-system and always encode the
    command-line arguments using the system locale (in your case,
    codepage 1251); or

 2) put the log message in a temporary file, encoded in
    vc-git-commits-coding-system, then use -F instead of -m; the rest
    of the command-line arguments will be encoded in the system
    locale's codepage.
The 1st solution is essentially what you wanted, but without the need
to introduce an additional variable or ask the users to configure it.

The 2nd solution is somewhat slower, but it is better, because it will
allow writing log messages using any characters, not just those
representable in the current codepage.  Note that it still doesn't
solve all the problems with non-ASCII characters, because those could
be in the "author" or any of the other arguments with which we call
Git, such as the names of the files whose changes are to be committed
(as Emacs does support arbitrary characters in file names).
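
Whether a given string is "representable in the current codepage" can
be checked from Lisp.  This is only an illustrative sketch:
`my-fits-codepage-p' is a hypothetical helper, and windows-1251 is
hard-coded here in place of whatever the system codepage happens to
be.

```elisp
(defun my-fits-codepage-p (text coding-system)
  "Return non-nil if every character of TEXT is encodable in CODING-SYSTEM."
  ;; With a string as the 5th argument, START and END are string
  ;; indices; the function returns nil when nothing is unencodable.
  (null (unencodable-char-position
         0 (length text) coding-system nil text)))

;; Cyrillic text fits codepage 1251, so -m could carry it:
(my-fits-codepage-p "Привет" 'windows-1251)

;; Mixed Cyrillic and CJK text does not fit, so it would need the
;; -F route (and would still be a problem in a file name or author):
(my-fits-codepage-p "Привет, 世界" 'windows-1251)
```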

