emacs-devel
Re: Multibyte and unibyte file names


From: Eli Zaretskii
Subject: Re: Multibyte and unibyte file names
Date: Thu, 24 Jan 2013 18:37:35 +0200

> From: Michael Albinus <address@hidden>
> Date: Wed, 23 Jan 2013 21:58:59 +0100
> Cc: address@hidden, address@hidden
> 
> Eli Zaretskii <address@hidden> writes:
> 
> > For example, in the particular case of file-name-directory, I think
> > Tramp should simply do its job by a straightforward removal of the
> > portion after the last slash in Lisp, instead of calling the native
> > implementation.
> 
> This would duplicate code. I try to avoid that when possible.

I think we have no choice in this case.  There's no reason to assume
that processing remote file names with code that is based on the local
filesystem will DTRT.  If it works, it's by sheer luck.
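
To illustrate the "straightforward removal of the portion after the
last slash" mentioned above, here is a minimal sketch in Lisp.  The
function name is hypothetical and is not part of Tramp; it assumes the
remote host uses "/" as its only directory separator and that the
argument is the local part of the remote file name.

```elisp
;; Hypothetical helper, not part of Tramp.  Assumes "/" is the only
;; directory separator on the remote host.
(defun my-remote-file-name-directory (localname)
  "Return LOCALNAME up to and including its last slash, or nil if none."
  ;; Find the last slash: a slash followed only by non-slash
  ;; characters up to the end of the string.
  (let ((pos (string-match-p "/[^/]*\\'" localname)))
    (when pos
      (substring localname 0 (1+ pos)))))

;; Usage:
;; (my-remote-file-name-directory "/home/user/file.txt")
```

Since this works on characters of a Lisp string only, it does not
depend on how the local filesystem treats file names.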

> >> I agree, Tramp shall check carefully what a file name encoding is. This
> >> must be added to the code.
> >
> > Sorry, I don't follow.  File names in Lisp are not encoded in any
> > way.  You only need to encode them when you pass them to commands
> > executed on the remote host, and decode the results that are output by
> > those remote commands.
> 
> Maybe there's a misunderstanding here. But you gave an example with a
> file name with japanese codings.

They were not encoded file names.  They were file names with Japanese
characters, but in the "usual" internal representation used by Emacs
for buffers and strings.  No encoding is involved.

> >> There might be a chance to switch to en_US.UTF-8 on the remote side. But
> >> even here I would propose to start with the unibyte subset. "en_US",
> >> because Tramp parses the output of commands, which must not be
> >> localized.
> >
> > Why "must not be localized"?
> 
> Tramp does not understand German messages, for example. "de_DE.UTF-8"
> would be a no-go. That's why Tramp sets the remote locale to English
> messages.

You can force English for messages, but still have file names be in
UTF-8, no?
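
A rough sketch of what I mean, under the assumption that the remote
host has an en_US.UTF-8 locale installed and a POSIX-like "env"
command.  The function name is hypothetical, not something Tramp
provides.  LC_MESSAGES controls the language of messages, LC_CTYPE
controls character handling, and LC_ALL would override both, so it is
set to the empty string here.

```elisp
;; Hypothetical helper, not part of Tramp.  Assumes the remote host
;; provides the en_US.UTF-8 locale and a POSIX-like `env'.
(defun my-remote-command-with-locale (command)
  "Prefix COMMAND so messages are untranslated but characters are UTF-8."
  ;; An empty LC_ALL is ignored, so the two specific categories apply.
  (concat "env LC_ALL= LC_MESSAGES=C LC_CTYPE=en_US.UTF-8 " command))

;; Usage:
;; (my-remote-command-with-locale "ls -la")
```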

> >> Other encodings but UTF-8 will be hard to support. It is not only that
> >> Tramp calls "native" file name primitives, there are also several
> >> parsing routines for commands on the remote side, which have their
> >> expectations on file name syntax and their encodings.
> >
> > I'm afraid I don't follow here, either.  Emacs is well equipped to
> > do code conversions from and to almost any encoding out there.  The
> > only problem is to know which encoding to use when communicating with
> > the commands on the remote host.  What am I missing?
> 
> Maybe one could teach Tramp to convert file names in whatever coding to
> UTF-8.

All you need is to call decode-coding-string.
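
Something along these lines, as a sketch only.  The function names are
hypothetical, and the choice of euc-jp in the usage example is an
assumption; finding out which coding system the remote host actually
uses is the real problem.

```elisp
;; Hypothetical helpers, not part of Tramp.

(defun my-decode-remote-file-name (raw-bytes coding-system)
  "Decode RAW-BYTES, as output by a remote command, using CODING-SYSTEM.
The result is a file name in Emacs's internal representation."
  (decode-coding-string raw-bytes coding-system))

(defun my-encode-remote-file-name (file-name coding-system)
  "Encode FILE-NAME with CODING-SYSTEM before passing it to a remote command."
  (encode-coding-string file-name coding-system))

;; Usage, assuming the remote host uses EUC-JP for file names:
;; (my-decode-remote-file-name
;;  (my-encode-remote-file-name "日本語.txt" 'euc-jp)
;;  'euc-jp)
```

Inside Emacs the file name stays in the internal representation; only
the boundary with the remote commands needs these two conversions.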

> But shall we do it? And how would that work with other Emacs
> flavors? Yes, I must keep XEmacs in mind.

I'd be surprised if XEmacs didn't support decode-coding-string or
UTF-8.


