Re: problem with editing/decoding utf-8 text

From: Stefan Monnier
Subject: Re: problem with editing/decoding utf-8 text
Date: 23 May 2003 17:20:24 -0400
User-agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.3.50

> Now, no matter what I choose (raw-text, no-conversion, utf-8), it
> modifies all of the utf8 chars which are not fit into the ascii charset.
> It seems, that it inserts a \201 before every char which is not in the
> ascii charset. I.e. if I just load and save a file, emacs does not
> behave transparently.

Do you also get the \201 if you choose `utf-8'?
If so, it's definitely a bug.

> 0. What is this \201 byte?

An internal thing that you shouldn't see unless you ask to see it.
Using `raw-text' or `no-conversion' arguably counts as "asking to
see it", but utf-8 definitely doesn't, so if you see it with utf-8, it's
a bug.

> 1. Can't I tell a buffer (after loading a file) to interpret it
> as binary, and to save exactly the same bytes that were read into the
> buffer (i.e. a transparent buffer)?

If you save with the same coding-system as when you loaded, yes.
In your case, you loaded with a latin-1 coding-system and then saved with
another, so obviously Emacs had to do some conversion work and you don't
get the same sequence of bytes.
Of course, the fact that Emacs happily visited the file in latin-1 but then
refused to save it in latin-1 is a bug.  I vaguely remember that
such a bug has been fixed in Emacs-CVS, but it would be great if you could
either check that or report a precise test case.
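
To illustrate why the bytes change, here is a minimal sketch in Python
rather than Emacs Lisp.  It only models the general decode/encode round
trip, not Emacs's internal representation, and the helper name
`round_trip' is made up for this example.  Loading UTF-8 bytes as latin-1
and saving them as latin-1 gives back the original bytes; saving them with
a different coding-system forces a conversion.

```python
def round_trip(data: bytes, load_coding: str, save_coding: str) -> bytes:
    """Decode bytes with one coding system, then re-encode with another.

    Hypothetical helper: a stand-in for "visit a file, then save it".
    """
    return data.decode(load_coding).encode(save_coding)


# The UTF-8 encoding of a single non-ASCII character (e-acute).
original = "\u00e9".encode("utf-8")

# Same coding system on load and save: the bytes survive unchanged,
# even though latin-1 is the "wrong" interpretation of the content.
same = round_trip(original, "latin-1", "latin-1")

# Different coding system on save: each byte that was read as a latin-1
# character is re-encoded as UTF-8, so the byte sequence grows.
changed = round_trip(original, "latin-1", "utf-8")

print(same == original)
print(original, changed)
```

The point of the sketch is only that a transparent round trip depends on
using the same coding-system in both directions, not on choosing the
"right" one.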

> 2. What is the difference between raw-text, no-conversion, binary? On
> some places, I can choose any of them, on other places not... This whole
> coding system is a nightmare... :(((

Yes, it is, but it's not all Emacs's fault.  The only alternative would be for
Emacs to say "I only ever support 1 encoding".  The current code is
supposed to work just fine in this "single encoding" situation while also
allowing you to use other encodings if you want to.
Of course, bugs make this dream a bit less sweet.

