[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

bug#44486: 27.1; C-@ chars corrupt elisp buffer

From: Stefan Monnier
Subject: bug#44486: 27.1; C-@ chars corrupt elisp buffer
Date: Sat, 14 Nov 2020 17:53:57 -0500
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/28.0.50 (gnu/linux)

>> If `utf-8` is preferable over `prefer-utf-8` for this usage I think
>> the problem is in `prefer-utf-8` since it was introduced
>> specifically for that.
> The implementation doesn't support your POV.

Then I think the implementation is in error.

>> >> I believe if there's a NUL byte in such a files but it otherwise doesn't
>> >> contain any invalid UTF-8 byte sequence, it will result in better
>> >> behavior if we treat it as UFT-8 than as binary.
>> > We treat null bytes as the _single_ telltale sign of a binary file.
>> A .el file should *never* be a binary file.
> We are not talking about .el files, we are talking about _any_ file
> read using prefer-utf-8.

`prefer-utf-8` was not introduced because it seemed like a good idea and
then we hoped someone would find it useful.  It was introduced to solve
a concrete need, which is that of `.el` files.  It's quite possible that
there are other situations that have the same needs as `.el` files, but
from where I stand it looks like "the needs of .el files (and similar
cases)" should determine the intended behavior of `prefer-utf-8` rather
than its current implementation.

> For .el files, we can always bind inhibit-null-byte-detection to t
> when we load or visit such files.

We could, but I'm having trouble imagining a situation where we'd want
to use `prefer-utf-8` and not inhibit "NUL means binary".

The "NUL mean binarys" heuristic fundamentally says that `binary` is the
first coding system we try and only if this one fails (for lack of NUL
bytes) we consider others.  But for `prefer-utf-8` we should first
consider utf-8 and only if this fails should we consider others
(potentially including `binary` if you want, my opinion is not as strong

> I'm not talking about .el files.  The coding-system's applicability is
> wider than that.

Could be.  But it's its "raison d'ĂȘtre" (and AFAIK currently still the
sole application), so it should handle this case as best it can.


reply via email to

[Prev in Thread] Current Thread [Next in Thread]