bug#44486: 27.1; C-@ chars corrupt elisp buffer

From: Eli Zaretskii
Subject: bug#44486: 27.1; C-@ chars corrupt elisp buffer
Date: Sun, 15 Nov 2020 17:08:17 +0200

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: larsi@gnus.org,  thievol@posteo.net,  handa@gnu.org,
>   schwab@linux-m68k.org,  44486@debbugs.gnu.org
> Date: Sat, 14 Nov 2020 17:53:57 -0500
> >> If `utf-8` is preferable over `prefer-utf-8` for this usage I think
> >> the problem is in `prefer-utf-8` since it was introduced
> >> specifically for that.
> > The implementation doesn't support your POV.
> Then I think the implementation is in error.

But that ship sailed 7 years ago.

> > We are not talking about .el files, we are talking about _any_ file
> > read using prefer-utf-8.
> `prefer-utf-8` was not introduced because it seemed like a good idea and
> then we hoped someone would find it useful.  It was introduced to solve
> a concrete need, which is that of `.el` files.  It's quite possible that
> there are other situations that have the same needs as `.el` files, but
> from where I stand it looks like "the needs of .el files (and similar
> cases)" should determine the intended behavior of `prefer-utf-8` rather
> than its current implementation.
> > For .el files, we can always bind inhibit-null-byte-detection to t
> > when we load or visit such files.
> We could, but I'm having trouble imagining a situation where we'd want
> to use `prefer-utf-8` and not inhibit "NUL means binary".
> The "NUL means binary" heuristic fundamentally says that `binary` is the
> first coding system we try and only if this one fails (for lack of NUL
> bytes) we consider others.  But for `prefer-utf-8` we should first
> consider utf-8 and only if this fails should we consider others
> (potentially including `binary` if you want, my opinion is not as strong
> there).
> > I'm not talking about .el files.  The coding-system's applicability is
> > wider than that.
> Could be.  But it's its "raison d'être" (and AFAIK currently still the
> sole application), so it should handle this case as best it can.

We should have been having this discussion 7 years ago.  And guess
what? We did.  In that discussion, you said, in response to a question
from Kenichi:

   > * What to do with null byte detection.  Previously, if a
   >   *.el file contains a null byte and
   >   inhibit-null-byte-detection is nil (the default), it's
   >   detected as a binary file.  Now utf-8 is forced regardless
   >   of inhibit-null-byte-detection.

   I like the utf-8 better, but I don't know of any concrete case where it
   makes a significant difference, so either way is OK.

Note that what actually got implemented ignored
inhibit-null-byte-detection altogether, and _always_ considered the
file binary if any null byte was found.  My change, which prompted
this present discussion, made prefer-utf-8 heed the variable's value,
which is mid-way between what we had for 7 years and what you thought
we should have.  So, a small step forward ;-)
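
To make the three behaviours easier to compare, here is a toy model in
Python.  It is only a sketch of the decision order discussed above, not
Emacs code: the function names, the "other" fallback label, and the use
of Python's UTF-8 decoder as the validity check are all assumptions made
for illustration.

```python
def decodes_as_utf8(data: bytes) -> bool:
    """Return True if DATA is valid UTF-8 (a NUL byte is valid UTF-8)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def detect_old(data: bytes, inhibit_nul_detection: bool = False) -> str:
    """Hypothetical model of the old behaviour: any NUL byte means
    binary, and the inhibit flag is ignored altogether."""
    if b"\x00" in data:
        return "binary"
    return "utf-8" if decodes_as_utf8(data) else "other"


def detect_heeding_flag(data: bytes, inhibit_nul_detection: bool = False) -> str:
    """Hypothetical model of the changed behaviour: a NUL byte means
    binary only when the inhibit flag is off."""
    if b"\x00" in data and not inhibit_nul_detection:
        return "binary"
    return "utf-8" if decodes_as_utf8(data) else "other"


def detect_utf8_first(data: bytes, inhibit_nul_detection: bool = False) -> str:
    """Hypothetical model of the 'utf-8 first' proposal: try UTF-8
    before anything else, and consider binary only if that fails."""
    if decodes_as_utf8(data):
        return "utf-8"
    if b"\x00" in data and not inhibit_nul_detection:
        return "binary"
    return "other"


# A made-up .el-like payload containing one NUL byte.
sample = b"(setq x \"\x00\")"
for detect in (detect_old, detect_heeding_flag, detect_utf8_first):
    print(detect.__name__,
          detect(sample, inhibit_nul_detection=False),
          detect(sample, inhibit_nul_detection=True))
```

In this model the middle function differs from the old one only when the
flag is set, and from the "utf-8 first" one only when it is not, which
is the sense in which the change sits mid-way between the two.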

