bug#36431: Crash in marker.c:337

From: Stefan Monnier
Subject: bug#36431: Crash in marker.c:337
Date: Tue, 02 Jul 2019 13:04:38 -0400
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/27.0.50 (gnu/linux)

>> I don't really know how to reproduce your bug, but I think I have an
>> idea of what might be going on.
>> Can you try the patch below, to see if it fixes your problem?
> AFAICT, this patch moves the call to move_gap_both from a fragment
> where we must decode the inserted text to a fragment where such a
> decoding might not be necessary.  If I'm right, then this makes
> insert-file-contents slower in some cases, because moving the gap
> might be very expensive with large buffers.

Indeed.  It also removed the move_gap_both from the case where we need
to decode and we already know the coding-system to use.  So in some
cases it made it faster (these are the cases where it misbehaved).
The new version of the code shouldn't suffer from this performance
problem (it still calls move_gap_both in the set-auto-coding part of
the code, but this call should have a cost proportional to the amount
of buffer modification performed by set-auto-coding, i.e. it should be
a no-op in pretty much all cases).

Looking at this aspect (i.e. not directly related to this bug) I'm
wondering why the code works this way:

We start by inserting the new bytes at the *beginning* of the gap, but
when we do the move_gap_both this moves those bytes to the *end* of the
gap (where decode_coding_gap expects them, apparently), so when we
decode we always have to move all the inserted bytes, right?
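To make the cost concrete, here is a toy gap-buffer model (a sketch only, not the Emacs implementation; the class `GapBuffer` and its methods are hypothetical names invented for illustration).  It assumes the bytes just read are first counted as text in front of the gap, and that moving the gap back to the insertion point then copies every one of them to the far side of the gap:

```python
class GapBuffer:
    """Toy model: [text before gap][gap][text after gap] in one bytearray."""

    def __init__(self, before: bytes, after: bytes, gap_size: int):
        self.data = bytearray(before) + bytearray(gap_size) + bytearray(after)
        self.gap_start = len(before)
        self.gap_end = len(before) + gap_size
        self.bytes_moved = 0  # total bytes copied by move_gap

    def read_into_gap(self, raw: bytes) -> None:
        """Store raw at the beginning of the gap and count it as text."""
        n = len(raw)
        assert n <= self.gap_end - self.gap_start, "gap too small"
        self.data[self.gap_start:self.gap_start + n] = raw
        self.gap_start += n

    def move_gap(self, pos: int) -> None:
        """Move the gap so that it starts at byte position pos."""
        n = 0
        if pos < self.gap_start:
            # Text in [pos, gap_start) is copied to just before gap_end.
            n = self.gap_start - pos
            self.data[self.gap_end - n:self.gap_end] = self.data[pos:self.gap_start]
            self.gap_start -= n
            self.gap_end -= n
        elif pos > self.gap_start:
            # Text just after the gap is copied to the start of the gap.
            n = pos - self.gap_start
            self.data[self.gap_start:self.gap_start + n] = \
                self.data[self.gap_end:self.gap_end + n]
            self.gap_start += n
            self.gap_end += n
        self.bytes_moved += n

    def text(self) -> bytes:
        return bytes(self.data[:self.gap_start] + self.data[self.gap_end:])


buf = GapBuffer(b"abc", b"xyz", gap_size=16)
point = buf.gap_start          # insertion point, at the start of the gap
buf.read_into_gap(b"HELLO")    # new bytes land at the beginning of the gap
buf.move_gap(point)            # gap moves back; the new bytes end up after it
```

In this model the buffer text is unchanged by the move, but `bytes_moved` equals the number of inserted bytes, which is the "we always have to move all the inserted bytes" cost in question.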

> More generally, I'd be leery to make significant changes in
> insert-file-contents just to placate that single assertion.  What do
> we gain with that assertion except some theoretical correctness?

Again, I'm just trying to understand the code at this point.
The part that worries me is the following:

In the current code, we read the raw bytes to the beginning of the gap,
then (when Vset_auto_coding_function needs to be called), we (virtually)
move them into the current buffer, which is usually multibyte.
AFAICT at this point the buffer is in a transiently inconsistent state:
it's multibyte, but it can contain arbitrary byte sequences, hence
invalid ones.  Before calling Vset_auto_coding_function we make this
buffer unibyte, which brings us back to a consistent state.  But I
wonder if/how/why making the buffer unibyte and then multibyte again
always preserves the original byte sequence: AFAICT
set-buffer-multibyte always makes the effort to bring the buffer to
a consistent state.  So if the state is inconsistent before the pair of
calls to set-buffer-multibyte, then either we changed the byte sequence
or set-buffer-multibyte doesn't always result in a consistent state.
What am I missing?
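The dilemma can be illustrated with an analogy in Python (this says nothing about what set-buffer-multibyte actually does; it uses UTF-8 and Python's standard error handlers purely as a stand-in).  A conversion that forces arbitrary bytes into valid text cannot give the original bytes back, while one that does give them back has to carry something that is not ordinary valid text:

```python
raw = b"abc\xff\xe9"  # arbitrary bytes, not valid UTF-8

# Strict decoding refuses: the byte sequence is not a consistent text.
try:
    raw.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# "Make it consistent": invalid bytes become U+FFFD, so the result is
# valid text, but converting back no longer yields the original bytes.
repaired = raw.decode("utf-8", errors="replace")
repaired_bytes = repaired.encode("utf-8")

# "Preserve the bytes": invalid bytes are smuggled through as lone
# surrogates, so the round trip is exact, but the intermediate string
# cannot be encoded strictly as UTF-8.
escaped = raw.decode("utf-8", errors="surrogateescape")
escaped_bytes = escaped.encode("utf-8", errors="surrogateescape")

try:
    escaped.encode("utf-8")
    escaped_is_strictly_valid = True
except UnicodeEncodeError:
    escaped_is_strictly_valid = False
```

Here `repaired_bytes` differs from `raw`, while `escaped_bytes` equals it only because the intermediate form keeps a special representation for the invalid bytes.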

