bug-gnu-emacs
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

bug#31837: 26.1; replace-buffer-contents doesn't work if buffer has mult


From: Milan Stanojević
Subject: bug#31837: 26.1; replace-buffer-contents doesn't work if buffer has multibyte characters
Date: Mon, 18 Jun 2018 13:44:08 -0400

I haven't had a chance to read this email or try your patch until today.
I just tried it and it seems to work.

Thanks for the quick fix, Eli!

Not sure that [enable_multibyte_characters] check buys us much in
practice. At least in my case, it is always true. But what matters is
whether there are actual multibyte characters in the buffer, and
buf_charpos_to_bytepos already shortcircuits if there are none.

In any case, it probably doesn't matter one way or the other in
practice, but we'll keep a look on any complaints about performance.
We are starting to migrate our code base to use automatic code
formatting and we use this for applying the formatting changes so we
should have a lot of devs using this soon.

On Fri, Jun 15, 2018 at 4:34 AM, Eli Zaretskii <eliz@gnu.org> wrote:
>> From: mstanojevic@janestreet.com (Milan Stanojević)
>> Date: Thu, 14 Jun 2018 17:34:27 -0400
>>
>> Here is a small recipe that illustrates the bug.
>>
>> recipe.el
>> ---------
>> (setq use-multibyte (< 0 (length argv)))
>>
>> (switch-to-buffer "file1")
>> (when use-multibyte (insert-char (char-from-name "SMILE")))
>> (insert "1234")
>>
>> (switch-to-buffer "file2")
>> (when use-multibyte (insert-char (char-from-name "SMILE")))
>> (insert "5678")
>>
>> (replace-buffer-contents "file1")
>>
>> (princ (buffer-substring-no-properties (point-min) (point-max)))
>> (princ "\n")
>> -------------
>>
>> Running the recipe as script
>>
>> $ emacs -Q --script /tmp/recipe.el
>> 1234
>>
>> $ emacs -Q --script /tmp/recipe.el multibyte
>> ⌣5234
>>
>> In the first run, with just ascii characters, everything works as
>>  expected.
>>
>> In the second run, with multibyte characters, the function didn't
>>  replace '5' with '1' as expected.
>
> Right, thanks for catching this blunder.
>
>> I looked at the code and it looks to me like there is a very obvious bug
>>  in function buffer_chars_equal in editfns.c. It calls
>>  BUF_FETCH_CHAR_AS_MULTIBYTE passing *character* positions, but the
>>  macro expects *byte* positions. (it would be nice if these char vs byte
>>  positions could be distinguished with types, but I'm not sure it is
>>  possible in C).
>>
>> The simple fix is to replace BUF_FETCH_CHAR_AS_MULTIBYTE with
>>  STRING_CHAR (BUF_CHAR_ADDRESS (buf, pos)) and this seems to work.
>
> Thanks, I installed a slightly different fix on the emacs-26 branch,
> perhaps you could try that.
>
>> I'm not sure about performance of the above fix, though, because
>>  accessing random character position in a buffer is not constant. If
>>  diffing algorithm is accessing buffer positions in more or less
>>  localized manner, maybe it makes sense to move the point inside
>>  buffer_chars_equal so the char position to byte position conversion is
>>  fast. It probably doesn't matter for small files.
>
> The conversion of character positions to byte positions should be
> quite fast.  It got even faster on the master branch, so I think we
> are good.  If there are reports about slow operation of this function,
> we could in the future try to optimize it more.





reply via email to

[Prev in Thread] Current Thread [Next in Thread]