--- Begin Message ---
Subject: |
25.0.50; [PATCH] byte-to-position has internal off-by-one bug |
Date: |
Wed, 10 Jun 2015 17:13:30 +0200 |
User-agent: |
Gnus/5.130014 (Ma Gnus v0.14) Emacs/25.0.50 (berkeley-unix) |
Here's a test case for the bug:
(with-temp-buffer
(insert "éé")
(let ((i 1) pos res)
(while (setq pos (byte-to-position i))
(push pos res)
(setq i (1+ i)))
(nreverse res)))
=> (1 2 2 2 3)
while the correct result is
=> (1 1 2 2 3)
I found that this bug had been noticed before in
http://stackoverflow.com/questions/17588117/emacs-byte-to-position-function-is-not-consistent-with-document
Here's a patch. The fix may look a bit clumsy but it's actually meant
to avoid pessimizing the presumably common case where the initial
bytepos is at a character boundary.
-- >8 --
Subject: [PATCH] * src/marker.c (buf_bytepos_to_charpos): Fix best_below_byte
count.
If bytepos is not after a character boundary the preceding loop
overshoots by one character position.
---
src/marker.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/src/marker.c b/src/marker.c
index 73928ba..94d676b 100644
--- a/src/marker.c
+++ b/src/marker.c
@@ -341,6 +341,12 @@ buf_bytepos_to_charpos (struct buffer *b, ptrdiff_t
bytepos)
BUF_INC_POS (b, best_below_byte);
}
+ if (best_below_byte != bytepos)
+ {
+ best_below--;
+ BUF_DEC_POS (b, best_below_byte);
+ }
+
/* If this position is quite far from the nearest known position,
cache the correspondence by creating a marker here.
It will last until the next GC.
--
2.4.2
--- End Message ---
--- Begin Message ---
Subject: |
Re: bug#20783: 25.0.50; [PATCH] byte-to-position has internal off-by-one bug |
Date: |
Wed, 17 Jun 2015 14:19:10 +0200 |
User-agent: |
Gnus/5.130014 (Ma Gnus v0.14) Emacs/25.0.50 (berkeley-unix) |
Version: 25.1
On Thu, Jun 11 2015, Eli Zaretskii wrote:
> Works for me, thanks. But please add a comment there about
> BYTE_TO_CHAR expecting byte positions that are on a character
> boundary, so that the reason for the loop is clear.
Done and pushed.
However, I didn't follow my own suggestion of adding a remark to the
comment above BYTE_TO_CHAR, after all, as it is true for most macros or
functions with a byte position argument, so adding such a comment to
just one of them could be confusing, I think.
Thank you for steering this change in the right direction.
--- End Message ---