Re: Texinfo 7.1 released
From: Gavin Smith
Subject: Re: Texinfo 7.1 released
Date: Mon, 23 Oct 2023 19:52:49 +0100
On Mon, Oct 23, 2023 at 04:41:17PM +0300, Eli Zaretskii wrote:
> Bingo. This brings the time for producing the ELisp manual down to
> 15.4 sec, 5 sec faster than v7.0.3.
>
> I see that btowc linked into the XSParagraph module is a MinGW
> specific implementation, not from the Windows-standard MSVCRT (where
> it is absent). My conclusion is that the MinGW btowc is extremely
> inefficient.
Great. Hopefully this helps you be more productive when working on
documentation.
I propose the following, more finished patch, which applies
to Texinfo 7.1. We can also do something similar for the master branch.
I am making a release/7.1 branch for this fix, so it will be included
if there is ever a bug fix release 7.1.1. However, I am not going to
make a bug fix release with just this change in it.
https://git.savannah.gnu.org/cgit/texinfo.git/commit/?h=release/7.1&id=c76bcd0feed005aaf9db28a76f4883f3ae98295b
If we do move away from locale-based character processing in the paragraph
formatter, then we will not be using mbrtowc or btowc in the future.
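As a rough illustration of what locale-independent processing could look like, here is a minimal sketch of decoding one UTF-8 sequence by hand, with no call to mbrtowc or btowc. It is not part of the patch: the name decode_utf8 is hypothetical, and the return convention (bytes consumed, 0 at a null byte, (size_t) -1 on error) is only assumed here to loosely mirror mbrtowc.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not taken from Texinfo.  Decode one UTF-8
   sequence from S (at most N bytes) into *CP.  Return the number of
   bytes consumed, 0 for a null byte, or (size_t) -1 if the sequence
   is malformed or truncated.  This sketch does not reject overlong
   encodings or surrogate code points.  */
size_t
decode_utf8 (uint32_t *cp, const char *s, size_t n)
{
  const unsigned char *u = (const unsigned char *) s;
  size_t len, i;
  uint32_t c;

  if (n == 0)
    return (size_t) -1;

  if (u[0] < 0x80)
    {
      /* Single-byte (ASCII) case: no locale lookup needed.  */
      *cp = u[0];
      return u[0] ? 1 : 0;
    }
  else if ((u[0] & 0xE0) == 0xC0)
    {
      len = 2;
      c = u[0] & 0x1F;
    }
  else if ((u[0] & 0xF0) == 0xE0)
    {
      len = 3;
      c = u[0] & 0x0F;
    }
  else if ((u[0] & 0xF8) == 0xF0)
    {
      len = 4;
      c = u[0] & 0x07;
    }
  else
    return (size_t) -1;   /* Stray continuation or invalid lead byte.  */

  if (n < len)
    return (size_t) -1;   /* Truncated sequence.  */

  for (i = 1; i < len; i++)
    {
      if ((u[i] & 0xC0) != 0x80)
        return (size_t) -1;
      c = (c << 6) | (u[i] & 0x3F);
    }

  *cp = c;
  return len;
}
```

A decoder along these lines would behave the same on every platform, regardless of the current locale or of which C library supplies the wide-character functions.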
diff --git a/ChangeLog b/ChangeLog
index e619109f5b..c4379ec56b 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,10 @@
+2023-10-23 Gavin Smith <gavinsmith0123@gmail.com>
+
+ * tp/Texinfo/XS/xspara.c (get_utf8_codepoint):
+ Wrapper for mbrtowc/btowc.
+ [_WIN32]: Do not call btowc, as it was tested to be very slow
+ on MinGW. Report from Eli Zaretskii.
+
2023-10-18 Gavin Smith <gavinsmith0123@gmail.com>
Texinfo 7.1
diff --git a/tp/Texinfo/XS/xspara.c b/tp/Texinfo/XS/xspara.c
index 7c6895a7ff..e1cddcdc2a 100644
--- a/tp/Texinfo/XS/xspara.c
+++ b/tp/Texinfo/XS/xspara.c
@@ -684,6 +684,30 @@ xspara_end (void)
/* characters triggering an end of sentence */
#define end_sentence_characters ".?!"
+/* Wrapper for mbrtowc. Set *PWC and return length of codepoint in bytes. */
+size_t
+get_utf8_codepoint (wchar_t *pwc, const char *mbs, size_t n)
+{
+#ifdef _WIN32
+ /* Use the above implementation of mbrtowc. Do not use btowc, as it
+ does not exist as standard on MS-Windows, and was tested to be
+ very slow on MinGW. */
+ return mbrtowc (pwc, mbs, n, NULL);
+#else
+ if (!PRINTABLE_ASCII(*mbs))
+ {
+ return mbrtowc (pwc, mbs, n, NULL);
+ }
+ else
+ {
+ /* Functionally the same as mbrtowc but (tested) slightly quicker. */
+ *pwc = btowc (*mbs);
+ return 1;
+ }
+#endif
+}
+
+
/* Add WORD to paragraph in RESULT, not refilling WORD. If we go past the end
of the line start a new one. TRANSPARENT means that the letters in WORD
are ignored for the purpose of deciding whether a full stop ends a sentence
@@ -730,18 +754,7 @@ xspara__add_next (TEXT *result, char *word, int word_len, int transparent)
if (!strchr (end_sentence_characters
after_punctuation_characters, *p))
{
- if (!PRINTABLE_ASCII(*p))
- {
- wchar_t wc = L'\0';
- mbrtowc (&wc, p, len, NULL);
- state.last_letter = wc;
- break;
- }
- else
- {
- state.last_letter = btowc (*p);
- break;
- }
+ get_utf8_codepoint (&state.last_letter, p, len);
}
}
}
@@ -1013,16 +1026,7 @@ xspara_add_text (char *text, int len)
}
/************** Not a white space character. *****************/
- if (!PRINTABLE_ASCII(*p))
- {
- char_len = mbrtowc (&wc, p, len, NULL);
- }
- else
- {
- /* Functonally the same as mbrtowc but (tested) slightly quicker. */
- char_len = 1;
- wc = btowc (*p);
- }
+ char_len = get_utf8_codepoint (&wc, p, len);
if ((long) char_len == 0)
break; /* Null character. Shouldn't happen. */
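The wrapper relies on btowc and mbrtowc giving the same result for printable ASCII bytes. The following is a small sketch of how that equivalence could be checked for a single byte; it is not part of the patch. The function name ascii_paths_agree is hypothetical, and the range 0x20 to 0x7E mentioned below is an assumption about what PRINTABLE_ASCII covers.

```c
#include <string.h>
#include <wchar.h>

/* Hypothetical check, not taken from Texinfo.  Return 1 if converting
   the single byte C with mbrtowc consumes exactly one byte and yields
   the same wide character as btowc, 0 otherwise.  */
int
ascii_paths_agree (char c)
{
  wchar_t wc = L'\0';
  mbstate_t st;
  size_t r;

  /* Start from the initial conversion state.  */
  memset (&st, 0, sizeof st);
  r = mbrtowc (&wc, &c, 1, &st);

  return r == 1 && (wint_t) wc == btowc ((unsigned char) c);
}
```

Looping this over the bytes 0x20 to 0x7E in the locale of interest would confirm that the two branches of the wrapper are interchangeable there, leaving speed as the only difference.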