Re: [PATCH] Allow inserting non-BMP characters
From: Eli Zaretskii
Subject: Re: [PATCH] Allow inserting non-BMP characters
Date: Tue, 26 Dec 2017 18:11:18 +0200
> From: Philipp Stephani <address@hidden>
> Date: Tue, 26 Dec 2017 10:35:42 +0000
> Cc: address@hidden, address@hidden
>
> Suggest to move surrogates_to_codepoint to coding.c, and then use the
> macros UTF_16_HIGH_SURROGATE_P and UTF_16_LOW_SURROGATE_P defined
> there.
>
> Hmm, I'd rather go the other way round and remove these macros later.
> They are macros, thus worse than functions,
I don't think we have a policy to prefer inline functions to macros,
and I don't think we should have such a policy. We use inline
functions when that's necessary, but we don't in general prefer them.
They have their own problems, see the comments in lisp.h for some of
that.
> and don't seem to be correct either (what about a value such as 0x11DC00?).
??? They are correct for UTF-16 sequences, which are 16-bit numbers.
If you need to augment them by testing the high-order bits to be zero
in your case, that's okay, but I don't see any need for introducing
similar but different functionality.
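
For illustration, here is a minimal sketch of the point about augmenting
the tests. It assumes mask-style predicates meant for 16-bit code units;
the DEMO_* macros and demo_* functions are hypothetical names made up
for this example, not the actual definitions in coding.c. A wider value
such as 0x11DC00 passes the plain mask test, and the extra check that
the high-order bits are zero rejects it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical mask-style tests, intended for 16-bit UTF-16 code units.
   They look only at bits 10..15 of the value.  */
#define DEMO_HIGH_SURROGATE_P(val) (((val) & 0xFC00) == 0xD800)
#define DEMO_LOW_SURROGATE_P(val)  (((val) & 0xFC00) == 0xDC00)

/* Variants for callers that hold wider integers: reuse the mask tests,
   but additionally require every bit above the low 16 to be zero.  */
static bool
demo_wide_high_surrogate_p (uint32_t c)
{
  return (c >> 16) == 0 && DEMO_HIGH_SURROGATE_P (c);
}

static bool
demo_wide_low_surrogate_p (uint32_t c)
{
  return (c >> 16) == 0 && DEMO_LOW_SURROGATE_P (c);
}
```

With these definitions, DEMO_LOW_SURROGATE_P (0x11DC00) is true, while
demo_wide_low_surrogate_p (0x11DC00) is false.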
> No new macros please if we can avoid it. Functions are strictly better.
Sorry, I disagree. Each has its advantages, and on balance I find
macros to be slightly better, certainly not worse. There's no need to
avoid them in C.
> I don't care much whether they are in character.h or coding.h, but
> char_surrogate_p is already in character.h.
char_surrogate_p should have used the coding.h macros as well.
> > + USE_SAFE_ALLOCA;
> > + unichar *utf16_buffer;
> > + SAFE_NALLOCA (utf16_buffer, 1, len);
>
> Maximum length of a UTF-16 sequence is known in advance, so why do you
> need SAFE_NALLOCA here? Couldn't you use a buffer of fixed length
> instead?
>
> The text being inserted can be arbitrarily long. Even single characters
> (i.e. extended grapheme clusters) can be arbitrarily long.
Yes, but why do you first copy the input into a separate buffer? Why
not convert each UTF-16 sequence separately, as you go through the
loop?
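
As a rough sketch of converting each UTF-16 sequence as the loop goes,
the following decodes one code point per step directly from the input,
so no second buffer is needed. The names demo_next_codepoint and
demo_count_codepoints are hypothetical, the input is assumed to be an
array of 16-bit code units, and unpaired surrogates are simply returned
unchanged; the patch under review may handle these details differently.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode the code point starting at UNITS[*POS] and advance *POS past
   it.  The caller must ensure *POS < LEN.  A high surrogate followed by
   a low surrogate is combined into one code point; anything else,
   including an unpaired surrogate, is returned as a single unit.  */
static uint32_t
demo_next_codepoint (const uint16_t *units, size_t len, size_t *pos)
{
  uint32_t c = units[(*pos)++];
  if ((c & 0xFC00) == 0xD800
      && *pos < len
      && (units[*pos] & 0xFC00) == 0xDC00)
    {
      uint32_t low = units[(*pos)++];
      c = 0x10000 + ((c - 0xD800) << 10) + (low - 0xDC00);
    }
  return c;
}

/* Example loop: walk the input once, one code point at a time.  A real
   caller would insert each code point here instead of counting.  */
static size_t
demo_count_codepoints (const uint16_t *units, size_t len)
{
  size_t pos = 0, count = 0;
  while (pos < len)
    {
      (void) demo_next_codepoint (units, len, &pos);
      count++;
    }
  return count;
}
```

Each step needs at most two code units of lookahead, which is why a
fixed amount of state suffices regardless of the length of the text.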