Re: mbcel module for Gnulib?, incomplete multibyte sequences

From: Paul Eggert
Subject: Re: mbcel module for Gnulib?, incomplete multibyte sequences
Date: Mon, 7 Aug 2023 00:32:00 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.13.0

On 2023-08-04 16:05, Bruno Haible wrote:

> To me, the columns timings of mbiterf and mbuiterf are good enough.

Not to me. Perhaps I'm used to apps like grep and diff where we try to get as much performance as we can (without going off the deep end of course).

There are tradeoffs here: mbcel wins on simplicity and performance, whereas mbiter wins on generality. Since the generality gains (namely, support for encodings that diff doesn't need) are small for diff, there is space for something like mbcel.

> Emacs is a complex beast. I can understand if the Emacs developers want
> an implementationally-simple behaviour, rather than a simple-from-the-
> user-perspective behaviour.

I don't agree that the MEE approach is necessarily simpler from the user's perspective. Although it may be simpler for some apps, it's more complicated for others, and it's not surprising that Emacs, grep, diff, etc. take the SEE approach. I expect that Gnulib should support SEE for apps that prefer it. I'll try to squeeze free some time to think about how to do that.

For MEE, mbiterf would need something like the attached untested patch,
and mbiter, mbcel, etc. would all need similar patches.

Good point.

The attached patch implements that. Look good to you?

> (Although maybe you may want to align the module name to be similar
> to mbiterf and mbuiterf : maybe mbitervf and mbuitervf for "very fast"?)

I'll think about naming. I hope for something a bit easier to spell/remember than "mbuitervf". (To be honest I'm not sold on the existence of mbiterf and mbuiterf, as they're slower than mbcel even if mbcel is changed to use MEE.)

More important to my mind is how apps choose between SEE and MEE. In some sense, the choice between SEE and MEE is orthogonal to the choice between mbcel and mbiter, as it'd be easy to modify mbcel to optionally support MEE and also easy to modify mbiter to optionally support SEE.
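
As a rough illustration of that orthogonality, here is a minimal sketch in which the error policy is just a parameter of the scanner. It makes several assumptions that do not come from this thread's patches: it takes SEE to mean that each offending byte is reported as its own one-byte encoding error, and MEE to mean that an invalid or incomplete sequence prefix is reported as one multi-byte error; it hard-codes a simplified UTF-8 scanner instead of calling mbrtoc32; and the names err_policy, scan_one and count_units are hypothetical, not Gnulib API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical policy switch; not part of mbcel or mbiter.  */
enum err_policy { POLICY_SEE, POLICY_MEE };

/* One scanned unit: its length in bytes, and whether it is an
   encoding error rather than a character.  */
struct cel { size_t len; bool err; };

/* Return the sequence length announced by UTF-8 lead byte B,
   or 0 if B cannot start a sequence.  */
static size_t
utf8_expected_len (unsigned char b)
{
  if (b < 0x80) return 1;
  if (0xC2 <= b && b <= 0xDF) return 2;
  if (0xE0 <= b && b <= 0xEF) return 3;
  if (0xF0 <= b && b <= 0xF4) return 4;
  return 0;
}

/* Scan one character or one encoding error from P[0..N-1], N > 0.
   Simplification: only lead and continuation bytes are checked, so
   overlong forms and surrogates are not rejected.  */
static struct cel
scan_one (const unsigned char *p, size_t n, enum err_policy policy)
{
  assert (n > 0);
  size_t want = utf8_expected_len (p[0]);
  if (want == 0)
    return (struct cel) { 1, true };

  /* Count how many bytes of the announced sequence are present.  */
  size_t have = 1;
  while (have < want && have < n && (p[have] & 0xC0) == 0x80)
    have++;

  if (have == want)
    return (struct cel) { want, false };

  /* Invalid or incomplete: MEE reports the whole prefix as one
     error, SEE reports only the first byte.  */
  return (struct cel) { policy == POLICY_MEE ? have : 1, true };
}

/* Count the units in S[0..N-1]; store the number of errors in *NERRS.  */
static size_t
count_units (const char *s, size_t n, enum err_policy policy,
             size_t *nerrs)
{
  const unsigned char *p = (const unsigned char *) s;
  size_t units = 0;
  *nerrs = 0;
  for (size_t i = 0; i < n; )
    {
      struct cel c = scan_one (p + i, n - i, policy);
      units++;
      *nerrs += c.err;
      i += c.len;
    }
  return units;
}
```

Under these assumptions, the four bytes 'a', 0xE2, 0x82, 'z' (a truncated three-byte sequence between two ASCII characters) scan as four units with two errors under POLICY_SEE, and as three units with one error under POLICY_MEE. The iteration loop is identical in both cases; only the length reported for an error changes.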

Attachment: 0001-mbiter-return-encoding-error-prefix-lengths.patch
Description: Text Data
