emacs-orgmode

Re: test-org-table/sort-lines: Failing test on macOS


From: Max Nikulin
Subject: Re: test-org-table/sort-lines: Failing test on macOS
Date: Wed, 23 Nov 2022 22:27:35 +0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.4.2

On 23/11/2022 17:37, Ihor Radchenko wrote:
> Max Nikulin writes:
>
>> Strings to sort are passed either through `identity' or
>> through `downcase'.
>
> Thanks for the pointer.
> Now, I am getting more confused though.
> Do we even need to use `string-collate-lessp' then?

I think we do, because the sort result is presented to humans.

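(setq lst (list "semana" "señor" "sepia"))
;; `sort' is destructive, so each call gets its own copy of the list.
;; The second result assumes a locale with working collation.
(sort (copy-sequence lst) #'string-lessp)         ; => ("semana" "sepia" "señor")
(sort (copy-sequence lst) #'string-collate-lessp) ; => ("semana" "señor" "sepia")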

> Eli even argued that `string-collate-lessp' is strictly worse compared
> to more predictable approach. See
> https://debbugs.gnu.org/cgi/bugreport.cgi?bug=59275#40

In this particular case Eli may assume that e.g. the list is an elisp structure, not a kind of text formatting. In general, I am quite pessimistic concerning the quality of locale support in Emacs, while Eli may have a rather different point of view.

> Do you remember any cases when users actually demanded locale-specific
> sorting?

I think users face poor locale support in various applications so often that they are not surprised when they see incorrect results. In some sense such results are consistent (erroneous in the same way).

Formatting of numbers in Emacs is the extreme case of consistency. For the sake of reliable reading/writing of numbers from/to files or the network, it is impossible to present a number according to the current locale. An exception is en_US, which has some dedicated code in calc.

I believe it is silly to adhere to a common denominator and not use `string-collate-lessp' just because it is unavailable in some environments.

> However, I feel a bit lost about what to do on Org side.
> We can put a disclaimer in the manual and all that, but it still feels
> too complex.

My current suggestion is to provide a fallback to `downcase' in the code and to explain in the manual that runtime environments (OSes) are not equal and that the quality of locale support varies. Emacs heavily depends on libc in this area.
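
A minimal sketch of such a fallback, to make the idea concrete. The names `my/collate-ignore-case-works' and `my/string-lessp-ignore-case' are hypothetical and not part of Org. The probe is an assumption as well: it relies on `string-collate-lessp' behaving like plain `string-lessp' (so "a" sorts after "B") where locale-aware collation is not available.

```elisp
;; Hypothetical names for illustration only; nothing here exists in Org.

(defconst my/collate-ignore-case-works
  (and (fboundp 'string-collate-lessp)
       ;; Probe (an assumption): with IGNORE-CASE non-nil, "a" must sort
       ;; before "B".  Plain codepoint comparison puts "B" first.
       (string-collate-lessp "a" "B" nil t))
  "Non-nil if case-insensitive collation appears to work in this Emacs.")

(defun my/string-lessp-ignore-case (s1 s2)
  "Return non-nil if S1 sorts before S2, ignoring case.
Use locale-aware collation when the probe succeeded, otherwise
fall back to comparing downcased strings by codepoint."
  (if my/collate-ignore-case-works
      (string-collate-lessp s1 s2 nil t)
    (string-lessp (downcase s1) (downcase s2))))

;; Usage: `sort' is destructive, so pass a fresh list.
(sort (list "b" "A" "c") #'my/string-lessp-ignore-case)
```

The probe is evaluated once when the file is loaded, so the cost per comparison is a single variable check. In the fallback branch, non-ASCII letters are still ordered by codepoint, so the result is only as good as the environment allows.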

>> However I am afraid of compatibility shims after
>>
>> d3a9c424b 2022-08-16 17:15:27 +0800 Ihor Radchenko: org-encode-time:
>> Refactor into top-level `defmacro'
>
> What do you refer to?

The implementation must be chosen at compile (or load) time. Due to some issues with native compilation, that does not work. For string comparison, the runtime performance penalty may be higher than for timestamp processing.

> The question is what can be done and, more importantly, how much effort
> will it take to implement and maintain an alternative.

The effort is significant; however, browsers, for example, have their own implementations of Unicode-related stuff. There is the ICU library, but Eli is against it because Emacs already has a partial implementation of Unicode, and using ICU would mean duplicating the character database.
