emacs-orgmode

Re: test-org-table/sort-lines: Failing test on macOS


From: Max Nikulin
Subject: Re: test-org-table/sort-lines: Failing test on macOS
Date: Tue, 22 Nov 2022 23:01:26 +0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.4.2

On 22/11/2022 08:14, Ihor Radchenko wrote:
Max Nikulin writes:

2. `org-sort-list'
5. `org-sort-entries'
`downcase' is used, not proper case folding, so a potential issue

`downcase' is used to determine user input about sorting type.
Not for sorting itself.

See the `case-func' variable. Its initialization depends on the IGNORE-CASE argument. Strings to sort are passed through either `identity' or `downcase'.

4. `org-set-tags' (tag order), when `org-tags-sort-function' is set to
     "Alphabetical" or "Reverse alphabetical".

IGNORE-CASE argument is not used, perhaps `downcase' is hidden in the code.

I feel like we are slightly miscommunicating here.
I mostly tried to list the uses of libc-sensitive sorting. Not
specifically cases when we try to ignore the case.

The problem is not limited to case-sensitive comparisons. Some systems
may fail to implement specific locales and thus sorting may downgrade to
simple string-lessp.

When case folding is not involved, I consider `string-lessp' a graceful degradation. Although locale rules are not applied, strings are mostly sorted. Exceptions exist, but the order is usually reasonable.

Completely disregarding the IGNORE-CASE argument of `string-collate-lessp' on macOS (which is not a heavily stripped embedded OS) is a bad surprise for me.

6. Agenda sorting, when alphabetical sorting is involved

`string-lessp' and `downcase' so even more severe locale-related issues
might be expected.

Could you please elaborate?

I admit that `downcase' may be an acceptable workaround, since `string-collate-lessp' may not honor IGNORE-CASE, but I believe that, when available, `string-collate-lessp' should be the preferred option for sorting.

Achieving consistency across Org code requires additional efforts.

Well. Just using `string-lessp' would make things very consistent.
Easily and with no efforts.

In the hope that clang will get better Unicode support, I would move in the opposite direction, namely toward wider usage of `string-collate-lessp'. Just using `string-lessp' means no case-insensitive sorting even where it is available now.

I have an idea for a compatibility wrapper around `string-collate-lessp' with special treatment of ignoring case on a bad libc implementation: apply `downcase' before passing arguments to `string-lessp'. It should provide consistency, the best user experience when locales work properly, and graceful degradation otherwise. I hope it is acceptable for Org even though such a trick is undesirable for Emacs for performance reasons.
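A minimal sketch of such a wrapper, just to illustrate the idea. Both function names are hypothetical and exist in neither Org nor Emacs. The probe rests on an assumption: where collation is not really available, `string-collate-equalp' behaves like a plain comparison and so reports "a" and "A" as different even with IGNORE-CASE.

```emacs-lisp
;; Hypothetical helpers; nothing below is part of Org or Emacs.

(defun my-collate-ignore-case-works-p (&optional locale)
  "Return non-nil if collation with IGNORE-CASE seems to work for LOCALE.
Assumption: without working collation, `string-collate-equalp'
degrades to a plain comparison, so \"a\" and \"A\" differ."
  (string-collate-equalp "a" "A" locale t))

(defun my-string-collate-lessp (s1 s2 &optional locale ignore-case)
  "Return non-nil if S1 sorts before S2.
Prefer `string-collate-lessp'.  When IGNORE-CASE is requested but
the probe fails, compare downcased strings with `string-lessp'."
  (if (and ignore-case
           (not (my-collate-ignore-case-works-p locale)))
      ;; Graceful degradation: no locale rules, but case is ignored.
      (string-lessp (downcase s1) (downcase s2))
    (string-collate-lessp s1 s2 locale ignore-case)))
```

It could be used as a sort predicate, e.g. (sort (list "b" "A" "c") (lambda (x y) (my-string-collate-lessp x y nil t))). The probe runs on every comparison here; a real implementation would likely cache its result.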

However, I am wary of compatibility shims after

d3a9c424b 2022-08-16 17:15:27 +0800 Ihor Radchenko: org-encode-time: Refactor into top-level `defmacro'

P.S. I am not motivated enough to build Emacs on Linux using clang to check whether locale information will be available. I am almost sure that some locale information is available on macOS, e.g. at least strcasecmp, even if the full CLDR cannot be easily accessed from C. I do not have a Mac to check the state of affairs. For Objective-C there is e.g. comareCaseIndependent.

I do not like that Emacs relies on locale support (and timezone support as well) in libc. It becomes a problem as soon as more than one locale has to be used simultaneously. I agree that there are enough complications: sometimes the locale depends on the document (e.g. #+LANGUAGE:), and sometimes a specific locale is even restricted to a part of a document. It is tricky to handle such cases, but the current limitations are too strict (and the defective `string-collate-lessp' on macOS is an example).
