bug-texinfo

Re: index sorting in texi2any in C issue with spaces


From: Gavin Smith
Subject: Re: index sorting in texi2any in C issue with spaces
Date: Sun, 4 Feb 2024 20:49:55 +0000

On Sun, Feb 04, 2024 at 08:38:45PM +0100, Patrice Dumas wrote:
> Thanks.  This is very confusing to me, then, as it is not told that way
> in perllocale, especially the section: 
> https://perldoc.perl.org/perllocale#Category-LC_COLLATE%3A-Collation%3A-Text-Comparisons-and-Sorting
> There is more information in the end of the page that may correspond
> better to the perlop information.  Not important at all anyway
> since we agree that using the user locale is not a good idea in any case.

Yes, it appears to say the opposite:

       Perl uses the platform's C library collation functions "strcoll()" and
       "strxfrm()".  That means you get whatever they give.  On some
       platforms, these functions work well on UTF-8 locales, giving a
       reasonable default collation for the code points that are important in
       that locale.  (And if they aren't working well, the problem may only be
       that the locale definition is deficient, so can be fixed by using a
       better definition file.  Unicode's definitions (see "Freely available
       locale definitions") provide reasonable UTF-8 locale collation
       definitions.)  Starting in Perl v5.26, Perl's use of these functions
       has been made more seamless.  This may be sufficient for your needs.
       For more control, and to make sure strings containing any code point
       (not just the ones important in the locale) collate properly, the
       Unicode::Collate module is suggested.

So COLLATE_LOCALE (if we go with that naming) could potentially be
implemented in Perl as well, if we are able to temporarily switch
the locale.  Speed could be an issue, though.  (Although the documentation
says the result of strxfrm is cached, so maybe not.)
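
To make the "temporarily switch the locale" idea concrete, here is a
minimal sketch at the C library level, which is where strcoll() and
strxfrm() live.  It is only an illustration under stated assumptions:
the function name sort_in_collate_locale and the struct keyed_entry are
hypothetical and not taken from texi2any, and only LC_COLLATE is switched.
Each string is transformed once with strxfrm() and the keys are then
compared with strcmp(), so the cost of locale-aware comparison is paid
once per entry rather than once per comparison.

```c
#include <assert.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* One entry plus its cached collation key (hypothetical helper type). */
struct keyed_entry {
    char *text;  /* original string, not owned */
    char *key;   /* strxfrm() output, owned */
};

static int compare_keys(const void *a, const void *b)
{
    const struct keyed_entry *ea = a;
    const struct keyed_entry *eb = b;
    /* Comparing transformed keys with strcmp() is equivalent to
       calling strcoll() on the original strings. */
    return strcmp(ea->key, eb->key);
}

/* Sort ITEMS in place using the collation rules of LOCALE_NAME, then
   restore the previous LC_COLLATE setting.  Returns 0 on success, -1 if
   the locale is unavailable or memory runs out. */
int sort_in_collate_locale(char **items, size_t n, const char *locale_name)
{
    struct keyed_entry *entries;
    const char *current;
    char *saved;
    size_t i;
    int status = -1;

    assert(items != NULL || n == 0);

    /* The string returned by setlocale() may be overwritten by later
       calls, so keep a private copy of the current setting. */
    current = setlocale(LC_COLLATE, NULL);
    if (!current)
        return -1;
    saved = malloc(strlen(current) + 1);
    if (!saved)
        return -1;
    strcpy(saved, current);

    if (!setlocale(LC_COLLATE, locale_name)) {
        free(saved);
        return -1;
    }

    entries = calloc(n ? n : 1, sizeof *entries);
    if (entries) {
        status = 0;
        for (i = 0; i < n; i++) {
            /* First call only measures the key length. */
            size_t len = strxfrm(NULL, items[i], 0);
            entries[i].text = items[i];
            entries[i].key = malloc(len + 1);
            if (!entries[i].key) {
                status = -1;
                break;
            }
            strxfrm(entries[i].key, items[i], len + 1);
        }
        if (status == 0) {
            qsort(entries, n, sizeof *entries, compare_keys);
            for (i = 0; i < n; i++)
                items[i] = entries[i].text;
        }
        for (i = 0; i < n; i++)
            free(entries[i].key);  /* unused slots are NULL from calloc */
        free(entries);
    }

    /* Switch back so the rest of the program is unaffected. */
    setlocale(LC_COLLATE, saved);
    free(saved);
    return status;
}
```

A caller would pass an array of entry strings and a locale name, and
should be prepared for -1 when that locale is not installed.  How spaces
are ordered in the result depends entirely on the chosen locale's
definition, which is the "you get whatever they give" caveat from the
documentation quoted above.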

I guess that the other documentation (perlop) is either out of date, or
they were recommending Unicode::Collate as more portable than relying on
the platform's C library.


