emacs-devel

Re: CSV parsing and other issues (Re: LC_NUMERIC)


From: Maxim Nikulin
Subject: Re: CSV parsing and other issues (Re: LC_NUMERIC)
Date: Fri, 4 Jun 2021 23:31:13 +0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.8.1

On 03/06/2021 22:01, Eli Zaretskii wrote:
>> From: Maxim Nikulin
>> Date: Thu, 3 Jun 2021 21:44:08 +0700
>>
>> So locale-aware number formatting would be a great improvement for
>> Emacs. On the other hand, it should be implemented with great care to
>> avoid localized numbers in some cases. Maybe locale argument should be
>> passed to functions that deal with numbers. Formatting of integer
>> numbers is not enough, floating point numbers should be handled as well.
>> Parsing numbers formatted accordingly to locale rules should be
>> addressed too. A function similar to `locale-info' is highly desired to
>> get properties of locale (e.g. decimal_point from result of localeconv).
>> Some decision is required whether calc & Co should operate with
>> localized numbers.
>
> Setting a locale globally in Emacs is a non-starter, for the reasons
> that you point out and others.  Text processing in Emacs is generally
> separate from the current locale's rules, mainly to have Emacs work
> the same in any locale.  So passing a locale argument to functions
> that produce output, with the intent to request some behavior to be
> tailored to that locale, is the only reasonable way to have this kind
> of functionalities in Emacs.  The problem with that, of course, is
> that not every supported platform can dynamically change the locale,
> let alone do that efficiently.
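
To make the "pass a locale argument" idea concrete, here is a minimal sketch in standard C++ (not Emacs code) of a query that takes the locale explicitly and never touches the global one. The names `number_properties` and `number_properties_of` are made up for this illustration; only `std::locale`, `std::use_facet` and `std::numpunct` come from the standard library.

```cpp
#include <locale>
#include <string>

// Hypothetical record, loosely mirroring the numeric fields of localeconv().
struct number_properties {
    char decimal_point;    // e.g. '.' in the "C" locale
    char thousands_sep;    // only meaningful when grouping is non-empty
    std::string grouping;  // empty string means "no digit grouping"
};

// Hypothetical helper: read the number-formatting properties of the locale
// passed as an argument. The process-wide locale is neither read nor changed.
number_properties number_properties_of(const std::locale& loc) {
    const std::numpunct<char>& punct = std::use_facet<std::numpunct<char>>(loc);
    return {punct.decimal_point(), punct.thousands_sep(), punct.grouping()};
}
```

With `std::locale::classic()` this reports '.' as the decimal point and no grouping. Constructing a named locale such as `std::locale("de_DE.UTF-8")` throws `std::runtime_error` when that locale is not installed, which is exactly the platform dependency mentioned above.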

I do not think it is reasonable to make users fight with number formatting themselves. Here are some links from my browser history, from when I was trying to figure out how to get the locale-specific decimal separator in Elisp:

https://stackoverflow.com/questions/35661173/how-to-format-table-fields-as-currency-in-org-mode
https://www.emacswiki.org/emacs/AddCommasToNumbers
https://www.reddit.com/r/emacs/comments/61mhyx/creating_a_function_to_add_commasseparators_to/

Do you mean that it is necessary to create a new implementation of a number formatter specifically for Emacs? Something like

https://unicode.org/reports/tr35/tr35-numbers.html
Unicode Locale Data Markup Language (LDML) Part 3: Numbers

Actually, it is an almost random link. I do not know which source is currently considered the best collection of wisdom on number formatting. Outside of the Emacs world, the last time I needed numbers formatted according to various locales, I was lucky enough to use code similar to the following and did not have to care about the details:

#include <cstdio>
#include <QLocale>
#include <QTextStream>

void test(QTextStream& stream, const char *loc_name) {
        QLocale loc(QString::fromLocal8Bit(loc_name));
        stream << "point: " << loc.decimalPoint()
                << " " << loc.toString(12345.67)
                << " " << loc.toString(1234567890) << "\n";
}
int main(int argc, char *argv[]) {
        QTextStream stream(stdout);
        for (int i = 1; i < argc; ++i) {
                test(stream, argv[i]);
        }
        return 0;
}

./qtloc de_DE en_GB fa_IR
point: , 12.345,7 1.234.567.890
point: . 12,345.7 1,234,567,890
point: ٫ ۱۲٬۳۴۵٫۷ ۱٬۲۳۴٬۵۶۷٬۸۹۰

Surprisingly, it works even though I have not generated the de and fa locales.

On Linux I see that Emacs is linked with ICU:

ldd /usr/bin/emacs | grep -i icu
libicuuc.so.66 => /usr/lib/x86_64-linux-gnu/libicuuc.so.66 (0x00007f457c799000)
libicudata.so.66 => /usr/lib/x86_64-linux-gnu/libicudata.so.66 (0x00007f457a61c000)

I am not familiar with the ICU API, but I expect that it could be used:
https://github.com/unicode-org/icu/blob/main/icu4c/source/samples/numfmt/capi.c

Do you have a bright idea for implementing a number parser-formatter with reasonable effort?
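
To illustrate the scale of the problem, here is a rough sketch of a parser-formatter that does not depend on which locales are generated on the system: the separators are passed in explicitly through a custom `std::numpunct` facet. This is plain standard C++, not ICU and not a proposal for Emacs internals; `explicit_numpunct`, `make_number_locale`, `format_number` and `parse_number` are names invented for this example. It assumes single-byte separators and groups of three digits, so it cannot represent cases like fa_IR, where the separators and digits are multi-byte characters.

```cpp
#include <iomanip>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical facet: the caller supplies the separators, so nothing is
// looked up in the system locale database.
class explicit_numpunct : public std::numpunct<char> {
public:
    explicit_numpunct(char decimal, char thousands)
        : decimal_(decimal), thousands_(thousands) {}

protected:
    char do_decimal_point() const override { return decimal_; }
    char do_thousands_sep() const override { return thousands_; }
    // Assumption: digits are always grouped by three.
    std::string do_grouping() const override { return "\3"; }

private:
    char decimal_;
    char thousands_;
};

// Build a locale object that differs from the "C" locale only in numpunct.
// The std::locale takes ownership of the facet.
std::locale make_number_locale(char decimal, char thousands) {
    return std::locale(std::locale::classic(),
                       new explicit_numpunct(decimal, thousands));
}

// Format `value` with a fixed number of decimals under the given locale.
// Only the local stream is imbued; the global locale is untouched.
std::string format_number(double value, const std::locale& loc, int decimals) {
    std::ostringstream out;
    out.imbue(loc);
    out << std::fixed << std::setprecision(decimals) << value;
    return out.str();
}

// Parse `text` under the given locale. Returns false if the text is not a
// number or if anything is left over after the number.
bool parse_number(const std::string& text, const std::locale& loc,
                  double& result) {
    std::istringstream in(text);
    in.imbue(loc);
    double value = 0.0;
    if (!(in >> value)) return false;
    if (in.peek() != std::char_traits<char>::eof()) return false;
    result = value;
    return true;
}
```

For example, `format_number(12345.67, make_number_locale(',', '.'), 1)` yields `12.345,7`, matching the de_DE line of the Qt output above, and `parse_number` reads that string back. Everything that makes the real problem hard (multi-byte separators, non-ASCII digits, per-locale grouping rules) is left out, which is the part a library with its own locale data, such as ICU, would cover.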



