Re: CSV parsing and other issues (Re: LC_NUMERIC)
From: Maxim Nikulin
Subject: Re: CSV parsing and other issues (Re: LC_NUMERIC)
Date: Fri, 11 Jun 2021 23:58:24 +0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.8.1

On 10/06/2021 23:57, Eli Zaretskii wrote:
>> From: Maxim Nikulin Date: Thu, 10 Jun 2021 23:28:59 +0700
>
> For processing CSV, if there's a need to know whether the
> locale uses the comma as a decimal separator, we could
> indeed extend locale-info. But such an extension is almost
> trivial and doesn't even touch on the significant problems
> in the rest of the discussion.
>
You forgot `setlocale(LC_NUMERIC, "C")', didn't you?
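
#include <langinfo.h>
#include <locale.h>
#include <stdio.h>

int main(void) {
    /* Adopt the locale from the environment and print its radix character. */
    setlocale(LC_ALL, "");
    printf("%c", *nl_langinfo(RADIXCHAR));
    /* Reset only LC_NUMERIC to "C" and print the radix character again. */
    setlocale(LC_NUMERIC, "C");
    printf("%c\n", *nl_langinfo(RADIXCHAR));
    return 0;
}
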
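In a locale whose decimal separator is a comma, the output is ",.": once
LC_NUMERIC has been reset to "C", nl_langinfo(3) no longer reports the
user's separator. There is nl_langinfo_l(3), but it requires more work.

After rows have been split into cells, it may be necessary to parse
numbers ("2,34" to 2.34). That is why the quality of CSV file import is
tightly related to the handling of number formats.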
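
To make the "2,34" to 2.34 step concrete, here is a minimal sketch in C. The
helper name parse_cell_number is hypothetical, and it assumes that the
process keeps LC_NUMERIC set to "C", so strtod(3) expects '.'. The radix
character is passed in explicitly. Grouping separators are not handled, and
strtod(3) is more permissive than a CSV importer may want (it accepts leading
whitespace and "inf", for instance).

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: parse a CSV cell such as "2,34" using RADIX as the
   decimal separator.  Returns 0 and stores the value in *OUT on success,
   -1 if the cell is not a plain number.
   Assumption: LC_NUMERIC is "C", so strtod expects '.'.  */
static int parse_cell_number(const char *cell, char radix, double *out)
{
    char buf[64];
    size_t len = strlen(cell);

    if (len == 0 || len >= sizeof buf)
        return -1;

    for (size_t i = 0; i < len; i++) {
        if (cell[i] == radix)
            buf[i] = '.';        /* normalize the separator for strtod */
        else if (cell[i] == '.')
            return -1;           /* '.' is not the radix here: reject */
        else
            buf[i] = cell[i];
    }
    buf[len] = '\0';

    char *end;
    double value = strtod(buf, &end);
    if (end == buf || *end != '\0')
        return -1;               /* nothing parsed, or trailing garbage */

    *out = value;
    return 0;
}
```

A caller would obtain the radix character from whatever locale context
applies to the file and call parse_cell_number(cell, radix, &value) for each
cell that is expected to be numeric.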
>> I was trying to support Boruch that buffer-local variables
>> may be important part of locale context, more precise than
>> global settings,
>
> They are more precise, but they don't support mixed
> languages in the same buffer, something that happens in
> Emacs very frequently.
In some cases I would prefer a uniform format for numbers and dates even
when the language alternates within the buffer, e.g. in my private notes.
> Here's a trivial example:
>
> (insert (downcase (buffer-substring POS1 POS2)))
>
> Contrast with
>
> (insert (downcase "FOO"))
Either `set-text-properties' should be called on "FOO" before passing it
to `downcase', or a `locale-downcase' function taking LOCALE as its first
argument should be added. Moreover, such a `locale-downcase' function could
be used to implement higher-level functions that work with implicit
locales. LOCALE could be resolved through some hierarchy: user overrides for
a particular call, text properties, buffer variables, global settings.
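
The hierarchy idea could be sketched as follows. This is only an
illustration in C, not a proposal for actual Emacs internals: the struct, its
field names, and resolve_locale are all hypothetical. The precedence order
(call override first, global setting last) and the final fallback to "C" are
my assumptions.

```c
#include <stddef.h>

/* Hypothetical sources of a locale name, most specific first.
   NULL means "not set at this level".  */
struct locale_context {
    const char *call_override;   /* explicit LOCALE argument of the call */
    const char *text_property;   /* locale attached to the text itself */
    const char *buffer_local;    /* buffer-local setting */
    const char *global_setting;  /* global default */
};

/* Return the most specific locale name that is set.  Falling back to
   "C" when nothing is set is an assumption of this sketch.  */
static const char *resolve_locale(const struct locale_context *ctx)
{
    if (ctx->call_override)
        return ctx->call_override;
    if (ctx->text_property)
        return ctx->text_property;
    if (ctx->buffer_local)
        return ctx->buffer_local;
    if (ctx->global_setting)
        return ctx->global_setting;
    return "C";
}
```

A function like `locale-downcase' would then receive the resolved name, while
higher-level functions would fill in the context from the text and the buffer
they operate on.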
> Yes: what we have already in Emacs. That covers a lot of
> the same Unicode turf that ICU handles, because we import
> and use the same Unicode files and tables.
There are plenty of XML files in cldr-common-39.0.zip
(common/main/*.xml), https://www.unicode.org/Public/cldr/39/, in addition
to the Unicode data in the Emacs sources. They include rules for number
formatting: https://unicode.org/reports/tr35/tr35-numbers.html

Of course, human-style number formatting, currencies, financial style,
etc. may be left out, and the implementation may be limited to grouping and
decimal separators (leaving other features to later requests). glibc
has the newlocale(3) function, which can be used to obtain a minimal subset
of properties. I am not familiar with other platforms.
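
As a rough sketch of the newlocale(3) route, the decimal and grouping
separators of a named locale can be queried without changing the global
locale of the process. The helper name numeric_separators is hypothetical.
The code assumes a glibc-like system where newlocale(3), nl_langinfo_l(3) and
freelocale(3) are available, and I have not checked other platforms.

```c
#define _POSIX_C_SOURCE 200809L
#include <langinfo.h>
#include <locale.h>
#include <stdio.h>

/* Hypothetical helper: copy the decimal separator and the grouping
   separator of the locale NAME into RADIX and GROUP.  The global locale
   is left untouched.  Returns 0 on success, -1 if the locale cannot be
   created (for example, when it is not installed).  */
static int numeric_separators(const char *name,
                              char *radix, size_t radix_size,
                              char *group, size_t group_size)
{
    locale_t loc = newlocale(LC_NUMERIC_MASK, name, (locale_t) 0);
    if (loc == (locale_t) 0)
        return -1;

    /* Copy the strings before the locale object is freed.  */
    snprintf(radix, radix_size, "%s", nl_langinfo_l(RADIXCHAR, loc));
    snprintf(group, group_size, "%s", nl_langinfo_l(THOUSEP, loc));

    freelocale(loc);
    return 0;
}
```

Passing "" as NAME should pick up the locale from the environment, the same
way setlocale(LC_ALL, "") does, while the rest of the process may keep
LC_NUMERIC set to "C".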