chicken-janitors


From: Chicken Trac
Subject: Re: [Chicken-janitors] #1322: Locale can influence how CHICKEN reads numbers
Date: Sat, 27 Aug 2016 16:55:26 -0000

#1322: Locale can influence how CHICKEN reads numbers
---------------------------------------+----------------------------
            Reporter:  sjamaan         |      Owner:
                Type:  defect          |     Status:  new
            Priority:  major           |  Milestone:  4.12.0
           Component:  core libraries  |    Version:  4.11.0
          Resolution:                  |   Keywords:  number parsing
Estimated difficulty:  hard            |
---------------------------------------+----------------------------
Description changed by sjamaan:

Old description:

> Because CHICKEN uses the libc `strtol`/`strtoll` and `strtod` functions
> when reading flonums and fixnums, locale settings may influence how
> CHICKEN reads numbers, especially in `decode_literal`.
>
> Hugo Arregui provided the following simple test:
>
> {{{
> ;; Compile this with the -embedded option, since it defines its own main()
> (import chicken scheme foreign)
>
> #>
> #include <locale.h>
>
> int main(int argc, char** argv) {
>    setlocale(LC_NUMERIC, "es_AR.UTF-8");
>    CHICKEN_run(C_toplevel);
>    return 0;
> }
> <#
>
> (return-to-host)
> }}}
>
> This fails because the runtime system has several encoded floating-point
> numbers, which will no longer be read correctly.  Also note that `strtod`
> might incorrectly "parse" a floating-point number like `1.002` if it
> happens to be valid in the current locale using thousands separators.
>
> Parsing floating-point numbers in C is going to be pretty damn tricky, so
> we might just try and use `setlocale()` to set the locale to `C` and
> restore it to whatever it was before after doing so.  I have no idea what
> the effects are of calling these functions often in the same program, and
> if there's a performance impact (it might be loading the strings or
> formatting rules for this locale every single time, on the fly, since
> it'll be designed for "normal" programs in which `setlocale()` will be
> called only a handful of times)

New description:

 Because CHICKEN uses the libc `strtol`/`strtoll` and `strtod` functions
 when reading fixnums and flonums, respectively, locale settings may
 influence how CHICKEN reads numbers, especially in `decode_literal`.

 Hugo Arregui provided the following simple test:

 {{{
 ;; Compile this with the -embedded option, since it defines its own main()
 (import chicken scheme foreign)

 #>
 #include <locale.h>

 int main(int argc, char** argv) {
    setlocale(LC_NUMERIC, "es_AR.UTF-8");
    CHICKEN_run(C_toplevel);
    return 0;
 }
 <#

 (return-to-host)
 }}}

 This fails because the runtime system has several encoded floating-point
 numbers, which are no longer read correctly once the locale has changed.
 Also note that `strtod` might silently misparse a floating-point number
 like `1.002` if that string happens to be valid in the current locale,
 for instance with `.` as the thousands separator.
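
 A minimal sketch of how such a misparse could at least be detected: check
 the end pointer that `strtod` reports and reject input that was not
 consumed completely.  Under a locale whose decimal separator is `,`,
 `strtod("1.002", &end)` would typically stop at the `.`.  The helper name
 `parse_double_strict` is hypothetical and not part of the CHICKEN runtime.

```c
#include <stdlib.h>

/* Hypothetical helper (not CHICKEN code): parse `s` as a double using the
 * current locale.  Returns 1 and stores the value in *out only if strtod()
 * consumed the whole string; returns 0 if nothing was parsed or if
 * trailing characters were left over. */
int parse_double_strict(const char *s, double *out)
{
    char *end;
    double value = strtod(s, &end);

    if (end == s || *end != '\0')
        return 0;               /* empty input or partial parse */

    *out = value;
    return 1;
}
```

 This only turns a silent misparse into a visible failure; it does not
 make the parse itself independent of the locale.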

 Parsing floating-point numbers in C is going to be pretty damn tricky, so
 we might just use `setlocale()` to switch the locale to `C` while parsing
 and restore the previous locale afterwards.  I have no idea what the
 effects are of calling these functions often in the same program, or
 whether there's a performance impact: the implementation might load the
 strings or formatting rules for the locale on the fly every single time,
 since it is presumably designed for "normal" programs in which
 `setlocale()` is called only a handful of times.
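
 The save-and-restore idea could be sketched roughly as follows.  This is
 an illustration only, not the actual fix: `strtod_in_c_locale` is a
 hypothetical name, and the sketch assumes that no other thread calls
 `setlocale()` at the same time, because `setlocale()` changes the locale
 of the whole process.

```c
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not CHICKEN code): call strtod() with LC_NUMERIC
 * temporarily forced to "C", then restore the previous setting. */
double strtod_in_c_locale(const char *s, char **end)
{
    double result;
    char *saved;
    /* Passing NULL only queries the current locale.  The returned string
     * may be overwritten by a later setlocale() call, so copy it. */
    const char *current = setlocale(LC_NUMERIC, NULL);

    if (current == NULL || (saved = malloc(strlen(current) + 1)) == NULL)
        return strtod(s, end);  /* cannot save: leave the locale alone */
    strcpy(saved, current);

    setlocale(LC_NUMERIC, "C");
    result = strtod(s, end);

    setlocale(LC_NUMERIC, saved);   /* restore the caller's locale */
    free(saved);
    return result;
}
```

 Every call pays for two `setlocale()` calls and one allocation, which is
 exactly the cost the paragraph above is unsure about.  POSIX.1-2008 also
 offers `uselocale()`, which switches the locale for the calling thread
 only and may be worth comparing.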

 See also https://github.com/JuliaLang/julia/pull/5988 for an example.

--
Ticket URL: <https://bugs.call-cc.org/ticket/1322#comment:2>
CHICKEN Scheme <https://www.call-cc.org/>
CHICKEN Scheme is a compiler for the Scheme programming language.
