[Chicken-hackers] [PATCH] Improve Chicken's handling of numerical syntax


From: Peter Bex
Subject: [Chicken-hackers] [PATCH] Improve Chicken's handling of numerical syntax
Date: Mon, 12 Sep 2011 23:08:03 +0200
User-agent: Mutt/1.4.2.3i

Hi all,

As some of you know, I've been doing some work on the "numbers" egg.  That
work culminated in a horrific torture test for numerical syntax, which I
sent to the R7RS discussion list in an attempt to get rid of the ugly '#'
"padding" syntax in numbers that it inherited from R5RS.  This seems to
have been successful: according to John Cowan, people voted this syntax down.

Chicken core did pretty badly on this test (55 errors with 4.7.0), and
I've now made a patch to make Chicken pass these tests.  This patch also
integrates the test into Chicken's testsuite.

While hacking on it I found some code for OpenBSD and Windows to handle
"nan" and "inf" syntaxes.  If I understand correctly these systems do not
handle these in their libc's strtod() function.  I haven't been able to
test, but I believe this may actually cause compatibility problems
between Chickens running on different OSes; some Chickens accept
"+NaN" as valid input, while those running on Windows or OpenBSD
(probably?) don't.  Depending on libc, Chicken might even serialize these
values in an incompatible way such that others don't accept it.
This patch should get rid of these basic differences.
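
To make the libc dependency concrete, here is a minimal probe (not part of
the patch; the name strtod_accepts() is made up for this example).  C99
specifies that strtod() accepts "inf", "infinity" and "nan" in any case,
but a pre-C99 libc may stop parsing at the first character instead.  The
helper only reports whether the local strtod() consumed the whole string:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: returns 1 when this platform's strtod() parses
 * all of s as a number, 0 when it stops early or parses nothing. */
static int strtod_accepts(const char *s)
{
    char *end;

    assert(s != NULL);
    (void)strtod(s, &end);      /* only the end pointer matters here */
    return end != s && *end == '\0';
}
```

Calling strtod_accepts("nan") or strtod_accepts("inf") on each target OS
would show whether the platforms really disagree.  Note that even a C99
strtod() stops after "+nan" when given "+nan.0", so the Scheme spellings
cannot simply be passed through.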

To do this, there's an ugly bit of code in the attached patch which adds
a check in convert_string_to_number() for the strings "+nan.0", "-nan.0",
"+inf.0" and "-inf.0", case insensitively.  This conforms to R6RS, and
most likely will end up as official R7RS syntax as well (there's still
a proposal up in the air to represent infinity as "1/0").
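
Roughly, such a check could look like the sketch below.  This is not the
code from the patch: the names equals_ignoring_case() and
parse_special_flonum() are invented for illustration, and it assumes a
C99 <math.h> that provides the INFINITY and NAN macros.

```c
#include <assert.h>
#include <ctype.h>
#include <math.h>
#include <stddef.h>

/* Compare s against an all-lowercase literal, ignoring the case of s. */
static int equals_ignoring_case(const char *s, const char *lower)
{
    while (*s != '\0' && *lower != '\0') {
        if (tolower((unsigned char)*s) != *lower) return 0;
        s++;
        lower++;
    }
    return *s == '\0' && *lower == '\0';
}

/* Hypothetical helper: returns 1 and stores the value in *out when s is
 * exactly one of the four special literals; returns 0 otherwise, in
 * which case *out is left untouched. */
static int parse_special_flonum(const char *s, double *out)
{
    assert(s != NULL && out != NULL);
    if (equals_ignoring_case(s, "+inf.0")) { *out = INFINITY;  return 1; }
    if (equals_ignoring_case(s, "-inf.0")) { *out = -INFINITY; return 1; }
    if (equals_ignoring_case(s, "+nan.0") ||
        equals_ignoring_case(s, "-nan.0")) { *out = NAN;       return 1; }
    return 0;
}
```

Because the whole string must match, prefixes such as "+nan" and unsigned
forms such as "inf.0" are rejected by this sketch; it never consults the
libc at all.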

I've commented out some checks to make it still allow prefixes of these
strings and pass on the string to strtol()/strtod() unchecked in an
attempt to keep backwards compatibility for a while.  These checks have
deprecation comments around them and the corresponding strictness tests
from the numbers-string-conversion-tests.scm are commented out similarly.
I've also added a note about this to the NEWS file.  If you think this is
overkill, we can easily drop these to skip the deprecation step.

Other than that, I've tried to keep the diff as small as possible so it
can be more easily understood.  C_a_i_string_to_number() has one extra
pointer, "rptr", which doubles as a flag recording whether a rational '/'
has been seen (so that invalid strings like "1.0/2.0" can be detected and
rejected) and as a marker for where to start parsing the denominator.
Maybe this is overly "clever", I'm not sure.  It seemed like a good idea
at the time.
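
The pointer-as-flag idea can be sketched in isolation as follows.  This
is a simplified illustration, not the patched C_a_i_string_to_number():
rational_shape_ok() is a made-up name, and it looks only at how '/' and
'.' interact, without validating signs or digits.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical check: returns 1 when s has no '/' at all, or exactly one
 * '/' with something on both sides and no '.' anywhere; 0 otherwise. */
static int rational_shape_ok(const char *s)
{
    const char *rptr = NULL;    /* NULL: no '/' seen yet;
                                   else: start of the denominator */
    int seen_dot = 0;
    const char *p;

    assert(s != NULL);
    for (p = s; *p != '\0'; p++) {
        if (*p == '/') {
            /* reject a second slash or an empty numerator */
            if (rptr != NULL || p == s) return 0;
            rptr = p + 1;
        } else if (*p == '.') {
            seen_dot = 1;
        }
    }
    if (rptr == NULL) return 1;         /* not a rational at all */
    return !seen_dot && *rptr != '\0';  /* exact parts, non-empty denominator */
}
```

After the scan, a non-NULL rptr is both the answer to "was there a slash?"
and the place where denominator parsing would begin, which is the double
duty described above.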

There's also a flag to keep track of exponents, since "1/2e5" is invalid
as well.  I've added some code to allow all the weird exponent markers
supported by R5RS, squashing all of them to 'e' which strtod() groks.
Hashes are now checked too: they may not follow an exponent and may not
take the place of the first digit, so "#1" isn't allowed by string->number
(the reader didn't allow it in the first place).

Once this patch is in, we probably should add some read/write invariance
checks to the tests as well.  I know we currently don't handle
everything completely consistently.  For example, "#x#e#x1" is accepted
and interpreted as 1 by the reader but rejected by string->number
(this is open ticket #674).  I think that's something for later, though.
Let's get the basic number parsing correct first!

Cheers,
Peter
-- 
http://sjamaan.ath.cx
--
"The process of preparing programs for a digital computer
 is especially attractive, not only because it can be economically
 and scientifically rewarding, but also because it can be an aesthetic
 experience much like composing poetry or music."
                                                        -- Donald Knuth

Attachment: number-syntax.patch
Description: Text document
