From: | Tim Daly |
Subject: | [Axiom-developer] Re: [sage-devel] Randomised testing against Mathematica |
Date: | Wed, 03 Mar 2010 09:40:36 -0500 |
User-agent: | Thunderbird 2.0.0.21 (Windows/20090302) |
There are two test suites with validated results at http://axiom-developer.org/axiom-website/CATS/

The CATS (Computer Algebra Test Suite) effort targets the development of known-good answers that get run against several systems. These "end result" suites test large portions of the system. Because they are checked against published results, they can be used by all systems. The integration suite found several bugs in the published results, which are noted in the suite. It also found a bug introduced by an improper patch to Axiom. It would be generally useful if Sage developed known-good test suites in other areas, say infinite sequences and series. Perhaps such a suite would make a good GSOC effort, with several moderators from different systems.

I have done some more work toward a trigonometric test suite. So far I have found that Mathematica and Maxima tend to agree on branch cuts, and Axiom and Maple tend to agree on branch cuts. The choice is arbitrary, but it affects answers. I am having an internal debate about whether to choose MMA/Maxima-compatible answers just to "regularize" the expected results users will see.

Standardized test suites give our users confidence that we are generating known-good results for some (small) range of expected inputs. An academic-based effort (which Axiom is not) could approach NIST for funding to develop such suites. NIST has a website, the Digital Library of Mathematical Functions (http://dlmf.nist.gov/). I proposed developing Computer Algebra test suites for their website, but NIST does not fund independent open source projects. Sage, however, could probably get continuous funding to develop such suites, which would benefit all of the existing CAS efforts. NSF might also be convinced, since such test suites raise the expected quality of answers without directly competing against commercial efforts.
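As an aside on why branch-cut conventions matter for a test suite: the value of a multivalued function depends on which side of the cut an input is taken to lie. A minimal Python sketch (not from the original post; it uses the standard library's cmath, whose conventions are just one of the possible choices) shows the effect with the square root's cut along the negative real axis:

```python
import cmath

# cmath.sqrt has its branch cut along the negative real axis.
# The signed zero of the imaginary part selects the side of the cut,
# so two inputs that print identically can give different answers.
above = cmath.sqrt(complex(-4.0, 0.0))   # approach the cut from above
below = cmath.sqrt(complex(-4.0, -0.0))  # approach the cut from below

print(above.imag)  # 2.0
print(below.imag)  # -2.0
```

Two CAS that place the cut (or orient its closure) differently will disagree in exactly this way, which is why a standardized suite has to pick, and document, one convention.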
I'd like to see a CAS testing research lab that publishes standardized answers to the many things we all end up debating, such as branch cuts, sqrt-of-squares, foo^0, etc.

Tim Daly

Dr. David Kirkby wrote:
Joshua Herman wrote:

Is there a Mathematica test suite we could adapt, or a standardized set of tests we could use? Maybe we could take the 100 most often used functions and make a test suite?

I'm not aware of one. A Google search found very little of any real use. I'm sure Wolfram Research have such test suites internally, but they are not public. There is discussion of how they have an internal version of Mathematica which runs very slowly, but tests things in greater detail:

http://reference.wolfram.com/mathematica/tutorial/TestingAndVerification.html

Of course, comparing 100 things is useful, but comparing millions of them in the way I propose would be more likely to show up problems. I think we are all aware that it is best to test on the hardware you are using, to be as confident as possible that the results are right. Wolfram Research could supply a test suite to check Mathematica on an end user's computer, but they do not do that. They could even encrypt it, so users did not know what was wrong but could at least alert Wolfram Research.

I'm aware of one bug in Mathematica that only affected old/slower SPARC machines if Solaris was updated to Solaris 10. I suspect it would have affected newer machines too, had they been heavily loaded. (If I were sufficiently motivated, I would probably prove that, but I'm not, so my hypothesis is unproven.) It did not produce incorrect results, but it pegged the CPU at 100% forever if you computed something as simple as 1+1. It was amazing how that was solved between myself, Casper Dik (a kernel engineer at Sun) and various other people on the Internet.
It was Casper who finally nailed the problem: after I posted the output of lsof, he could see what Mathematica was doing. I've got a collection of a few Mathematica bugs, mainly affecting only Solaris, although one affected at least one Linux distribution too:

http://www.g8wrb.org/mathematica/

One thing I know Mathematica does do, which Sage could do, is to automatically generate a bug report if it finds a problem. At the most primitive level, that code might be:

    if (x < 0)
        function_less();
    else if (x == 0)
        function_equal();
    else if (x > 0)
        function_greater();
    else
        function_error();  /* reached only if x compares as none of the above */

If the error is generated, a URL is given, which you click to send a bug report to them. It lists the name of the file and the line number which generated the error. That's something that could be done in Sage and might catch some bugs.

Dave

On Wed, Mar 3, 2010 at 12:04 AM, David Kirkby <address@hidden> wrote:

Has anyone ever considered randomised testing of Sage against Mathematica? As long as the result is either a) True or False, or b) an integer, then comparison should be very easy. As a dead simple example:

1) Generate a large random number n.
2) Use is_prime(n) in Sage to determine if n is prime or composite.
3) Use PrimeQ[n] in Mathematica to see if n is prime or composite.
4) If Sage and Mathematica disagree, write it to a log file.

Something a bit more complex:

1) Generate a random equation f(x) - something that one could integrate.
2) Generate random upper and lower limits, 'a' and 'b'.
3) Perform a numerical integration of f(x) between 'a' and 'b' in Sage.
4) Perform a numerical integration of f(x) between 'a' and 'b' in Mathematica.
5) Compare the outputs of Sage and Mathematica.

A floating point number would be more difficult to compare, as one would need to consider what is a reasonable level of difference.
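The primality cross-check described above can be sketched without either CAS. Here is a hypothetical Python skeleton of the differential-testing loop: trial division plays the role of the known-good oracle and a Miller-Rabin test stands in for the second system (neither Sage's is_prime nor Mathematica's PrimeQ is actually called); disagreements are collected as they would be logged.

```python
import random

def is_prime_trial(n):
    """Reference oracle: trial division (slow but obviously correct)."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def is_prime_mr(n, rounds=20):
    """'System under test': probabilistic Miller-Rabin."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13):
        if n % p == 0:
            return n == p
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # definitely composite
    return True

# Differential-testing loop: random inputs, compare, log disagreements.
disagreements = []
rng = random.Random(42)
for _ in range(1000):
    n = rng.randrange(2, 10**6)
    if is_prime_trial(n) != is_prime_mr(n):
        disagreements.append(n)  # in practice, write to a log file

print(len(disagreements))  # expect 0
```

The same loop shape works for any pair of systems whose answers are directly comparable (booleans, integers); only the two functions change.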
Comparing symbolic results directly would be a much more difficult task, and probably impossible without a huge effort, since you can often write an equation in several different ways which are equal, but a computer program could not easily be programmed to determine that they are equal. One could potentially let a computer crunch away all the time, looking for differences. Then, when they are found, a human would have to investigate why the difference occurs. One could then add a trac item for "Mathematica bugs".

There was once a push for a public list of Mathematica bugs. I got involved a bit with that, but it died a death and I became more interested in Sage. Some of you may know of Vladimir Bondarenko, who is a strange character who regularly used to publish Mathematica and Maple bugs he had found. In some discussions I've had with him, he was of the opinion that Wolfram Research took bug reports more seriously than Maplesoft. I've never worked out what technique he uses, but I believe he is doing some randomised testing, though it is more sophisticated than what I'm suggesting above.

There must be a big range of problem types where this is practical - and a much larger range where it is not. You could at the same time also compare the time taken to execute the operation, to find areas where Sage is much faster or slower than Mathematica.

Dave
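Both difficulties raised above - tolerances for floating-point answers and deciding whether two differently-written expressions agree - admit a cheap heuristic: compare numerically within a tolerance, and test candidate symbolic equality by evaluating both expressions at many random points. A small Python sketch (my own illustration, not code from any of the systems discussed) of both ideas:

```python
import math
import random

def numerically_close(x, y, rel=1e-9, abs_tol=1e-12):
    """Tolerance-based comparison for floating-point CAS output."""
    return math.isclose(x, y, rel_tol=rel, abs_tol=abs_tol)

def probably_equal(f, g, trials=50, lo=-10.0, hi=10.0):
    """Heuristic symbolic-equality check: evaluate both expressions at
    random points and compare numerically.  Agreement at many random
    points strongly suggests (but does not prove) equality."""
    rng = random.Random(0)
    for _ in range(trials):
        x = rng.uniform(lo, hi)
        try:
            fx, gx = f(x), g(x)
        except (ValueError, ZeroDivisionError):
            continue  # skip points outside a common domain
        if not numerically_close(fx, gx):
            return False
    return True

# (x + 1)**2 and x**2 + 2*x + 1 are one polynomial written two ways.
print(probably_equal(lambda x: (x + 1) ** 2,
                     lambda x: x**2 + 2 * x + 1))
# sin(2x) = 2 sin(x) cos(x) is a trigonometric identity.
print(probably_equal(lambda x: math.sin(2 * x),
                     lambda x: 2 * math.sin(x) * math.cos(x)))
# A genuinely different pair is rejected.
print(probably_equal(lambda x: x**2, lambda x: x**3))
```

This never replaces a real proof of equivalence, but as a filter inside the crunch-away-all-the-time loop it cheaply narrows the disagreements a human has to investigate.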