gnuastro-commits
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[gnuastro-commits] master 35c14b7: Library (statistics.h): std function


From: Mohammad Akhlaghi
Subject: [gnuastro-commits] master 35c14b7: Library (statistics.h): std function accounts for floating point errors
Date: Fri, 16 Jul 2021 08:59:01 -0400 (EDT)

branch: master
commit 35c14b7db28449d3434437f9c26c480e36f585c2
Author: Natáli D. Anzanello <natali.anzanello@ufrgs.br>
Commit: Mohammad Akhlaghi <mohammad@akhlaghi.org>

    Library (statistics.h): std function accounts for floating point errors
    
    Until now, it could occur that the square of the average value was bigger
    than the average of the squares of the values, due to all the elements in
    the distribution having an almost identical value (all elements differing
    by floating point errors). Hence, we would have a negative value under the
    square root of the standard deviation calculation, returning a NaN.
    
    With this commit, an extra check was added to 'gal_statistics_std' and
    'gal_statistics_mean_std' to account for this: when the square of the
    average value is bigger than the average of the square then the returned
    value will be 0.0.
    
    This fixes bug #60923.
---
 NEWS             |  2 ++
 lib/statistics.c | 22 ++++++++++++++++++++--
 2 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/NEWS b/NEWS
index 6cab5ea..de5cb51 100644
--- a/NEWS
+++ b/NEWS
@@ -90,6 +90,8 @@ See the end of the file for license conditions.
               reported by Zahra Sharbaf.
   bug #60909: WCS coordinate conversion not working with TPV distortion,
               fixed with the help of Mark Calabretta.
+  bug #60923: Standard deviation calculation giving NaN values due to
+              floating point errors, found and fixed by Natáli Anzanello.
 
 
 
diff --git a/lib/statistics.c b/lib/statistics.c
index 4bc94c7..e0720e9 100644
--- a/lib/statistics.c
+++ b/lib/statistics.c
@@ -214,8 +214,16 @@ gal_statistics_std(gal_data_t *input)
 
     /* More than one element. */
     default:
+
+      /* Find the 's' and 's2' by parsing the data. */
       GAL_TILE_PARSE_OPERATE(input, out, 0, 1, {++n; s += *i; s2 += *i * *i;});
-      o[0]=sqrt( (s2-s*s/n)/n );
+
+      /* Write the standard deviation. If the square of the average value
+         is bigger than the average of the squares of the values, we have a
+         floating-point error (due to all the points having an identical
+         value, within floating point erros). So we should just set the
+         standard deviation to zero. */
+      o[0] = (s*s/n > s2) ? 0 : sqrt( (s2-s*s/n)/n );
       break;
     }
 
@@ -255,9 +263,19 @@ gal_statistics_mean_std(gal_data_t *input)
 
     /* More than one element. */
     default:
+
+      /* Parse the data. */
       GAL_TILE_PARSE_OPERATE(input, out, 0, 1, {++n; s += *i; s2 += *i * *i;});
+
+      /* Write the mean */
       o[0]=s/n;
-      o[1]=sqrt( (s2-s*s/n)/n );
+
+      /* Write the standard deviation. If the square of the average value
+         is bigger than the average of the squares of the values, we have a
+         floating-point error (due to all the points having an identical
+         value, within floating point erros). So we should just set the
+         standard deviation to zero. */
+      o[1] = (s*s/n > s2) ? 0 : sqrt( (s2-s*s/n)/n );
       break;
     }
 



reply via email to

[Prev in Thread] Current Thread [Next in Thread]