[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[gnuastro-commits] master 8837a5a8: Table: date-to-millisec is a new col
From: |
Mohammad Akhlaghi |
Subject: |
[gnuastro-commits] master 8837a5a8: Table: date-to-millisec is a new column arithmetic operator |
Date: |
Fri, 11 Feb 2022 09:33:52 -0500 (EST) |
branch: master
commit 8837a5a8ac40f3c0241499e5c663eba99d3e04ad
Author: Mohammad Akhlaghi <mohammad@akhlaghi.org>
Commit: Mohammad Akhlaghi <mohammad@akhlaghi.org>
Table: date-to-millisec is a new column arithmetic operator
Until now, when 'date-to-sec' was given a time-string with sub-second
precision, it would add that fractional value to the large integer and save
the result as a 64-bit floating point type.
However, the number of significant digits in the floating point standard is
limited (its in logarithmic scale) and as a result for two different time
strings like below, it would give the same result (even though they are
more than one second apart.
date seconds
---- -------
2018-05-04T04:16:46.7774 1525410945
2018-05-04T04:15:25.8326 1525410945
With this commit, the issue has been solved in two steps:
- 'date-to-sec' operator now only returns 64-bit signed integer numbers
(no more floating points), and will therefore ignore any sub-second
precision.
- 'date-to-millisec' operator has been added to column arithmetic to let
users count (as an integer: thus preserving every digit properly) the
number of milli-seconds since the Unix epoch. Users who need sub-second
precision can use this.
As a result of these changes, the two dates above can now be read in units
of seconds (second column) and milli-seconds with all digits properly
preserved:
date seconds milli-seconds
---- ------- -------------
2018-05-04T04:16:46.7774 1525411006 1525411006777
2018-05-04T04:15:25.8326 1525410925 1525410925832
Notice how the correct seconds of both dates is now different from the
common one that was initially reported (when converting it to 64-bit
floating point).
This bug was reported by Zohreh Ghaffari and Pedram Ashofteh Ardakani.
This fixes bug #61976.
---
NEWS | 13 ++++++++++
bin/table/arithmetic.c | 70 +++++++++++++++++++++++++-------------------------
bin/table/arithmetic.h | 1 +
doc/gnuastro.texi | 21 +++++++++++----
4 files changed, 65 insertions(+), 40 deletions(-)
diff --git a/NEWS b/NEWS
index 54a32466..39122d2c 100644
--- a/NEWS
+++ b/NEWS
@@ -77,6 +77,10 @@ See the end of the file for license conditions.
option, see the changed features below). Like the new '--noblank',
this option can also be called multiple times, so '--noblankend=1
--noblankend=2' is equivalent to '--noblankend=1,2'.
+ - New operator for column arithmetic:
+ - date-to-millisec: convert the formatted date string into the number
+ of milli-seconds (as a 64-bit integer) since Unix-time (00:00:00 of
+ 1970-01-01).
Match:
- k-d tree based matching has been added for finding the matching rows,
@@ -136,6 +140,12 @@ See the end of the file for license conditions.
without a blank magnitude, so you don't need to have the magnitude
column any more!). The old behavior of this option is now available in
the new '--noblankend' option. This point was raised by Samane Raji.
+ - 'date-to-sec' operator now only returns the number of seconds and is
+ always in a 64-bit signed integer format. Until now, if the input had
+ sub-second precision it would return a 64-bit floating point type;
+ however, that resulted in bug #61976 (where the loss of precision due
+ to the floating-point's logarithmic scale for too many significanat
+ digits caused incorrect coversions).
Library:
- gal_fits_tab_read: now takes the number of threads to read the desired
@@ -181,6 +191,9 @@ See the end of the file for license conditions.
Ardakani.
bug #62008: Arithmetic not deleting existing output when the 'makenew' is
used (no FITS file exists in the reverse polish notation).
+ bug #61976: Table arithmetic date-to-sec produces same result for
+ different times (separated by about 1 second). This bug was
+ reported by Zohreh Ghaffari and Pedram Ashofteh Ardakani.
diff --git a/bin/table/arithmetic.c b/bin/table/arithmetic.c
index d85a84f1..f271c779 100644
--- a/bin/table/arithmetic.c
+++ b/bin/table/arithmetic.c
@@ -133,6 +133,7 @@ arithmetic_operator_name(int operator)
case ARITHMETIC_TABLE_OP_IMGTOWCS: out="img-to-wcs"; break;
case ARITHMETIC_TABLE_OP_DATETOSEC: out="date-to-sec"; break;
case ARITHMETIC_TABLE_OP_DISTANCEFLAT: out="distance-flat"; break;
+ case ARITHMETIC_TABLE_OP_DATETOMILLISEC: out="date-to-millisec"; break;
case ARITHMETIC_TABLE_OP_DISTANCEONSPHERE: out="distance-on-sphere";
break;
default:
error(EXIT_FAILURE, 0, "%s: a bug! Please contact us at %s to fix "
@@ -188,6 +189,8 @@ arithmetic_set_operator(struct tableparams *p, char *string,
{ op=ARITHMETIC_TABLE_OP_IMGTOWCS; *num_operands=0; }
else if( !strcmp(string, "date-to-sec"))
{ op=ARITHMETIC_TABLE_OP_DATETOSEC; *num_operands=0; }
+ else if( !strcmp(string, "date-to-millisec"))
+ { op=ARITHMETIC_TABLE_OP_DATETOMILLISEC; *num_operands=0; }
else if( !strcmp(string, "distance-flat"))
{ op=ARITHMETIC_TABLE_OP_DISTANCEFLAT; *num_operands=0; }
else if( !strcmp(string, "distance-on-sphere"))
@@ -649,63 +652,59 @@ arithmetic_datetosec(struct tableparams *p, gal_data_t
**stack,
size_t i, v;
int64_t *iarr;
gal_data_t *out;
+ double subsec=NAN;
char *subsecstr=NULL;
- double *darr, subsec=NAN;
+ char *unit, *name, *comment;
/* Input dataset. */
gal_data_t *in=arithmetic_stack_pop(stack, operator, NULL);
char **strarr=in->array;
- /* Output metadata. */
- char *unit="sec";
- char *name="UNIXSEC";
- char *comment="Unix seconds (from 00:00:00 UTC, 1 January 1970)";
-
/* Make sure the input has a 'string' type. */
if(in->type!=GAL_TYPE_STRING)
error(EXIT_FAILURE, 0, "the operand given to 'date-to-sec' "
"should have a string type, but it is '%s'",
gal_type_name(in->type, 1));
+ /* Output metadata. */
+ switch(operator)
+ {
+ case ARITHMETIC_TABLE_OP_DATETOSEC:
+ unit="sec";
+ name="UNIXSEC";
+ comment="Unix seconds (from 00:00:00 UTC, 1 January 1970)";
+ break;
+ case ARITHMETIC_TABLE_OP_DATETOMILLISEC:
+ unit="msec";
+ name="UNIXMILLISEC";
+ comment="Unix milli-seconds (from 00:00:00 UTC, 1 January 1970)";
+ break;
+ default:
+ error(EXIT_FAILURE, 0, "%s: a bug! Please contact us at %s "
+ "to fix the problem. The operator code %d isn't "
+ "recognized", __func__, PACKAGE_BUGREPORT, operator);
+ }
+
/* Allocate the output dataset. */
out=gal_data_alloc(NULL, GAL_TYPE_INT64, 1, &in->size, NULL, 1,
p->cp.minmapsize, p->cp.quietmmap, name, unit,
comment);
- /* Convert each input string (first try as an integer, assuming no
- sub-second, or floating point precision). */
+ /* Convert each input string into number of seconds. */
iarr=out->array;
for(i=0; i<in->size; ++i)
{
+ /* Read the number of seconds and sub-seconds and write into the
+ output. */
v=gal_fits_key_date_to_seconds(strarr[i], &subsecstr,
&subsec);
- iarr[i] = v==GAL_BLANK_SIZE_T ? GAL_BLANK_INT64 : v;
- if(subsecstr) break;
- }
-
- /* If a sub-second string was present, then save the output as double
- precision floating point. */
- if(subsecstr)
- {
- /* Free the initially allocated output (as an integer). */
- free(subsecstr);
- gal_data_free(out);
-
- /* Allocate a double-precision output. */
- out=gal_data_alloc(NULL, GAL_TYPE_FLOAT64, 1, &in->size, NULL, 1,
- p->cp.minmapsize, p->cp.quietmmap, name, unit,
- comment);
-
- /* Convert the date. */
- darr=out->array;
- for(i=0; i<in->size; ++i)
- {
- subsecstr=NULL;
- v=gal_fits_key_date_to_seconds(strarr[i], &subsecstr,
- &subsec);
- darr[i] = v==GAL_BLANK_SIZE_T ? NAN : v;
- if(subsecstr) { darr[i]+=subsec; free(subsecstr); }
- }
+ iarr[i] = ( v==GAL_BLANK_SIZE_T
+ ? GAL_BLANK_INT64
+ : ( operator == ARITHMETIC_TABLE_OP_DATETOSEC
+ ? v
+ : (isnan(subsec)
+ ? v*1000
+ : v*1000 + (int64_t)(subsec*1000) ) ) );
}
/* Clean up and put the resulting calculation back on the stack. */
@@ -839,6 +838,7 @@ arithmetic_operator_run(struct tableparams *p,
break;
case ARITHMETIC_TABLE_OP_DATETOSEC:
+ case ARITHMETIC_TABLE_OP_DATETOMILLISEC:
arithmetic_datetosec(p, stack, token->operator);
break;
diff --git a/bin/table/arithmetic.h b/bin/table/arithmetic.h
index c9175361..c1306c5b 100644
--- a/bin/table/arithmetic.h
+++ b/bin/table/arithmetic.h
@@ -40,6 +40,7 @@ enum arithmetic_operators
ARITHMETIC_TABLE_OP_IMGTOWCS,
ARITHMETIC_TABLE_OP_DATETOSEC,
ARITHMETIC_TABLE_OP_DISTANCEFLAT,
+ ARITHMETIC_TABLE_OP_DATETOMILLISEC,
ARITHMETIC_TABLE_OP_DISTANCEONSPHERE,
};
diff --git a/doc/gnuastro.texi b/doc/gnuastro.texi
index b1e0400e..463f688f 100644
--- a/doc/gnuastro.texi
+++ b/doc/gnuastro.texi
@@ -11826,11 +11826,12 @@ See the @option{degree-to-ra} for more on the format
of the output.
@cindex Unix epoch time
@cindex Time, Unix epoch
@cindex Epoch, Unix time
-The popped operand should be a string column in the FITS date format (most
generally: @code{YYYY-MM-DDThh:mm:ss.ddd...}).
-This operator will return the corresponding Unix epoch time (number of seconds
that have passed since 00:00:00 Thursday, January 1st, 1970).
-The returned operand will be named @code{UNIXSEC} (short for Unix-seconds).
-If any of the times have sub-second precision, the numeric datatype of the
column will be 64-bit floating point type.
-Otherwise it will be a 64-bit, signed integer, see @ref{Numeric data types}.
+Return the number of seconds from the Unix epoch time (00:00:00 Thursday,
January 1st, 1970).
+The input (popped) operand should be a string column in the FITS date format
(most generally: @code{YYYY-MM-DDThh:mm:ss.ddd...}).
+
+The returned operand will be named @code{UNIXSEC} (short for Unix-seconds) and
will be a 64-bit, signed integer, see @ref{Numeric data types}.
+If the input string has sub-second precision, it will be ignored because
floating point numbers cannot accurately store numbers with many significant
digits.
+To preserve sub-second precision, please use @code{date-to-millisec}.
For example, in the example below we are using this operator, in combination
with the @option{--keyvalue} option of the Fits program, to sort your desired
FITS files by observation date (value in the @code{DATE-OBS} keyword in example
below):
@@ -11843,6 +11844,16 @@ $ astfits *.fits --keyvalue=DATE-OBS --colinfoinstdout
\
If you don't need to see the Unix-seconds any more, you can add a
@option{-cFILENAME} (short for @option{--column=FILENAME}) at the end.
For more on @option{--keyvalue}, see @ref{Keyword inspection and manipulation}.
+
+@item date-to-millisec
+Return the number of milli-seconds from the Unix epoch time (00:00:00
Thursday, January 1st, 1970).
+The input (popped) operand should be a string column in the FITS date format
(most generally: @code{YYYY-MM-DDThh:mm:ss.ddd...}, where @code{.ddd} is the
optional sub-second component).
+
+The returned operand will be named @code{UNIXMILLISEC} (short for Unix
milli-seconds) and will be a 64-bit, signed integer, see @ref{Numeric data
types}.
+The returned value is not a floating point type because for large numbers,
floating point datatypes loose single-digit precision (which is important here).
+
+Other than the units of the output, this operator behaves similarly to
@code{date-to-sec}.
+See the description of that operator for an example.
@end table
@node Operation precedence in Table, Invoking asttable, Column arithmetic,
Table
[Prev in Thread] |
Current Thread |
[Next in Thread] |
- [gnuastro-commits] master 8837a5a8: Table: date-to-millisec is a new column arithmetic operator,
Mohammad Akhlaghi <=