[gnuastro-commits] master bfdbf04b: Table: --equal, --notequal and --colmetadata can contain comma


From: Mohammad Akhlaghi
Subject: [gnuastro-commits] master bfdbf04b: Table: --equal, --notequal and --colmetadata can contain comma
Date: Wed, 9 Feb 2022 13:00:11 -0500 (EST)

branch: master
commit bfdbf04b8fbe48065d24fb0deb65bcd17824914e
Author: Mohammad Akhlaghi <mohammad@akhlaghi.org>
Commit: Mohammad Akhlaghi <mohammad@akhlaghi.org>

    Table: --equal, --notequal and --colmetadata can contain comma
    
    Until now, the column values/information given to these options couldn't
    contain a comma. This was because a comma was used as a delimiter between
    the values!
    
    With this commit, by adding a '\' before a comma, users can use a comma in
    any of the values (not names!) of the options.
    
    This task was suggested by Zohreh Ghaffari.
---
 NEWS              |  6 ++++++
 doc/gnuastro.texi | 30 ++++++++++++++++++++++++++----
 lib/options.c     | 48 +++++++++++++++++++++++++++++++++++++++++-------
 3 files changed, 73 insertions(+), 11 deletions(-)

diff --git a/NEWS b/NEWS
index ec182c17..0d469609 100644
--- a/NEWS
+++ b/NEWS
@@ -46,6 +46,12 @@ See the end of the file for license conditions.
      by Tamara Civera Lorenzo.
 
   Table:
+   - Options that accept strings ('--colmetadata', '--equal' and
+     '--notequal') can now accept a comma within the string: to avoid
+     confusing the comma with a separator of values, you should put a '\'
+     before it. For example '--equal=AB,cd\,ef' will select all rows where
+     the 'AB' column has a value of 'cd,ef'. This task was suggested by
+     Zohreh Ghaffari.
    --catrowfile: File to concatenate (i.e., add or append) rows into the
      main input table. With this option, you can add the rows of another
      table into the final output. This option can be called multiple times,
diff --git a/doc/gnuastro.texi b/doc/gnuastro.texi
index 1f956c5e..a8ecb5da 100644
--- a/doc/gnuastro.texi
+++ b/doc/gnuastro.texi
@@ -12255,6 +12255,13 @@ For the precedence of this operation in relation to others, see @ref{Operation p
 For example @option{--equal=ID,5,6,8} will only print the rows that have a value of 5, 6, or 8 in the @code{ID} column.
 This option can also be called multiple times, so @option{--equal=ID,4,5 --equal=ID,6,7} has the same effect as @option{--equal=4,5,6,7}.
 
+@cartouche
+@noindent
+@strong{Equality and floating point numbers:} Floating point numbers are only approximate values (see @ref{Numeric data types}).
+In this context, their equality depends on how the the input table was originally stored (as a plain text table or as an ASCII/binary FITS table).
+If you want to select floating point numbers, it is strongly recommended to use the @option{--range} option and set a very small interval around your desired number, don't use @option{--equal} or @option{--notequal}.
+@end cartouche
+
 The @option{--equal} and @option{--notequal} options also work when the given column has a string type.
 In this case the given value to the option will also be parsed as a string, not as a number.
 When dealing with string columns, be careful with trailing white space characters (the actual value maybe adjusted to the right, left, or center of the column's width).
@@ -12263,9 +12270,13 @@ For example @code{--equal=NAME,"  myname "}.
 
 @cartouche
 @noindent
-@strong{Equality and floating point numbers:} Floating point numbers are only approximate values (see @ref{Numeric data types}).
-In this context, their equality depends on how the the input table was originally stored (as a plain text table or as an ASCII/binary FITS table).
-If you want to select floating point numbers, it is strongly recommended to use the @option{--range} option and set a very small interval around your desired number, don't use @option{--equal} or @option{--notequal}.
+@strong{Strings with a comma (,):} When your desired column values contain a comma, you need to put a `@code{\}' before the internal comma (within the value).
+Otherwise, the comma will be interpreted as a delimiter between multiple values, and anything after it will be interpretted as a separate string.
+For example, assume column @code{AB} of your @file{table.fits} contains this value: `@code{cd,ef}' in your desired rows.
+To extract those rows, you should use the command below:
+@example
+$ asttable table.fits --equal=AB,cd\,ef
+@end example
 @end cartouche
 
 @item -n STR,INT/FLT,...
@@ -12395,8 +12406,19 @@ After the to-be-updated column is identified, at least one other string should b
 The first string after the original name will the the selected column's new name.
 The next (optional) string will be the selected column's unit and the third (optional) will be its comments.
 If the two optional strings aren't given, the original column's units or comments will remain unchanged.
+
+If any of the values contains a comma, you should place a `@code{\}' before the comma to avoid it getting confused with a delimiter.
+For example see the command below for a column description that contains a comma:
+
+@example
+$ asttable table.fits \
+           --colmetadata=NAME,UNIT,"Comments\, with a comma"
+@end example
+
+Generally, since the comma is commonly used as a delimiter in many scenarios, to avoid complicating your future analysis with the table, it is best to avoid using a comma in the column name and units.
+
 Some examples of this option are available in the tutorials, in particular @ref{Working with catalogs estimating colors}.
-Here are some more specific examples
+Here are some more specific examples:
 
 @table @option
 
diff --git a/lib/options.c b/lib/options.c
index 1bb07205..4d3e4a3d 100644
--- a/lib/options.c
+++ b/lib/options.c
@@ -789,7 +789,7 @@ gal_options_parse_list_of_numbers(char *string, char *filename, size_t lineno)
           ++c;
           break;
 
-        /* Comma marks the transition to the next number. */
+        /* Comma or Colon mark the transition to the next number. */
         case ',':
         case ':':
           if(isnan(numerator))
@@ -899,15 +899,19 @@ gal_options_parse_list_of_numbers(char *string, char *filename, size_t lineno)
 
 
 
-
-
+/* Replacement characters for commented comma (ASCII code 14 for "Shift
+   out") or colon (ASCII code 15 for "Shift in"). These are chosen as
+   non-printable ASCII characters, that user's will not be typing. */
+#define OPTIONS_COMMENTED_COMMA 14
+#define OPTIONS_COMMENTED_COLON 15
 gal_data_t *
 gal_options_parse_list_of_strings(char *string, char *filename, size_t lineno)
 {
   size_t num;
   gal_data_t *out;
+  int needscorrection;
   gal_list_str_t *list=NULL, *tll;
-  char *cp, *token, **strarr, delimiters[]=",:";
+  char *c, *d, *cp, *token, **strarr, delimiters[]=",:";
 
   /* The nature of the arrays/numbers read here is very small, so since
      'p->cp.minmapsize' might not have been read yet, we will set it to -1
@@ -918,8 +922,27 @@ gal_options_parse_list_of_strings(char *string, char *filename, size_t lineno)
   /* If we have an empty string, just return NULL. */
   if(string==NULL || *string=='\0') return NULL;
 
-  /* Make a copy of the input string, and save the tokens */
+  /* Make a copy of the input string, remove all commented delimiters
+     (those with a preceding '\'). */
   gal_checkset_allocate_copy(string, &cp);
+  for(c=cp; *c!='\0'; c++)
+    if(*c=='\\' && c[1]!='\0')
+      {
+        /* If the next character (after the '\') is a delimiter, we need to
+           replace it with a non-delimiter (and not-typed!) character and
+           shift the whole string back by one character to simplify future
+           steps. */
+        needscorrection=0;
+        switch(c[1])
+          {
+          case ',': *c=OPTIONS_COMMENTED_COMMA; needscorrection=1; break;
+          case ':': *c=OPTIONS_COMMENTED_COLON; needscorrection=1; break;
+          }
+        if(needscorrection)
+          { for(d=c+2; *d!='\0'; ++d) {*(d-1)=*d;} *(d-1)='\0'; }
+      }
+
+  /* Make a copy of the input string, and save the tokens */
   token=strtok(cp, delimiters);
   gal_list_str_add(&list, token, 1);
   while(token!=NULL)
@@ -929,7 +952,6 @@ gal_options_parse_list_of_strings(char *string, char *filename, size_t lineno)
         gal_list_str_add(&list, token, 1);
     }
 
-
   /* Allocate the output dataset (array containing all the given
      strings). */
   num=gal_list_str_number(list);
@@ -939,7 +961,19 @@ gal_options_parse_list_of_strings(char *string, char *filename, size_t lineno)
   /* Fill the output dataset. */
   strarr=out->array;
   for(tll=list;tll!=NULL;tll=tll->next)
-    strarr[--num]=tll->v;
+    {
+      /* If we had commented delimiters, we need to set them back to their
+         original/typed forms. */
+      for(c=tll->v; *c!='\0'; ++c)
+        switch(*c)
+          {
+          case OPTIONS_COMMENTED_COMMA: *c=','; break;
+          case OPTIONS_COMMENTED_COLON: *c=':'; break;
+          }
+
+      /* Put the pointer of the string in the output array. */
+      strarr[--num]=tll->v;
+    }
 
   /* Clean up and return. Note that we don't want to free the values in the
      list, the elements in 'out->array' point to them and will later use
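
The escaping technique in the lib/options.c hunks above can be tried in isolation. The block below is a minimal, standalone sketch of the same three steps (swap each escaped delimiter for a non-printable placeholder, tokenize with strtok, then restore the placeholders). It is not the Gnuastro implementation: the function name 'split_escaped', the macro names and the plain array of strings it returns are assumptions made for illustration, whereas the real function returns a 'gal_data_t' and builds its tokens through a 'gal_list_str_t' list.

```c
#include <stdlib.h>
#include <string.h>

/* Placeholders for escaped delimiters: non-printable ASCII codes ("Shift
   out" and "Shift in"), the same values the patch uses. The macro names
   here are hypothetical. */
#define ESCAPED_COMMA 14
#define ESCAPED_COLON 15

/* Hypothetical helper (not part of Gnuastro): split 'input' on ',' or
   ':', treating "\," and "\:" as literal characters. Returns a malloc'd
   array of malloc'd strings and writes its length into '*num'. Returns
   NULL (with '*num' set to 0) on empty input or a failed allocation. The
   caller frees every string and then the array. */
char **
split_escaped(const char *input, size_t *num)
{
  size_t n=0, len;
  const char *r;
  char *w, *c, *copy, *token, **out;

  *num=0;
  if(input==NULL || *input=='\0') return NULL;

  /* There can never be more tokens than characters, so 'len' pointers
     are always enough for the short strings given to an option. */
  len=strlen(input);
  copy=malloc(len+1);
  out=malloc(len * sizeof *out);
  if(copy==NULL || out==NULL) { free(copy); free(out); return NULL; }

  /* Step 1: copy the input, collapsing every escaped delimiter (the
     backslash and the delimiter) into a single placeholder character. */
  for(r=input, w=copy; *r!='\0'; ++r)
    {
      if(r[0]=='\\' && r[1]==',')      { *w++=ESCAPED_COMMA; ++r; }
      else if(r[0]=='\\' && r[1]==':') { *w++=ESCAPED_COLON; ++r; }
      else                             *w++=*r;
    }
  *w='\0';

  /* Step 2: only the un-escaped delimiters remain, so strtok is safe. */
  for(token=strtok(copy, ",:"); token!=NULL; token=strtok(NULL, ",:"))
    {
      out[n]=malloc(strlen(token)+1);
      if(out[n]==NULL)
        {
          while(n>0) free(out[--n]);
          free(out); free(copy);
          return NULL;
        }
      strcpy(out[n], token);

      /* Step 3: put the typed characters back in place of placeholders. */
      for(c=out[n]; *c!='\0'; ++c)
        {
          if(*c==ESCAPED_COMMA)      *c=',';
          else if(*c==ESCAPED_COLON) *c=':';
        }
      ++n;
    }

  free(copy);
  *num=n;
  return out;
}
```

With this sketch, the C string "AB,cd\\,ef" (the characters AB,cd\,ef) splits into the two tokens 'AB' and 'cd,ef'. When trying the same thing on the command line, keep in mind that a POSIX shell consumes an unquoted backslash before a comma, so the program only sees the backslash when the argument is quoted, as in the --colmetadata example above.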


