[gnuastro-commits] master 40c9dc8 1/2: CRLF line terminated ASCII files


From: Mohammad Akhlaghi
Subject: [gnuastro-commits] master 40c9dc8 1/2: CRLF line terminated ASCII files acceptable for reading
Date: Tue, 22 Aug 2017 07:56:45 -0400 (EDT)

branch: master
commit 40c9dc8409b5a20ab284bbdbdb91715f3737537c
Author: Mohammad Akhlaghi <address@hidden>
Commit: Mohammad Akhlaghi <address@hidden>

    CRLF line terminated ASCII files acceptable for reading
    
    Some operating systems, like Microsoft Windows, terminate their lines with
    two characters: a carriage return followed by a new-line. Unix-like
    operating systems use a single new-line character. Until now, an ASCII
    text file made in Windows therefore couldn't be read.
    
    With this commit, the line-parsing code also checks for a carriage return
    character at the end of a line. This enables programs taking ASCII text
    files as input to read such files even if they were made in operating
    systems like Microsoft Windows.
---
 NEWS              |  5 +++++
 doc/gnuastro.texi |  8 ++++++++
 lib/txt.c         | 25 +++++++++++++++++--------
 3 files changed, 30 insertions(+), 8 deletions(-)

diff --git a/NEWS b/NEWS
index 76f1563..657325b 100644
--- a/NEWS
+++ b/NEWS
@@ -31,6 +31,11 @@ GNU Astronomy Utilities NEWS                          -*- outline -*-
   option, the `--mcol' values of the catalog will be interpretted as total
   brightness (sum of pixel values), not magnitude.
 
+  Library: Functions that read data from an ASCII text file
+  (`gal_txt_table_info', `gal_txt_table_read', `gal_txt_image_read') now
+  also operate on files with CRLF line terminators (for example text files
+  created in MS Windows).
+
 ** Removed features
 
 ** Changed features
diff --git a/doc/gnuastro.texi b/doc/gnuastro.texi
index bca8b1c..c420016 100644
--- a/doc/gnuastro.texi
+++ b/doc/gnuastro.texi
@@ -19827,6 +19827,14 @@ editor or even on the command-line. Therefore the functions in this section
 are defined to simplify reading from and writing to plain text
 files.
 
+Lines are one of the most basic building blocks (delimiters) of a text
+file. Some operating systems like Microsoft Windows, terminate their ASCII
+text lines with a carriage return character and a new-line character (two
+characters, also known as CRLF line terminators). While Unix-like operating
+systems just use a single new-line character. The functions below that read
+an ASCII text file are able to identify lines with both kinds of line
+terminators.
+
 Gnuastro defines a simple format for metadata of table columns in a plain
 text file that is discussed in @ref{Gnuastro text table format}. The
 functions to get information from, read from and write to plain text files
diff --git a/lib/txt.c b/lib/txt.c
index 67dcd03..32f9578 100644
--- a/lib/txt.c
+++ b/lib/txt.c
@@ -296,11 +296,19 @@ txt_info_from_first_row(char *line, gal_data_t **datall, int format)
   size_t n=0, maxcnum=0, numtokens;
   char *token, *end=line+strlen(line);
 
-  /* Remove the new line character from the end of the line. If the last
-     column is a string, and the given length is larger than the available
-     space on the line, we don't want to have the line's new-line
-     character. Its better for it to actually be shorter than the space. */
-  *(end-1)='\0';
+  /* Remove the line termination character(s) from the end of the line. In
+     Unix, the line terminator is just the new-line character, however, in
+     some operating systems (like MS Windows), it is two characters:
+     carriage return and new-line. To be able to deal with both, we will be
+     checking the second last character first, the ASCII code for carriage
+     return is 13.
+
+     If the last column is a string, and the given length is larger than
+     the available space on the line, we don't want to have the line's
+     new-line character. Its better for it to actually be shorter than the
+     space. */
+  if( *(end-2)==13 ) *(end-2)='\0';
+  else               *(end-1)='\0';
 
   /* Get the maximum number of columns read from the comment
      information. */
@@ -768,8 +776,9 @@ txt_fill(char *line, char **tokens, size_t maxcolnum, gal_data_t *info,
   int notenoughcols=0;
   char *end=line+strlen(line);
 
-  /* See explanations in `txt_info_from_row'. */
-  *(end-1)='\0';
+  /* See explanations in `txt_info_from_first_row'. */
+  if( *(end-2)==13 ) *(end-2)='\0';
+  else               *(end-1)='\0';
 
   /* Start parsing the line. Note that `n' and `maxcolnum' start from
      one. */
@@ -781,7 +790,7 @@ txt_fill(char *line, char **tokens, size_t maxcolnum, gal_data_t *info,
       if(n>maxcolnum) break;
 
       /* Set the pointer to the start of this token/column. See
-         explanations in `txt_info_from_row'. */
+         explanations in `txt_info_from_first_row'. */
       if( info[n-1].type == GAL_TYPE_STRING )
         {
           /* Remove any delimiters and stop at the first non-delimiter. If
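
Editorial sketch (not part of the commit): the check added above, `*(end-2)==13', assumes the line holds at least two characters, and `*(end-1)='\0'' assumes it ends in a new-line. Whether the callers in lib/txt.c guarantee this is not visible in this diff. The stand-alone helper below shows one way the same LF/CRLF stripping could be written with explicit length checks. The name `strip_line_terminator' is hypothetical and does not exist in Gnuastro.

```c
#include <string.h>

/* Hypothetical helper, NOT part of Gnuastro: remove a trailing "\n" or
   "\r\n" from `line' in place and return the new string length. Lines
   without any terminator are left unchanged. */
static size_t
strip_line_terminator(char *line)
{
  size_t len=strlen(line);

  /* Drop the new-line character if the line ends with one. */
  if( len>0 && line[len-1]=='\n' ) line[--len]='\0';

  /* In CRLF files a carriage return (ASCII 13) precedes the new-line;
     drop it too. The length test keeps us inside the buffer even for
     empty or one-character lines. */
  if( len>0 && line[len-1]=='\r' ) line[--len]='\0';

  return len;
}
```

Called on "1 2.5 abc\n" or on "1 2.5 abc\r\n", the helper leaves "1 2.5 abc" in the buffer in both cases. One difference from the committed code: a lone carriage return at the end of a line with no new-line after it is also removed here, which may or may not be desirable for a given input.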


