[gnuastro-commits] master 40c9dc8 1/2: CRLF line terminated ASCII files
From: Mohammad Akhlaghi
Subject: [gnuastro-commits] master 40c9dc8 1/2: CRLF line terminated ASCII files acceptable for reading
Date: Tue, 22 Aug 2017 07:56:45 -0400 (EDT)
branch: master
commit 40c9dc8409b5a20ab284bbdbdb91715f3737537c
Author: Mohammad Akhlaghi <address@hidden>
Commit: Mohammad Akhlaghi <address@hidden>
CRLF line terminated ASCII files acceptable for reading
Some operating systems, like Microsoft Windows, terminate their lines with a
carriage return character followed by a new-line character (two characters),
while Unix-like operating systems use just a single new-line character. Until
now, an ASCII text file made in Windows couldn't be read.

With this commit, the line-parsing code also checks for a carriage return
character at the end of a line. This enables programs taking ASCII text
files as input to read such files even if they were made in operating
systems like Microsoft Windows.
---
NEWS | 5 +++++
doc/gnuastro.texi | 8 ++++++++
lib/txt.c | 25 +++++++++++++++++--------
3 files changed, 30 insertions(+), 8 deletions(-)
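
Note on the check used in lib/txt.c below: `*(end-2)==13' reads two
characters before the end of the string, so it relies on the line holding at
least two characters. Whether shorter lines can reach these functions is not
clear from this patch alone. As a minimal sketch only, assuming a
hypothetical helper named `strip_line_terminator' (it is not part of
Gnuastro), the same idea can be written with explicit length guards. Unlike
the patch, it leaves a line without any terminator untouched:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of Gnuastro: remove one trailing line
   terminator ("\n" or "\r\n") in place and return the new length. A line
   with no trailing new-line is left unchanged. */
size_t
strip_line_terminator(char *line)
{
  size_t len;

  assert(line != NULL);
  len = strlen(line);

  /* Only act when the line really ends with a new-line character. */
  if( len>0 && line[len-1]=='\n' )
    {
      line[--len]='\0';

      /* In a CRLF file, a carriage return ('\r', ASCII 13) comes just
         before the new-line: remove it too. */
      if( len>0 && line[len-1]=='\r' )
        line[--len]='\0';
    }

  return len;
}
```

Called on "1 2 3\r\n" or on "1 2 3\n", this leaves "1 2 3" in the buffer;
called on a line that is only "\n", it leaves an empty string without
reading before the start of the buffer.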
diff --git a/NEWS b/NEWS
index 76f1563..657325b 100644
--- a/NEWS
+++ b/NEWS
@@ -31,6 +31,11 @@ GNU Astronomy Utilities NEWS -*- outline -*-
option, the `--mcol' values of the catalog will be interpretted as total
brightness (sum of pixel values), not magnitude.
+ Library: Functions that read data from an ASCII text file
+ (`gal_txt_table_info', `gal_txt_table_read', `gal_txt_image_read') now
+ also operate on files with CRLF line terminators (for example text files
+ created in MS Windows).
+
** Removed features
** Changed features
diff --git a/doc/gnuastro.texi b/doc/gnuastro.texi
index bca8b1c..c420016 100644
--- a/doc/gnuastro.texi
+++ b/doc/gnuastro.texi
@@ -19827,6 +19827,14 @@ editor or even on the command-line. Therefore the functions in this section
are defined to simplify reading from and writing to plain text
files.
+Lines are one of the most basic building blocks (delimiters) of a text
+file. Some operating systems like Microsoft Windows, terminate their ASCII
+text lines with a carriage return character and a new-line character (two
+characters, also known as CRLF line terminators). While Unix-like operating
+systems just use a single new-line character. The functions below that read
+an ASCII text file are able to identify lines with both kinds of line
+terminators.
+
Gnuastro defines a simple format for metadata of table columns in a plain
text file that is discussed in @ref{Gnuastro text table format}. The
functions to get information from, read from and write to plain text files
diff --git a/lib/txt.c b/lib/txt.c
index 67dcd03..32f9578 100644
--- a/lib/txt.c
+++ b/lib/txt.c
@@ -296,11 +296,19 @@ txt_info_from_first_row(char *line, gal_data_t **datall, int format)
size_t n=0, maxcnum=0, numtokens;
char *token, *end=line+strlen(line);
- /* Remove the new line character from the end of the line. If the last
- column is a string, and the given length is larger than the available
- space on the line, we don't want to have the line's new-line
- character. Its better for it to actually be shorter than the space. */
- *(end-1)='\0';
+ /* Remove the line termination character(s) from the end of the line. In
+ Unix, the line terminator is just the new-line character, however, in
+ some operating systems (like MS Windows), it is two characters:
+ carriage return and new-line. To be able to deal with both, we will be
+ checking the second last character first, the ASCII code for carriage
+ return is 13.
+
+ If the last column is a string, and the given length is larger than
+ the available space on the line, we don't want to have the line's
+ new-line character. Its better for it to actually be shorter than the
+ space. */
+ if( *(end-2)==13 ) *(end-2)='\0';
+ else *(end-1)='\0';
/* Get the maximum number of columns read from the comment
information. */
@@ -768,8 +776,9 @@ txt_fill(char *line, char **tokens, size_t maxcolnum, gal_data_t *info,
int notenoughcols=0;
char *end=line+strlen(line);
- /* See explanations in `txt_info_from_row'. */
- *(end-1)='\0';
+ /* See explanations in `txt_info_from_first_row'. */
+ if( *(end-2)==13 ) *(end-2)='\0';
+ else *(end-1)='\0';
/* Start parsing the line. Note that `n' and `maxcolnum' start from
one. */
@@ -781,7 +790,7 @@ txt_fill(char *line, char **tokens, size_t maxcolnum, gal_data_t *info,
if(n>maxcolnum) break;
/* Set the pointer to the start of this token/column. See
- explanations in `txt_info_from_row'. */
+ explanations in `txt_info_from_first_row'. */
if( info[n-1].type == GAL_TYPE_STRING )
{
/* Remove any delimiters and stop at the first non-delimiter. If