Re: new function: textscan.m
From: Ben Abbott
Subject: Re: new function: textscan.m
Date: Sun, 24 Oct 2010 01:05:18 +0800
On Oct 23, 2010, at 5:21 PM, Liam Groener wrote:
> On Oct 22, 2010, at 10:34 PM, Ben Abbott wrote:
>
>> On Oct 23, 2010, at 10:14 AM, Liam Groener wrote:
>>
>>> On Oct 22, 2010, at 6:46 PM, Ben Abbott wrote:
>>>
>>>> On Oct 23, 2010, at 8:53 AM, John W. Eaton wrote:
>>>>
>>>>> On 23-Oct-2010, Ben Abbott wrote:
>>>>>
>>>>> | I've made an attempt to implement the missing function textscan.m
>>>>> |
>>>>> | If there are no suggestions for improvement, I'll commit.
>>>>>
>>>>> + if (nargin > 2 && isnumeric (varargin{1}))
>>>>> + N = varargin{1};
>>>>>
>>>>> I think it would help to quickly understand what N is if you used
>>>>> nlines or similar instead of N. Also, we generally try to avoid
>>>>> uppercase variable names in Octave.
>>>>>
>>>>> + if ((! strcmp (class (fid), "double") || fid < 0) && ! ischar (fid))
>>>>> +   error ("textscan: first input argument must be a valid file id, or string.");
>>>>> + endif
>>>>> +
>>>>> + if (! ischar (formatstr) && ! isempty (formatstr))
>>>>> + error ("textscan: second input must be a format specification.");
>>>>> + endif
>>>>>
>>>>> Maybe I'm just slow, but I have a harder time understanding negative
>>>>> conditions like the ones above. Instead of checking the conditions
>>>>> that lead to errors, I find it simpler to write and easier to
>>>>> understand code later if I test the conditions for success instead.
>>>>> For example, instead of the above, I would write something like
>>>>>
>>>>> if (isa (fid, "double") && fid > 0 || ischar (fid))
>>>>>   if (ischar (formatstr) || isempty (formatstr))
>>>>>     ## ... code to do the real work here ...
>>>>>   else
>>>>>     error ("textscan: second input must be a format specification");
>>>>>   endif
>>>>> else
>>>>>   error ("textscan: expecting first argument to be a file id or character string");
>>>>> endif
>>>>>
>>>>> Is that condition on formatstr correct? Is it OK for it to be empty
>>>>> if it is not a character string?
>>>>>
>>>>> Note also that isa is probably better than class+strcmp. But what
>>>>> happens if fid is a matrix? Should we check for that? Should we
>>>>> maybe have a is_valid_file_id function? Maybe that would also be
>>>>> useful in other places too.
>>>>>
>>>>> jwe
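
A minimal sketch of the success-first layout suggested above, with the matrix case covered by isscalar. The function name textscan_check_args is hypothetical, and fid >= 0 is an assumption that mirrors the fid < 0 test in the patch (the rewrite above uses fid > 0). It only checks the shape of the arguments; it does not check that fid refers to an open file.

```octave
1;  # not a function file: lets the function below be defined inside a script

## Hypothetical helper, not part of the posted patch.
function textscan_check_args (fid, formatstr)
  ## Success conditions, stated positively.
  ## A char first argument is accepted as-is; a numeric one must be a
  ## non-negative double scalar (a matrix of ids is rejected).
  fid_ok = ischar (fid) || (isa (fid, "double") && isscalar (fid) && fid >= 0);
  ## Condition kept from the patch; whether an empty non-string should
  ## pass is exactly the open question raised in the review.
  fmt_ok = ischar (formatstr) || isempty (formatstr);

  if (fid_ok)
    if (fmt_ok)
      ## ... code to do the real work here ...
    else
      error ("textscan: second input must be a format specification");
    endif
  else
    error ("textscan: expecting first argument to be a file id or character string");
  endif
endfunction
```

With this layout, textscan_check_args ([3 4], "%f") raises the first-argument error because isscalar fails before fid >= 0 is ever evaluated.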
>>>>
>>> Hi Ben,
>>>
>>> I thought that, in Matlab, N is the number of times that the format string
>>> is repeated (as in textread), not the number of lines to be read. Did you
>>> intend to make this change? (Or am I all wet?)
>>> Liam
>>
>> I have never used textscan before this week. It would be wise to be skeptical
>> of my understanding of how Matlab's version works.
>>
>> Can you provide me an example that illustrates the difference between
>> repeating the format string and reading a number of lines?
>>
>> Ben
>>
> Well, I haven't used textscan either. (I don't have Matlab.) I got my
> impressions of how textscan works from a Matlab book. I modified the example
> script I sent you the other day as follows:
>
> B = [30 40 60 70 80];
> fid = fopen('myoutput','w');
> fprintf(fid,'%g miles %g kilometers\n',[B;8*B/5]);
> fclose(fid);
>
> [a,b,c,d] = textread('myoutput','%f %s',2)
>
> fid=fopen('myoutput','r');
> C = textscan(fid,'%f %s',2);
> C{1}
> C{2}
> C{3}
> C{4}
> fclose(fid);
>
> From my understanding, both the textread and textscan parts of this script
> should give more or less the same output. Note that the textread part, at
> least, reads all five lines of the file, with four values per line, with N=2.
>
> Liam G.
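
The two readings of the third argument can be made concrete without calling textscan at all. The sketch below is only an illustration under stated assumptions: it rebuilds the text that the script writes to 'myoutput' with sprintf, then shows which whitespace-separated fields each interpretation would consume for N = 2. The variable names (txt, fields_per_format, by_repeat, by_lines) are made up for this example, and it makes no claim about what Matlab actually returns.

```octave
## Same text the quoted script writes to 'myoutput'.
B = [30 40 60 70 80];
txt = sprintf ("%g miles %g kilometers\n", [B; 8*B/5]);

n = 2;                    # the third argument under discussion
fields_per_format = 2;    # "%f %s" consumes two fields per application

## Interpretation 1: n counts applications of the format string.
tokens = strsplit (strtrim (txt));            # split on any whitespace
by_repeat = tokens(1:n*fields_per_format);

## Interpretation 2: n counts lines of input.
txt_lines = strsplit (strtrim (txt), "\n");
by_lines = strsplit (strjoin (txt_lines(1:n), " "));

disp (by_repeat);   # fields consumed if the format is applied n times
disp (by_lines);    # fields consumed if n whole lines are read
```

Under the first reading only the fields of the first line are consumed (30, miles, 48, kilometers); under the second, the first two lines are.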
I found an example from the Mathworks website that does not work with the
current implementation.

fid = fopen ('grades.txt', 'w');
fprintf (fid, '%s\n', 'Student_ID | Test1 | Test2 | Test3');
fprintf (fid, '%s\n', ' 1 91.5 89.2 77.3');
fprintf (fid, '%s\n', ' 2 88.0 67.8 91.0');
fprintf (fid, '%s\n', ' 3 76.3 78.1 92.5');
fprintf (fid, '%s\n', ' 4 96.4 81.2 84.6');
fclose (fid);

fid = fopen ('grades.txt');
C_text = textscan (fid, '%s', 4, 'delimiter', '|');
C_data0 = textscan (fid, '%d %f %f %f');
frewind (fid);
C_text = textscan (fid, '%s', 4, 'delimiter', '|');
C_data1 = textscan (fid, '%d %f %f %f', 'CollectOutput', 1);
fclose (fid);

The proper result is ...

C_text = {'Student_ID', 'Test1', 'Test2', 'Test3'};
C_data0 = {[1;2;3;4], [91.5;88.0;76.3;96.4], [89.2;67.8;78.1;81.2], [77.3;91.0;92.5;84.6]};
C_data1 = {[1;2;3;4], [[91.5;88.0;76.3;96.4], [89.2;67.8;78.1;81.2], [77.3;91.0;92.5;84.6]]};

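
The difference between C_data0 and C_data1 is that 'CollectOutput' merges adjacent columns of the same class into one matrix. A rough sketch of that merging step, applied to the values above, might look like the following. It assumes the %d column comes back as int32 (as Matlab documents for %d), which is why it stays separate from the three double columns; the loop is only an illustration of the grouping, not how either implementation does it.

```octave
## Columns as returned without CollectOutput (values from the example).
C_data0 = {int32([1;2;3;4]), [91.5;88.0;76.3;96.4], ...
           [89.2;67.8;78.1;81.2], [77.3;91.0;92.5;84.6]};

## Merge runs of adjacent columns that share a class.
C_data1 = {};
for k = 1:numel (C_data0)
  col = C_data0{k};
  if (! isempty (C_data1) && strcmp (class (C_data1{end}), class (col)))
    C_data1{end} = [C_data1{end}, col];   # same class: append as a matrix column
  else
    C_data1{end+1} = col;                 # class changed: start a new cell
  endif
endfor

disp (size (C_data1));      # number of cells after merging
disp (size (C_data1{2}));   # the three double columns as one matrix
```

The result has two cells: the int32 id column and a 4-by-3 double matrix of test scores.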
I'll have to give some thought to how to handle this. Any advice would be
appreciated.

Ben