octave-bug-tracker

From: Michael Leitner
Subject: [Octave-bug-tracker] [bug #54100] fread using SKIP larger than zero is extremely slow
Date: Wed, 13 Jun 2018 04:55:49 -0400 (EDT)
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Firefox/52.0

Follow-up Comment #4, bug #54100 (project octave):

Of course the actual reading of the data is buffered. Otherwise every double
would require a separate access to the hard disk, and the slowdown would
probably be closer to a factor of 10^5 than to the factor of 100 that I see
(on my computer) in the present case.

I am getting lost in this convoluted code, but it seems that it could be
simplified considerably. The reading is done in line 6623, where a buffer of
size input_buf_size is read into a newly allocated char buffer, which is then
pushed onto a list. The size of this buffer is given by input_buf_elts (line
6594), which for skip==0 is comfortably large (line 6583) or even covers the
whole file (line 6585), but otherwise is an ominous block_size (line 6588).
However, as the skipping is indeed done between the reads (line 6655), this
has to mean that block_size is necessarily equal to 1. Or am I
misunderstanding something?

Of course your very last suggestion in comment #1 would be an improvement: as
it is, for skip>0 you do two tells and three seeks per element read. It would
be better to move this out of the loop: initially do one seek to the end, get
the position, compute the number of elements to be read, seek back to the
starting position, read all the elements in a for loop, and finally position
the file pointer. Then you have just one seek per element read, and you do not
need to rely on flags set by the library.
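
To make the sequence above concrete, here is a minimal sketch using plain C
stdio rather than Octave's stream classes. The function read_with_skip and its
parameters are hypothetical names chosen for illustration, not anything from
the Octave sources, and I am assuming the skip is applied after every single
element. The file size is determined once, before the loop, so the loop itself
needs only one read and one seek per element.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <vector>

// Hypothetical helper, not Octave code: read elements of elt_size bytes from
// the current position to the end of the file, skipping `skip` bytes after
// each element. Returns the element bytes packed back to back.
std::vector<char> read_with_skip(std::FILE* f, std::size_t elt_size,
                                 std::size_t skip)
{
  assert(f != nullptr && elt_size > 0);

  // Determine the remaining file size once, outside the loop.
  const long start = std::ftell(f);
  if (start < 0 || std::fseek(f, 0, SEEK_END) != 0)
    return {};
  const long end = std::ftell(f);
  if (end < start || std::fseek(f, start, SEEK_SET) != 0)
    return {};

  // Number of complete elements; the last one may lack its trailing skip.
  const std::size_t remaining = static_cast<std::size_t>(end - start);
  const std::size_t stride = elt_size + skip;
  std::size_t n = remaining / stride;
  if (remaining % stride >= elt_size)
    ++n;

  std::vector<char> out(n * elt_size);
  std::size_t got = 0;
  for (; got < n; ++got)
    {
      // One seek (except before the first element) and one read per element.
      if (got > 0 && std::fseek(f, static_cast<long>(skip), SEEK_CUR) != 0)
        break;
      if (std::fread(out.data() + got * elt_size, 1, elt_size, f) != elt_size)
        break;
    }
  out.resize(got * elt_size);

  // Finally position the file pointer after the last skip, but not past EOF.
  const std::size_t consumed = std::min(got * stride, remaining);
  std::fseek(f, start + static_cast<long>(consumed), SEEK_SET);
  return out;
}
```

This still pays for two library calls per element, which is the cost discussed
next.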

However, you still have one read and one seek per element. Even if they are
buffered, every function call costs time. Going from one read, three seeks,
and two tells (six calls) to one read and one seek (two calls) would gain a
factor of three and not more, I would guess, reducing the present factor of
100 to about 33.

So my suggestion would be to use large buffers as with skip==0 (enlarged by
the skipped bytes) whenever skip is not too large, after line 6623 insert the
lines


// compact in place: move each element forward over the skipped bytes
for (int i = 0; i < input_buf_elts; i++)
  for (int j = 0; j < input_elt_size; j++)
    input_buf[i*input_elt_size + j] = input_buf[i*(input_elt_size + skip) + j];


delete the whole section 6635-6659, and add a final line to position the
file pointer correctly (if this should be necessary). Then reading a file of
given size to the end should take the same time whether skip==0 or not.
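
As a self-contained illustration of the compaction step (not a patch against
the Octave sources), the following sketch takes a buffer that was read in one
go, including the skipped bytes, and packs the elements to the front. The name
compact_strided and the use of std::vector are my own assumptions for the
example. It uses memmove because source and destination overlap whenever skip
is smaller than the element size.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper, not Octave code. `buf` holds raw file bytes laid out
// as [element][skip bytes][element][skip bytes]...; the final element may
// lack its trailing skip bytes. Packs the elements to the front of `buf`,
// shrinks it to fit, and returns the number of complete elements found.
std::size_t compact_strided(std::vector<char>& buf, std::size_t elt_size,
                            std::size_t skip)
{
  assert(elt_size > 0);

  const std::size_t stride = elt_size + skip;
  std::size_t n = buf.size() / stride;
  if (buf.size() % stride >= elt_size)
    ++n;

  // Element 0 is already in place; move the others forward. The destination
  // never lies beyond the source, so a front-to-back pass is safe.
  for (std::size_t i = 1; i < n; ++i)
    std::memmove(buf.data() + i * elt_size, buf.data() + i * stride, elt_size);

  buf.resize(n * elt_size);
  return n;
}
```

With this, the number of library calls depends only on the number of buffers
read, not on the number of elements.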

    _______________________________________________________

Reply to this item at:

  <http://savannah.gnu.org/bugs/?54100>

_______________________________________________
  Message sent via Savannah
  https://savannah.gnu.org/



