Re: better buffer size for copy

From: Phillip Susi
Subject: Re: better buffer size for copy
Date: Mon, 21 Nov 2005 00:45:40 -0500
User-agent: Mozilla Thunderbird 1.0.7 (X11/20051010)

What would such network filesystems report as their block size? I have a feeling it isn't going to be on the order of a MB. At least for local filesystems, the ideal transfer block size is going to be quite a bit larger than the filesystem block size (if the filesystem is even block oriented; think reiser4 or cramfs).

Network filesystems should be performing readahead in the background between small block copies. As long as the copy program isn't blocked elsewhere for long periods, say in the write to the destination, the readahead mechanism should keep the pipeline full. Up to a point, larger block sizes save some CPU by lowering the number of system calls. Past that point, the copy program can spend enough time in each write that the readahead stops and the pipeline stalls.

If you want really fast copies of large files, then you want to send down multiple overlapped AIO (real AIO, not the glibc threaded implementation) O_DIRECT reads and writes, but that gets quite complicated. Simply using blocking O_DIRECT reads into a memory-mapped destination file buffer performs nearly as well, provided you use a decent block size. On my system I have found that buffers of 128 KB or more are needed to keep the pipeline full, because I'm using a two-disk RAID0 with a 64 KB stripe size, so a full stripe spans 128 KB. Blocks smaller than 128 KB only keep one disk going at a time. That's probably getting a bit too complicated for this conversation, though.

If we are talking about the conventional blocking cached read followed by a blocking cached write, then I think you will find that a buffer size of several pages (say 32 or 64 KB) will be MUCH more efficient than 1024 bytes (the typical local filesystem block size), so using st_blksize for the size of the read/write buffer is not good. I think you may be ascribing meaning to st_blksize that is not there.
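
For what it's worth, here is a minimal sketch of the conventional blocking read/write loop with a multi-page buffer. The names choose_buf_size, copy_fd and MIN_COPY_BUF are made up for illustration, and the 64 KB floor is just the figure suggested above, not a measured optimum. Treating st_blksize as a lower-bound hint (so a filesystem that reports something larger still gets its way) is an assumption on my part, not something either of us has tested.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumed floor for the copy buffer: the upper end of the
 * 32-64 KB range suggested above, not a measured value. */
#define MIN_COPY_BUF ((size_t)64 * 1024)

/* Hypothetical helper: use st_blksize only as a hint, and never
 * go below MIN_COPY_BUF. */
static size_t choose_buf_size(const struct stat *st)
{
    size_t hint = st->st_blksize > 0 ? (size_t)st->st_blksize : 0;
    return hint > MIN_COPY_BUF ? hint : MIN_COPY_BUF;
}

/* Copy everything from src_fd to dst_fd using blocking, cached
 * read() and write(). Returns 0 on success, -1 on error with
 * errno set. */
static int copy_fd(int src_fd, int dst_fd)
{
    struct stat st;
    if (fstat(src_fd, &st) < 0)
        return -1;

    size_t buf_size = choose_buf_size(&st);
    assert(buf_size >= MIN_COPY_BUF);

    char *buf = malloc(buf_size);
    if (buf == NULL)
        return -1;

    int rc = 0;
    while (rc == 0) {
        ssize_t n = read(src_fd, buf, buf_size);
        if (n == 0)
            break;                      /* end of file */
        if (n < 0) {
            if (errno == EINTR)
                continue;               /* interrupted, retry the read */
            rc = -1;
            break;
        }

        /* write() may accept fewer bytes than requested, so loop
         * until the whole chunk has been written. */
        char *p = buf;
        while (n > 0) {
            ssize_t w = write(dst_fd, p, (size_t)n);
            if (w < 0) {
                if (errno == EINTR)
                    continue;           /* interrupted, retry the write */
                rc = -1;
                break;
            }
            p += w;
            n -= w;
        }
    }

    int saved_errno = errno;            /* free() may clobber errno */
    free(buf);
    errno = saved_errno;
    return rc;
}
```

With a source whose st_blksize is 1024, this reads and writes in 64 KB chunks; with a filesystem that reports several MB, it uses that larger value instead.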

Robert Latham wrote:

In local file systems, I'm sure you are correct.  If you are working
with a remote file system, however, the optimal size is on the order
of megabytes, not kilobytes.  For a specific example, consider the
PVFS2 file system, where the plateau in "blocksize vs. bandwidth" is
two orders of magnitude larger than 64 KB.  PVFS2 is a parallel file
system for Linux clusters.  I am not nearly as familiar with Lustre,
GPFS, or GFS, but I suspect those filesystems too would benefit from
block sizes larger than 64 KB.

Are you taking umbrage at the idea of using st_blksize to direct how
large the transfer size should be for I/O?  I don't know what other
purpose st_blksize should have, nor are there any other fields which
are remotely valid for that purpose.

Thanks for your feedback.
==rob