
[Bug-cpio] fallocate() calls in cpio?


From: Phil Karn
Subject: [Bug-cpio] fallocate() calls in cpio?
Date: Fri, 19 Nov 2010 04:04:30 -0800
User-agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US; rv:1.9.2.12) Gecko/20101027 Thunderbird/3.1.6

Has anyone considered having cpio call fallocate() when running on
Linux? This is a fairly new system call that pre-allocates file space in
those file systems that support it. ext2 and ext3 do not, but ext4 and
xfs and some others do.

Both ext4 and xfs are extent-based filesystems, meaning that each file
is stored in one or more contiguous regions called extents. (ext2 and
ext3 exhaustively list every disk block in a file, whether the blocks
are contiguous or not.)

There are significant performance advantages to storing each file in a
single extent, and ext4 and xfs try very hard to do so. But this is
difficult when the file system doesn't know in advance how big the file
will be, especially when the system is close to full and free space is
highly fragmented. The fallocate() call was therefore added for the
optional use of applications that do know in advance how big a file they
will write. An archive extractor like cpio -i does know how big each
file it creates will be.

In Linux, calling fallocate() on a file system that doesn't support it
does no harm: the call fails with EOPNOTSUPP and the file is left
untouched. So I recommend that cpio always call it as each file is
extracted, whether or not the file system supports it, and ignore that
error. I also recommend the FALLOC_FL_KEEP_SIZE flag, which allocates
space without changing the size of the file until it is actually
written.
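
To make the suggestion concrete, here is a minimal sketch of what such
a call might look like. This is not taken from cpio or from any actual
patch; the helper name preallocate_hint() and its return convention are
assumptions made for illustration. Only fallocate(),
FALLOC_FL_KEEP_SIZE and the errno values are real Linux interfaces.

```c
#define _GNU_SOURCE          /* exposes fallocate() and FALLOC_FL_KEEP_SIZE */
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>

/* Hypothetical helper (not part of cpio): ask the file system to
   reserve `size` bytes for the file open on `fd` before any data is
   written. Returns 0 if the space was reserved or the hint could be
   skipped safely, and -1 with errno set on a real failure such as
   ENOSPC or EBADF. */
static int preallocate_hint(int fd, off_t size)
{
    if (size <= 0)
        return 0;                       /* nothing to reserve */

    /* KEEP_SIZE reserves blocks but leaves the file size unchanged,
       so the size only grows as the data is actually written. */
    if (fallocate(fd, FALLOC_FL_KEEP_SIZE, 0, size) == 0)
        return 0;

    /* File system (EOPNOTSUPP) or kernel (ENOSYS) lacks support:
       treat the call as an optional hint and carry on. */
    if (errno == EOPNOTSUPP || errno == ENOSYS)
        return 0;

    return -1;
}
```

An extractor would presumably call this once, right after creating each
output file and before copying the data, passing the size recorded in
the archive header. Whether a failure such as ENOSPC should abort the
extraction early or just be logged is a design choice left open here.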

There is a related library call, posix_fallocate(), that may be present
even when the underlying file system doesn't have a native allocate
call. In that case, it merely writes the specified number of zeroes to
the file so that later overwrites with real data cannot fail due to lack
of disk space, and the file size is changed. I do not recommend using
posix_fallocate() as it could slow things down considerably while
providing no real benefit.

I'm running local test versions of cpio, tar and rsync with fallocate()
calls added, and they seem to work as expected.

Comments?

Phil
