bug-coreutils

Threaded versions of cp, mv, ls for high latency / parallel filesystems?


From: Andrew McGill
Subject: Threaded versions of cp, mv, ls for high latency / parallel filesystems?
Date: Sat, 8 Nov 2008 18:42:37 +0200
User-agent: KMail/1.9.9

Greetings coreutils folks,

There are a number of interesting filesystems (glusterfs, lustre? ... NFS) 
which could benefit from userspace utilities doing certain operations in 
parallel.  (I have a very slow glusterfs installation that makes me think 
that some things can be done better.)

For example, copying a number of files is currently done in series ...
        cp a b c d e f g h dest/
but, on certain filesystems, it would be roughly twice as efficient if 
implemented in two parallel threads, something like:
        cp a c e g dest/ &
        cp b d f h dest/
since the source and destination files can be stored on multiple physical 
volumes.  
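
A rough sketch of the same split for an arbitrary file list, assuming GNU xargs (for -0, -n and -P) and GNU cp (for -t). The scratch directory and the file names are invented for illustration, and whether this is actually faster depends entirely on the filesystem underneath:

```shell
#!/bin/sh
# Sketch only: run two cp processes at once over one file list.
# Assumes GNU xargs (-0, -n, -P) and GNU cp (-t); all paths are made up.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
mkdir "$work/src" "$work/dest"
for f in a b c d e f g h; do
    echo "data $f" > "$work/src/$f"
done

# -n 4 gives each cp four files; -P 2 keeps two cp processes running.
# The list is split as (a b c d) and (e f g h) rather than alternately.
printf '%s\0' "$work"/src/* |
    xargs -0 -n 4 -P 2 cp -t "$work/dest"

copied=$(ls "$work/dest" | wc -l)
echo "copied $copied files"
```

This uses separate processes rather than threads inside cp, so it only approximates what a threaded cp might do.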

Similarly, ls -l . will readdir(), and then stat() each file in the directory.  
On a filesystem with high latency, it would be faster to issue the stat() 
calls asynchronously, and in parallel, and then collect the results for 
display.  (This could improve performance for NFS, in proportion to the 
latency and the number of threads.)
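
The stat() idea can be approximated from the shell today, again assuming GNU xargs and GNU stat (for --format). This is a sketch with an invented scratch directory, not a description of how ls works internally: several stat processes run at once, and the results are collected and sorted afterwards for display.

```shell
#!/bin/sh
# Sketch only: stat directory entries with several processes in flight,
# then sort the collected lines. Assumes GNU find, xargs and stat.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
for i in 1 2 3 4 5 6 7 8 9 10; do
    echo "x" > "$work/file$i"
done

# -n 2 hands each stat two names; -P 4 allows four stat processes at once.
# Results arrive in no particular order, hence the sort on the name field.
listing=$(find "$work" -mindepth 1 -maxdepth 1 -print0 |
    xargs -0 -n 2 -P 4 stat --format='%s %n' |
    sort -k 2)
echo "$listing"
```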


Question:  Is there already a set of "improved" utilities that implement this 
kind of technique?  If not, would this kind of performance enhancement be 
considered useful?  (It would mean introducing threading into programs which 
are currently single-threaded.)


To the user, it could look very much the same ...
        export GNU_COREUTILS_THREADS=8
        cp   # manipulate multiple files simultaneously
        mv   # manipulate multiple files simultaneously
        ls   # stat() multiple files simultaneously
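
Until something like that exists, a wrapper function can honour the same variable. Here pcp is a hypothetical name, GNU_COREUTILS_THREADS is only the variable proposed above (coreutils does not read it), and GNU xargs and GNU cp are assumed:

```shell
#!/bin/sh
# Sketch only. pcp is a hypothetical wrapper, not part of coreutils.
# usage: pcp FILE... DEST_DIR
pcp() {
    # Walk the arguments so that dest ends up holding the last one.
    for dest in "$@"; do :; done
    n=$(( $# - 1 ))
    i=0
    # Emit every argument except the last, NUL-separated, for xargs.
    for f in "$@"; do
        i=$((i + 1))
        [ "$i" -le "$n" ] && printf '%s\0' "$f"
    done | xargs -0 -r -n 1 -P "${GNU_COREUTILS_THREADS:-1}" cp -t "$dest"
}

# Demonstration in a scratch directory (paths are made up).
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
mkdir "$work/dest"
for f in a b c d; do echo "$f" > "$work/$f"; done

export GNU_COREUTILS_THREADS=4
pcp "$work/a" "$work/b" "$work/c" "$work/d" "$work/dest"
ls "$work/dest"
```

With the variable unset, the wrapper falls back to one cp at a time, which keeps the default behaviour serial.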

One could also optimise the text utilities like cat by doing the open() and 
stat() operations in parallel and in the background -- userspace read-ahead 
caching.  All of the utilities which process multiple files could get 
small speed boosts from this -- rm, cat, chown, chmod ... even tail, head, 
wc -- but probably only on network filesystems.
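
A crude userspace read-ahead for cat might look like the sketch below. warmcat is a hypothetical helper, GNU xargs is assumed, and any benefit depends on the filesystem caching what the background readers fetch; I have not measured it.

```shell
#!/bin/sh
# Sketch only. warmcat is a hypothetical helper, not an existing utility.
warmcat() {
    # Background readers touch each file, up to four at a time, and
    # discard the data; the hope is that the cache is warm afterwards.
    printf '%s\0' "$@" | xargs -0 -r -n 1 -P 4 cat > /dev/null &
    # The foreground cat still emits the files in argument order.
    cat -- "$@"
    # Reap the background readers before returning.
    wait
}

# Demonstration in a scratch directory (paths are made up).
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
echo one > "$work/1"
echo two > "$work/2"
echo three > "$work/3"

out=$(warmcat "$work/1" "$work/2" "$work/3")
echo "$out"
```

Every file is read twice here, so on a local disk this is likely to be a loss rather than a gain.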

&:-)



