Re: [coreutils] RE: cp command performance

From: Bob Proulx
Subject: Re: [coreutils] RE: cp command performance
Date: Thu, 23 Dec 2010 10:44:45 -0700
User-agent: Mutt/1.5.20 (2009-06-14)

Hemant Rumde wrote:
> Lets discuss on "cp A1 A1.bk". Correct me if I am wrong. 
> In this cp, OS needs to cache all A1.bk data blocks from storage
> to overwrite with A1 block. I guess, some time would be 
> utilized for this.  

The program opens the source file for reading (with O_RDONLY) and
opens the destination file for writing (with O_WRONLY|O_TRUNC).  If
there is an existing destination file then the operating system will
truncate the destination, reducing the entire file size to zero.
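
As a rough illustration of that open/truncate/copy sequence, here is a
minimal sketch in Python using the same flags through the os module.
The function name copy_like_cp and the 128 KiB default block size are
made up for the example, and O_CREAT is added as an assumption so that
a missing destination also works.  It is not the actual cp source.

```python
import os

def copy_like_cp(src, dst, block_size=128 * 1024):
    """Copy src to dst, truncating any existing dst at open time."""
    src_fd = os.open(src, os.O_RDONLY)
    try:
        # O_TRUNC reduces an existing destination to zero length.
        # O_CREAT is an assumption added so a new destination works too.
        dst_fd = os.open(dst, os.O_WRONLY | os.O_TRUNC | os.O_CREAT, 0o644)
        try:
            while True:
                buf = os.read(src_fd, block_size)
                if not buf:
                    break  # end of source file
                # os.write may write fewer bytes than requested, so loop.
                view = memoryview(buf)
                while len(view) > 0:
                    written = os.write(dst_fd, view)
                    view = view[written:]
        finally:
            os.close(dst_fd)
    finally:
        os.close(src_fd)
```

Calling copy_like_cp("A1", "A1.bk") on an existing A1.bk discards the
old contents at open time, before any new data is written.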

Internally all of the blocks will need to be returned to free space.
This may be done immediately or it may be queued for later garbage
collection; which one is an internal implementation detail of the
filesystem in the kernel.  The same thing happens when the file is
removed with 'rm'.  Removing and freeing the disk blocks on a large
file may take a significant amount of time on some filesystems.  As I
recall ext3 in particular takes some time to do this operation.

With the additional information that you are using a networked
fileserver I would try to benchmark how long removing a large file
takes.  If it takes a long time then doing that in the background may
improve the overall time.  Of course if that operation is fast for you
already then it shouldn't be optimized further.
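
One way to measure that is sketched below in Python.  The helper name
time_unlink and its parameters are hypothetical.  It writes a scratch
file of the requested size in the directory you want to test (for
example a directory on the NFS mount), then times only the unlink
call.  Note that on some filesystems the block freeing may be deferred,
so the number may understate the real cost.

```python
import os
import tempfile
import time

def time_unlink(directory, size_bytes, chunk=1024 * 1024):
    """Create a scratch file of size_bytes in directory, time its removal."""
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        block = b"\0" * chunk
        remaining = size_bytes
        while remaining > 0:
            # Write real data so that disk blocks are actually allocated.
            written = os.write(fd, block[:min(chunk, remaining)])
            remaining -= written
        os.fsync(fd)  # push the data out before timing the removal
    finally:
        os.close(fd)
    start = time.perf_counter()
    os.unlink(path)
    return time.perf_counter() - start
```

For example, time_unlink("/path/on/nfs", 10 * 1024**3) would report the
seconds spent removing a 10G scratch file; the path is a placeholder.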

At this point I would like to make a side note.  If any other process
on the system has an open file handle on the file then the file will
have a non-zero reference count.  When overwriting the file with
'open("dst",O_WRONLY|O_TRUNC)' this file will be truncated to zero and
all disk blocks freed.  But when removing the file as long as there is
a non-zero reference count the file will not be removed.  The
filesystem will only remove the file after the last file handle to it
has been closed.  This often confuses people who have a large log file
and then remove the file expecting to free disk space but find that
the disk space is still in use until they kill the daemon.  (That
killing of the daemon happens at reboot leading some people to believe
that you must reboot but in reality you just need to cause the file to
be closed.)  When doing this over NFS it gets messy since it depends
upon which operations are done on which clients.  This is one source
of those .nfs* files.  That isn't precisely what is happening here but
worth the note.
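
The open file handle behavior is easy to see in a small experiment.
This is a sketch assuming a POSIX system and a local filesystem; the
function name is made up for the example.  The name disappears from
the directory at once but the data stays readable through the open
descriptor until it is closed.

```python
import os
import tempfile

def unlink_while_open():
    """Remove a file that is still open and read it through the handle."""
    fd, path = tempfile.mkstemp()
    os.write(fd, b"log data")
    os.unlink(path)                      # the directory entry goes away
    name_gone = not os.path.exists(path)
    links = os.fstat(fd).st_nlink        # link count as seen via the handle
    os.lseek(fd, 0, os.SEEK_SET)
    data = os.read(fd, 100)              # contents are still there
    os.close(fd)                         # only now can the blocks be freed
    return name_gone, links, data
```

On a local filesystem I would expect the link count to be reported as
zero here.  Over NFS the client may rename the file to a .nfs* name
instead, so the count can differ.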

> However, if A1.bk is new, then it would take free data 
> Blocks from super block. I guess, this should be faster.

I can't convince myself which would be faster.  I think when there is
a very large amount of disk space to be copied such as your 60G
(larger than most system ram buffer cache, although "HP storage"
doesn't really bound it) then the total time will be dominated by the
time needed to copy that data.  I think small effects will be
overwhelmed and insignificant.

Also if you are copying over NFS then external influences of the
network will have additional effects.  The switches and routers
between will be involved.  I have seen block size cause a large
difference depending upon the network hardware.

> Apart from this, read/write hits can make some difference
> in performance. When you use dd, I guess most of your data
> would be in buffer-cache and read-hit rate would be more

With dd you can set a different block size.  You might find that a
particular block size will have significantly better results when
copying across the network and NFS to remote storage.
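
A block size comparison could be sketched like this in Python.  The
function names and the list of sizes are only assumptions for the
example.  It copies the same source several times with unbuffered
reads and writes of each size and records the elapsed time, including
an fsync so that cached writes are counted.

```python
import os
import time

def time_copy(src, dst, block_size):
    """Copy src to dst in block_size chunks and return elapsed seconds."""
    start = time.perf_counter()
    # buffering=0 gives raw file objects so block_size is what is issued.
    with open(src, "rb", buffering=0) as fin, \
         open(dst, "wb", buffering=0) as fout:
        while True:
            buf = fin.read(block_size)
            if not buf:
                break
            view = memoryview(buf)
            while len(view) > 0:
                # Raw writes may be partial, so loop until all is written.
                written = fout.write(view)
                view = view[written:]
        os.fsync(fout.fileno())
    return time.perf_counter() - start

def compare_block_sizes(src, dst, sizes=(4096, 65536, 1048576)):
    """Return a dict mapping each block size to its copy time."""
    return {bs: time_copy(src, dst, bs) for bs in sizes}
```

The read side may be served from the buffer cache after the first
pass, so for a fair comparison use a source larger than RAM or repeat
the runs in different orders.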

Personally I think unless there was a very good reason (such as a
large performance difference verified by benchmark) I would use the
simple copy without doing anything special.  But for 60G I would use an
optimization if the benchmarks found a better performing solution.

> And very few calls would go to backend storage. 

I didn't understand this comment.

> Does this make any sense?

I think you should run some benchmarks in your environment.  I know
the mailing list would be interested in your findings.

