coreutils

Re: [coreutils] cp --parents parallel


From: Bob Proulx
Subject: Re: [coreutils] cp --parents parallel
Date: Mon, 18 Oct 2010 11:20:03 -0600
User-agent: Mutt/1.5.20 (2009-06-14)

Rob Gom wrote:
> this is my first post to this list, which I think is the best place to
> ask. I have searched web, but couldn't found an answer.

Yes.  This is a great place to discuss coreutils issues.  The
coreutils mailing list is for general discussion.  The bug-coreutils
mailing list is attached to a bug tracking system and messages there
are tracked as bug issues in the database.  Please keep the coreutils
address in the recipient list when you follow-up so that the group may
see and participate in the discussion.

> In one of my makefiles I use parallel execution. One of the targets
> executed simultaneously tries to copy files to one root directory. And
> sometimes it fails.

Thank you for the nice example script.  That makes it much easier to
understand what is happening.

> But it failed immediately with:
> $ bash ./test.sh
> cp: cp: cp: cannot make directory `/tmp/tmp.tbAF9E58QA/a'cannot make
> directory `/tmp/tmp.tbAF9E58QA/a'cannot make directory
> `/tmp/tmp.tbAF9E58QA/a': File exists
> : File exists
> : File exists
> ^C

There are race conditions any time two processes execute a sequential
list of non-atomic operations.  Here both --update and --parents are
going to be problematic.  Both of those operations stat(2) the target
and depending upon the result take different actions.  The --parents
action will report errors if it cannot create the parent directory.
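
To make the window concrete, here is a minimal sketch of the
check-then-create sequence.  It does not race for real; it simulates
the losing side by creating the directory between the check and the
plain mkdir.  The scratch directory and variable names are only for
illustration and are not taken from cp's source.

```shell
#!/bin/sh
# Hypothetical scratch directory, for illustration only.
tmpdir=$(mktemp -d) || exit 1

# Step 1: the check.  At this moment the directory does not exist.
if [ ! -d "$tmpdir/a" ]; then
  # Simulate another process winning the race right after our check.
  mkdir "$tmpdir/a"

  # Step 2: the act.  Plain mkdir now fails with "File exists".
  if mkdir "$tmpdir/a" 2>/dev/null; then
    plain_status=0
  else
    plain_status=1
  fi
fi

# mkdir -p treats an already existing directory as success.
if mkdir -p "$tmpdir/a"; then
  parents_status=0
else
  parents_status=1
fi

echo "plain mkdir: $plain_status, mkdir -p: $parents_status"
rm -rf "$tmpdir"
```

The plain mkdir fails because the check is already stale by the time
it runs, while mkdir -p succeeds.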

> I assume that all spawned processes found that there's no target
> directory and attempted to create it. But when cp first assumed that
> there's no directory
> and it was created in the meantime, it failed.

Correct.

> Is my script correct? Is cp behaviour correct? How can I avoid the
> problem in the future?

Both --update and --parents are GNU extensions.  They are not covered
by the POSIX standard.  Therefore there isn't a canonically correct
behavior.

Looking over the behavior of --parents I think that reporting an error
if it tries to make a directory that was just created is too strict.
I think it should accept that a directory that didn't exist a moment
before may be created between stat'ing for it and trying to create it.
I think the behavior should be like 'mkdir --parents'.
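
As a rough illustration of the behavior I mean, the following sketch
starts several 'mkdir -p' processes at once on paths that share the
same parents and then checks that none of them reported an error.  The
scratch directory and the variable names are assumptions for the
example.  With GNU mkdir I would expect no errors, but I have not
checked other implementations.

```shell
#!/bin/sh
# Hypothetical scratch directory, for illustration only.
tmpdir=$(mktemp -d) || exit 1

# Start four mkdir -p processes that all need to create a/ and a/b/.
pids=""
for d in c d e f; do
  mkdir -p "$tmpdir/a/b/$d" &
  pids="$pids $!"
done

# Collect the exit status of each background job.
errors=0
for pid in $pids; do
  wait "$pid" || errors=$((errors + 1))
done

# Count how many of the leaf directories actually exist.
created=0
for d in c d e f; do
  if [ -d "$tmpdir/a/b/$d" ]; then
    created=$((created + 1))
  fi
done

echo "errors=$errors created=$created"
rm -rf "$tmpdir"
```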

> I have reproduced the problem with the following bash script:
> #!/bin/bash
> while true; do
>     directory=`mktemp -d`
>     (sleep 1s; cp --update --parents a/b/c/xxx.txt $directory) &
>     (sleep 1s; cp --update --parents a/b/d/xxx.txt $directory) &
>     (sleep 1s; cp --update --parents a/b/e/xxx.txt $directory) &
>     (sleep 1s; cp --update --parents a/b/f/xxx.txt $directory) &
>     sleep 10s
>     rm -rf $directory
>     sleep 10s
> done

Personally I usually try to keep to POSIX standard behavior and so
would avoid using --update and --parents.  [I would also give sleep a
plain integer with no unit suffix such as "1s", and use $(...) instead
of `...`.]  To do this operation using only standard behavior you
would need something like this (untested) example:

#!/bin/sh
while true; do
  tmpdir=$(mktemp -d) || exit 1

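  # Create each target directory serially, before the background copy
  # starts, so that cp itself never has to create a directory.
  subdir="a/b/c"
  mkdir -p "$tmpdir/$subdir"
  (sleep 1; cp "$subdir/xxx.txt" "$tmpdir/$subdir/") &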

  subdir="a/b/d"
  mkdir -p "$tmpdir/$subdir"
  (sleep 1; cp "$subdir/xxx.txt" "$tmpdir/$subdir/") &

  subdir="a/b/e"
  mkdir -p "$tmpdir/$subdir"
  (sleep 1; cp "$subdir/xxx.txt" "$tmpdir/$subdir/") &

  subdir="a/b/f"
  mkdir -p "$tmpdir/$subdir"
  (sleep 1; cp "$subdir/xxx.txt" "$tmpdir/$subdir/") &

  sleep 3
  rm -rf "$tmpdir"
done

If you are dealing with multiple concurrent processes that all modify
the same files in the same directory then I have to ask whether that
is really the best way of doing things.  At the very least you will
need to take appropriate steps to ensure that the outcome is correct
no matter which process wins a race.
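
One possible set of such steps, as a sketch only: create all of the
directories serially, let each parallel job write only into its own
directory, and wait for every job by PID so that a failed copy is
noticed instead of being hidden behind a fixed sleep.  The src and dst
trees below are throwaway directories made up for the example.

```shell
#!/bin/sh
# Hypothetical scratch source and destination trees, for illustration.
src=$(mktemp -d) || exit 1
dst=$(mktemp -d) || exit 1

# Populate the source tree with one small file per subdirectory.
for d in c d e f; do
  mkdir -p "$src/a/b/$d" || exit 1
  echo "data $d" > "$src/a/b/$d/xxx.txt"
done

# Step 1 (serial): create every destination directory up front.
for d in c d e f; do
  mkdir -p "$dst/a/b/$d" || exit 1
done

# Step 2 (parallel): each job touches only its own directory.
pids=""
for d in c d e f; do
  cp "$src/a/b/$d/xxx.txt" "$dst/a/b/$d/" &
  pids="$pids $!"
done

# Step 3: wait for each job and remember whether any of them failed.
failed=0
for pid in $pids; do
  wait "$pid" || failed=1
done

echo "failed=$failed"
rm -rf "$src" "$dst"
```

Since no two jobs share a directory that still needs to be created,
there is nothing left for them to race on.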

Bob


