bug-coreutils
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: FYI: bug-fix: cp would fail to write through a dangling symlink


From: Paul Eggert
Subject: Re: FYI: bug-fix: cp would fail to write through a dangling symlink
Date: Thu, 14 Jun 2007 11:43:10 -0700
User-agent: Gnus/5.110006 (No Gnus v0.6) Emacs/21.4 (gnu/linux)

Jim Meyering <address@hidden> writes:

> Thanks for taking the time.
> At first glance this looks good.
> As you say, it'll take some time (and more tests)
> to do this properly.

It turns out the existing tests already catch problems in this area,
it's just that I think the tests are buggy....

Here's a proposed patch for all this.  The basic idea is to use 'stat'
on a destination file only if we plan to copy to it as a regular file;
if we plan to do anything else with it (e.g., remove or rename it, or
create it as a directory or special file) we use 'lstat'.

This causes the behavior of 'cp' to change in some fairly obscure
cases that are covered in the tests.  So this proposed patch changes
the tests to match.

It also tries to document this issue better; it is a bit of a hairy
one.  To some extent the doc fix was the hardest part of the patch....

2007-06-14  Paul Eggert  <address@hidden>

        * NEWS: "cp" no longer considers a destination symlink to be the
        same as the referenced file when copying links or making backups.
        * src/copy.c (copy_reg): When following a symlink, used the
        followed name in later chown etc. requests, so that the created
        file is affected, rather than the symlink.  Use O_NOFOLLOW on
        source when not dereferencing symlinks; this avoids a race.
        Preserve errno correctly when doing multiple open attempts on the
        destination.
        (copy_internal): Follow destination symlinks only when copying a
        regular file and only when we don't intend to remove or rename the
        destination first, regardless of whether following source
        symlinks; this is because since POSIX and tradition (e.g.,
        FreeBSD) says we should ordinarily follow destination symlinks if
        the system calls would ordinarily do so.
        * src/copy.h (struct cp_options): Add comment that 'dereference'
        is only for source files.
        * src/cp.c (usage): Note that --derereference etc. are only for
        source files.
        (make_dir_parents_private): Follow symlinks, regardless of whether
        --dereference is specified, because these are destination symlinks.
        * tests/cp/same-file: Adjust tests to match revised behavior.
        Filter out perror output since it might vary from host to host.
        Use sed alone instead of also using echo.

2007-06-14  Paul Eggert  <address@hidden>

        * doc/coreutils.texi (cp invocation): Document the behavior better when
        the destination is a symlink.  Clarify source versus destination
        symlinks.  Describe the new behavior for destination symlinks.

diff --git a/NEWS b/NEWS
index 9db7902..f1b81f0 100644
--- a/NEWS
+++ b/NEWS
@@ -18,7 +18,12 @@ GNU coreutils NEWS                                    -*- 
outline -*-
 ** Bug fixes

   cp no longer fails to write through a dangling symlink
-  [bug introduced in coreutils-6.7]
+  [bug introduced in coreutils-6.7].  Also, 'cp' no longer considers a
+  destination symlink to be the same as the referenced file when
+  copying links or making backups.  For example, if SYM is a symlink
+  to FILE, "cp -l FILE SYM" now reports an error instead of silently
+  doing nothing.  The behavior of 'cp' is now better documented when
+  the destination is a symlink.

   cut now diagnoses a range starting with zero (e.g., -f 0-2) as invalid;
   before, it would treat it as if it started with 1 (-f 1-2).
diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index 4ba2fb9..eb6691d 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -6909,13 +6909,22 @@ By default, @command{cp} does not copy directories.  
However, the
 copy recursively by descending into source directories and copying files
 to corresponding destination directories.

-By default, @command{cp} follows symbolic links only when not copying
+When copying from a symbolic link, @command{cp} normally follows the
+link only when not copying
 recursively.  This default can be overridden with the
 @option{--archive} (@option{-a}), @option{-d}, @option{--dereference}
 (@option{-L}), @option{--no-dereference} (@option{-P}), and
 @option{-H} options.  If more than one of these options is specified,
 the last one silently overrides the others.

+When copying to a symbolic link, @command{cp} normally follows the
+link when creating or copying to a regular file, even if the link is
+dangling.  However, @command{cp} does not follow these links when
+creating directories or special files.  Also, when an option like
address@hidden or @option{--link} acts to rename or remove the
+destination before copying, @command{cp} renames or removes the
+symbolic link rather than the file it points to.
+
 By default, @command{cp} copies the contents of special files only
 when not copying recursively.  This default can be overridden with the
 @option{--copy-contents} option.
@@ -6995,10 +7004,10 @@ Equivalent to @option{--no-dereference 
--preserve=links}.
 @opindex --force
 When copying without this option and an existing destination file cannot
 be opened for writing, the copy fails.  However, with @option{--force}),
-when a destination file cannot be opened, @command{cp} then unlinks it and
+when a destination file cannot be opened, @command{cp} then removes it and
 tries to open it again.  Contrast this behavior with that enabled by
 @option{--link} and @option{--symbolic-link}, whereby the destination file
-is never opened but rather is unlinked unconditionally.  Also see the
+is never opened but rather is removed unconditionally.  Also see the
 description of @option{--remove-destination}.

 This option is independent of the @option{--interactive} or
@@ -7028,7 +7037,7 @@ Make hard links instead of copies of non-directories.
 @itemx --dereference
 @opindex -L
 @opindex --dereference
-Always follow symbolic links.
+Follow symbolic links when copying from them.

 @item -P
 @itemx --no-dereference
@@ -7036,7 +7045,8 @@ Always follow symbolic links.
 @opindex --no-dereference
 @cindex symbolic links, copying
 Copy symbolic links as symbolic links rather than copying the files that
-they point to.
+they point to.  This option affects only symbolic links in the source;
+symbolic links in the destination are always followed if possible.

 @item -p
 @itemx @address@hidden@var{attribute_list}]}
@@ -7126,8 +7136,8 @@ about each existing destination file.
 @cindex copying directories recursively
 @cindex recursively copying directories
 @cindex non-directories, copying as special files
-Copy directories recursively.  Symbolic links are not followed by
-default; see the @option{--archive} (@option{-a}), @option{-d},
+Copy directories recursively.  By default, do not follow symbolic
+links in the source; see the @option{--archive} (@option{-a}), @option{-d},
 @option{--dereference} (@option{-L}), @option{--no-dereference}
 (@option{-P}), and @option{-H} options.  Special files are copied by
 creating a destination file of the same type as the source; see the
diff --git a/src/copy.c b/src/copy.c
index b46221c..0e9f2d7 100644
--- a/src/copy.c
+++ b/src/copy.c
@@ -34,6 +34,7 @@
 #include "acl.h"
 #include "backupfile.h"
 #include "buffer-lcm.h"
+#include "canonicalize.h"
 #include "copy.h"
 #include "cp-hash.h"
 #include "euidaccess.h"
@@ -261,14 +262,19 @@ copy_reg (char const *src_name, char const *dst_name,
 {
   char *buf;
   char *buf_alloc = NULL;
+  char *name_alloc = NULL;
+  char const *followed_dest_name = dst_name;
   int dest_desc;
+  int dest_errno;
   int source_desc;
   mode_t src_mode = src_sb->st_mode;
   struct stat sb;
   struct stat src_open_sb;
   bool return_val = true;

-  source_desc = open (src_name, O_RDONLY | O_BINARY);
+  source_desc = open (src_name,
+                     (O_RDONLY | O_BINARY
+                      | (x->dereference == DEREF_NEVER ? O_NOFOLLOW : 0)));
   if (source_desc < 0)
     {
       error (0, errno, _("cannot open %s for reading"), quote (src_name));
@@ -298,6 +304,7 @@ copy_reg (char const *src_name, char const *dst_name,
   if (! *new_dst)
     {
       dest_desc = open (dst_name, O_WRONLY | O_TRUNC | O_BINARY);
+      dest_errno = errno;

       /* When using cp --preserve=context to copy to an existing destination,
         use the default context rather than that of the source.  Why?
@@ -353,21 +360,35 @@ copy_reg (char const *src_name, char const *dst_name,

   if (*new_dst)
     {
-      int open_flags = O_WRONLY | O_CREAT | O_BINARY;
-      dest_desc = open (dst_name, open_flags | O_EXCL,
+      int open_flags = O_WRONLY | O_CREAT | O_EXCL | O_BINARY;
+      dest_desc = open (dst_name, open_flags,
                        dst_mode & ~omitted_permissions);
+      dest_errno = errno;

       /* When trying to copy through a dangling destination symlink,
         the above open fails with EEXIST.  If that happens, and
         lstat'ing the DST_NAME shows that it is a symlink, repeat
-        the open call, but this time without O_EXCL.  */
-      if (dest_desc < 0 && errno == EEXIST)
+        the open call, but this time with the name of the final,
+        missing directory entry.  */
+      if (dest_desc < 0 && dest_errno == EEXIST)
        {
          struct stat dangling_link_sb;
          if (lstat (dst_name, &dangling_link_sb) == 0
              && S_ISLNK (dangling_link_sb.st_mode))
-           dest_desc = open (dst_name, open_flags,
-                             dst_mode & ~omitted_permissions);
+           {
+             /* FIXME: This is way overkill, since all that's needed
+                is to follow the symlink that is the last file name
+                component.  */
+             name_alloc =
+               canonicalize_filename_mode (dst_name, CAN_MISSING);
+             if (name_alloc)
+               {
+                 followed_dest_name = name_alloc;
+                 dest_desc = open (followed_dest_name, open_flags,
+                                   dst_mode & ~omitted_permissions);
+                 dest_errno = errno;
+               }
+           }
        }
     }
   else
@@ -375,7 +396,8 @@ copy_reg (char const *src_name, char const *dst_name,

   if (dest_desc < 0)
     {
-      error (0, errno, _("cannot create regular file %s"), quote (dst_name));
+      error (0, dest_errno, _("cannot create regular file %s"),
+            quote (dst_name));
       return_val = false;
       goto close_src_desc;
     }
@@ -567,7 +589,7 @@ copy_reg (char const *src_name, char const *dst_name,
       timespec[0] = get_stat_atime (src_sb);
       timespec[1] = get_stat_mtime (src_sb);

-      if (gl_futimens (dest_desc, dst_name, timespec) != 0)
+      if (gl_futimens (dest_desc, followed_dest_name, timespec) != 0)
        {
          error (0, errno, _("preserving times for %s"), quote (dst_name));
          if (x->require_preserve)
@@ -580,7 +602,7 @@ copy_reg (char const *src_name, char const *dst_name,

   if (x->preserve_ownership && ! SAME_OWNER_AND_GROUP (*src_sb, sb))
     {
-      switch (set_owner (x, dst_name, dest_desc,
+      switch (set_owner (x, followed_dest_name, dest_desc,
                         src_sb->st_uid, src_sb->st_gid))
        {
        case -1:
@@ -593,24 +615,24 @@ copy_reg (char const *src_name, char const *dst_name,
        }
     }

-  set_author (dst_name, dest_desc, src_sb);
+  set_author (followed_dest_name, dest_desc, src_sb);

   if (x->preserve_mode || x->move_mode)
     {
-      if (copy_acl (src_name, source_desc, dst_name, dest_desc, src_mode) != 0
+      if (copy_acl (src_name, source_desc, followed_dest_name, dest_desc, 
src_mode) != 0
          && x->require_preserve)
        return_val = false;
     }
   else if (x->set_mode)
     {
-      if (set_acl (dst_name, dest_desc, x->mode) != 0)
+      if (set_acl (followed_dest_name, dest_desc, x->mode) != 0)
        return_val = false;
     }
   else if (omitted_permissions)
     {
       omitted_permissions &= ~ cached_umask ();
       if (omitted_permissions
-         && fchmod_or_lchmod (dest_desc, dst_name, dst_mode) != 0)
+         && fchmod_or_lchmod (dest_desc, followed_dest_name, dst_mode) != 0)
        {
          error (0, errno, _("preserving permissions for %s"),
                 quote (dst_name));
@@ -633,6 +655,7 @@ close_src_desc:
     }

   free (buf_alloc);
+  free (name_alloc);
   return return_val;
 }

@@ -1137,7 +1160,21 @@ copy_internal (char const *src_name, char const 
*dst_name,

   if (!new_dst)
     {
-      if (XSTAT (x, dst_name, &dst_sb) != 0)
+      /* Regular files can be created by writing through symbolic
+        links, but other files cannot.  So use stat on the
+        destination when copying a regular file, and lstat otherwise.
+        However, if we intend to unlink or remove the destination
+        first, use lstat, since a copy won't actually be made to the
+        destination in that case.  */
+      if ((((S_ISREG (src_mode)
+            || (x->copy_as_regular
+                && ! (S_ISDIR (src_mode) || S_ISLNK (src_mode))))
+           && ! (x->move_mode || x->symbolic_link || x->hard_link
+                 || x->backup_type != no_backups
+                 || x->unlink_dest_before_opening))
+          ? stat (dst_name, &dst_sb)
+          : lstat (dst_name, &dst_sb))
+         != 0)
        {
          if (errno != ENOENT)
            {
@@ -1151,7 +1188,7 @@ copy_internal (char const *src_name, char const *dst_name,
        }
       else
        { /* Here, we know that dst_name exists, at least to the point
-            that it is XSTAT'able.  */
+            that it is stat'able or lstat'table.  */
          bool return_now;
          bool unlink_src;

diff --git a/src/copy.h b/src/copy.h
index 0d6233f..23cde2b 100644
--- a/src/copy.h
+++ b/src/copy.h
@@ -87,7 +87,7 @@ struct cp_options
      them, symbolic links,) as if they were regular files.  */
   bool copy_as_regular;

-  /* How to handle symlinks.  */
+  /* How to handle symlinks in the source.  */
   enum Dereference_symlink dereference;

   /* If true, remove each existing destination nondirectory before
diff --git a/src/cp.c b/src/cp.c
index 35c487b..3378761 100644
--- a/src/cp.c
+++ b/src/cp.c
@@ -182,14 +182,14 @@ Mandatory arguments to long options are mandatory for 
short options too.\n\
   -f, --force                  if an existing destination file cannot be\n\
                                  opened, remove it and try again\n\
   -i, --interactive            prompt before overwrite\n\
-  -H                           follow command-line symbolic links\n\
+  -H                           follow command-line symbolic links in SOURCE\n\
 "), stdout);
       fputs (_("\
   -l, --link                   link files instead of copying\n\
-  -L, --dereference            always follow symbolic links\n\
+  -L, --dereference            always follow symbolic links in SOURCE\n\
 "), stdout);
       fputs (_("\
-  -P, --no-dereference         never follow symbolic links\n\
+  -P, --no-dereference         never follow symbolic links in SOURCE\n\
 "), stdout);
       fputs (_("\
   -p                           same as --preserve=mode,ownership,timestamps\n\
@@ -394,7 +394,7 @@ make_dir_parents_private (char const *const_dir, size_t 
src_offset,

   *attr_list = NULL;

-  if (XSTAT (x, dst_dir, &stats))
+  if (stat (dst_dir, &stats) != 0)
     {
       /* A parent of CONST_DIR does not exist.
         Make all missing intermediate directories. */
@@ -414,7 +414,7 @@ make_dir_parents_private (char const *const_dir, size_t 
src_offset,
          *attr_list = new;

          *slash = '\0';
-         if (XSTAT (x, dir, &stats))
+         if (stat (dir, &stats) != 0)
            {
              mode_t src_mode;
              mode_t omitted_permissions;
@@ -427,7 +427,7 @@ make_dir_parents_private (char const *const_dir, size_t 
src_offset,
                 make_dir_parents_private creates only e_dir/../a if
                 ./b already exists. */
              *new_dst = true;
-             src_errno = (XSTAT (x, src, &stats) != 0
+             src_errno = (stat (src, &stats) != 0
                           ? errno
                           : S_ISDIR (stats.st_mode)
                           ? 0
diff --git a/tests/cp/same-file b/tests/cp/same-file
index 44d5dd7..8e0593e 100755
--- a/tests/cp/same-file
+++ b/tests/cp/same-file
@@ -89,9 +89,15 @@ for args in 'foo symlink' 'symlink foo' 'foo foo' 'sl1 sl2' 
'foo hardlink'; do
        cp $options $args 2>_err
        echo $? $options

-       # Normalize the program name in the error output,
+       # Normalize the program name and diagnostics in the error output,
        # and put brackets around the output.
-       test -s _err && echo "[`sed 's/^[^:][^:]*:/cp:/' _err`]"
+       if test -s _err; then
+         sed '
+           s/^[^:]*:\([^:]*\).*/cp:\1/
+           1s/^/[/
+           $s/$/]/
+         ' _err
+        fi
        # Strip off all but the file names.
        ls="`ls -gG --ignore=_err . \
            | sed \
@@ -128,13 +134,13 @@ cat <<\EOF > $expected
 0 -bd (foo symlink symlink.~1~ -> foo)
 0 -bf (foo symlink symlink.~1~ -> foo)
 0 -bdf (foo symlink symlink.~1~ -> foo)
-0 -l (foo symlink -> foo)
+1 -l [cp: cannot create link `symlink'] (foo symlink -> foo)
 0 -dl (foo symlink -> foo)
-0 -fl (foo symlink -> foo)
+0 -fl (foo symlink)
 0 -dfl (foo symlink)
-0 -bl (foo symlink -> foo)
+0 -bl (foo symlink symlink.~1~ -> foo)
 0 -bdl (foo symlink symlink.~1~ -> foo)
-0 -bfl (foo symlink -> foo)
+0 -bfl (foo symlink symlink.~1~ -> foo)
 0 -bdfl (foo symlink symlink.~1~ -> foo)

 1 [cp: `symlink' and `foo' are the same file] (foo symlink -> foo)
@@ -179,10 +185,10 @@ cat <<\EOF > $expected
 0 -bd (foo sl1 -> foo sl2 -> foo sl2.~1~ -> foo)
 0 -bf (foo sl1 -> foo sl2 sl2.~1~ -> foo)
 0 -bdf (foo sl1 -> foo sl2 -> foo sl2.~1~ -> foo)
-0 -l (foo sl1 -> foo sl2 -> foo)
+1 -l [cp: cannot create link `sl2'] (foo sl1 -> foo sl2 -> foo)
 0 -fl (foo sl1 -> foo sl2 -> foo)
-0 -bl (foo sl1 -> foo sl2 -> foo)
-0 -bfl (foo sl1 -> foo sl2 -> foo)
+0 -bl (foo sl1 -> foo sl2 -> foo sl2.~1~ -> foo)
+0 -bfl (foo sl1 -> foo sl2 -> foo sl2.~1~ -> foo)

 1 [cp: `foo' and `hardlink' are the same file] (foo hardlink)
 1 -d [cp: `foo' and `hardlink' are the same file] (foo hardlink)
M ChangeLog
M NEWS
M doc/ChangeLog
M doc/coreutils.texi
M src/copy.c
M src/copy.h
M src/cp.c
M tests/cp/same-file
Committed as 4708434be83d5857c53782c2779928b8914a681d




reply via email to

[Prev in Thread] Current Thread [Next in Thread]