bug-coreutils
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

bug#18499: Possible mv race for hardlinks (rhbz #1141368 )


From: Boris Ranto
Subject: bug#18499: Possible mv race for hardlinks (rhbz #1141368 )
Date: Thu, 13 Nov 2014 17:06:13 +0100

On Thu, 2014-11-13 at 15:41 +0000, Pádraig Brady wrote:
> On 13/11/14 13:58, Boris Ranto wrote:
> > On Wed, 2014-09-24 at 17:18 +0100, Pádraig Brady wrote:
> >> On 09/18/2014 11:52 AM, Ondrej Vasik wrote:
> >>> Hi,
> >>> as reported in https://bugzilla.redhat.com/show_bug.cgi?id=1141368 ,
> >>> there is a possible race condition in mv in the case of hardlinks.
> >>>
> >>> Bug is reproducible even on ext4 filesystem (I guess it is filesystem
> >>> independent), even with the latest coreutils. There was already attempt
> >>> to fix the non-atomicity of mv for hardlinks
> >>> ( http://thread.gmane.org/gmane.comp.gnu.coreutils.bugs/12985 ), from
> >>> the current reproducer it looks like the fix is probably incomplete.
> >>
> >> The reason mv does the problematic unlink() is because it needs to simulate
> >> the rename, as rename() has this IMHO incorrect, though defined and 
> >> documented behavior:
> >>
> >>   "If oldpath and newpath are existing hard links referring to the same
> >>    file, then rename() does nothing, and returns a success status."
> >>
> >> For arbitration of the rename between separate processes,
> >> the kernel needs to be involved.  I noticed the very recent
> >> renameat2() system call added to Linux which supports flags.
> >> Miklos, do you think this is something that could be
> >> handled by renameat2() either implicitly or explicitly with a flag?
> >>
> >> thanks,
> >> Pádraig.
> >>
> >>
> >>
> > 
> > Hi all,
> > 
> > I've looked into this and I believe that the whole idea that 'mv a b'
> > where a and b are the same file (but different hard links) would unlink
> > one of the files is flawed -- I think we should return an error message
> > (something like we do for mv a a) and exit in this case. I'll add a
> > little more of my reasoning, here:
> > 
> > As Pádraig stated in his comment (and as is stated in rename man page):
> > 
> >   "If oldpath and newpath are existing hard links referring to the same
> >    file, then rename() does nothing, and returns a success status."
> > 
> > This is probably due to this documented behaviour from rename man page:
> > 
> >   "rename() renames a file, moving it between directories if
> >    required. Any other  hard links to the file (as created using
> >    link(2)) are unaffected."
> > 
> > I.e. rename does not touch other hard links directly from its definition
> > (hence, it can't do anything if it gets the same hard linked file
> > twice). This actually means that (without a locking mechanism) there is
> > no way for us to do mv a b for hard linked files atomically -- the two
> > separate manual unlinks would race no matter what we do.
> > 
> > The renameat2() function does not help here either (at least from what I
> > could find about it). It only allows us to do atomic switch of the files
> > which does not really help in this case.
> 
> A flag could be added for this case though right?
> I.E. to handle the case of racing renames,
> one file would always be left in place.
> In other words to support this functionality we would need
> the kernel support from renameat().
> 

Yes, technically it could. Maybe, it would be worth a try to propose it
for kernel.

> > Hence, I think that the best thing for us to do here is to print the
> > "mv: 'a' and 'b' are the same hard linked file' and exit with an error
> > code if we detect this behaviour. At least if we want to preserve the
> > atomicity of the operation.
> 
> We've three options that I see:
> 
> 1. Simulate but be susceptible to data loss races (what we do now)
> 2. Ignore rename() issue and leave both files present (what FreeBSD does)
> 3. Fail with a "files are identical" error (what Solaris does)
> 

Personally, I like the Solaris way the best (if we can't get the system
call that would assure the atomicity in this case) -- i.e. let the user
know that we can't do that. It is also in tune with what the link man
page says for hard links:

   "This  new  name  may be used exactly as the old one for any
    operation;"

As we error out on 'mv a a', we should probably also threat 'mv a b'
where a and b are the same file that way.

As opposed to FreeBSD solution, the user will know that he needs to run
'rm a' to accomplish what he actually wanted to do in the first place.

> Now 1 is also an "issue" when the source and dest files are on separate file 
> systems.
> I.E. mv will need to unlink() in that case.  So from a high level, mv is 
> foregoing the
> atomicity of the rename in this edge case to provide consistent behavior 
> across
> all cases.  The question boils down to whether atomicity in this edge case is 
> required.
> Could you expand on a use case that requires this atomicity?
> 

There actually is a real world use case. The original red hat bugzilla
(1141368) was filed because people were hitting this issue in a
distributed environment (via GlusterFS) -- i.e. two people were renaming
files at the same time from different locations.

-Boris

> thanks,
> Pádraig.







reply via email to

[Prev in Thread] Current Thread [Next in Thread]