gnu-arch-users

Re: [Gnu-arch-users] Re: tla1.2 on cygwin


From: Aaron Bentley
Subject: Re: [Gnu-arch-users] Re: tla1.2 on cygwin
Date: Sun, 14 Mar 2004 13:34:46 -0500
User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.6b) Gecko/20031205 Thunderbird/0.4

Stefan Monnier wrote:

I'd rather assume that if ctime and size have not changed, then the file
hasn't changed, even though it's possibly incorrect.

You meant mtime, right? :-)

Actually no: I meant ctime.
mtime can be tweaked with touch, so you can't rely on it if you want to
be safe.

That's the entire point. ctime doesn't change when the file is modified. mtime does, unless you're using special tools. Modification == corruption, so that's what we care about.

But what if the mtime and size haven't changed, but the file is an utterly
different file?

That can happen while keeping the inode constant as well.

Only by using tools that deliberately set mtime. If the mtime changes or size changes, you've updated the contents. If the inode changes, you've updated the name.
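
To make the rule above concrete, here is a minimal sketch in Python. It is not tla's code; the names `Sig`, `take_sig` and `classify` are made up for illustration, and a signature is assumed to be just (inode, mtime, size).

```python
import os
from typing import NamedTuple


class Sig(NamedTuple):
    """Hypothetical per-file signature: inode number, mtime, size."""
    inode: int
    mtime_ns: int
    size: int


def take_sig(path):
    """Build a signature from a single stat() call."""
    st = os.stat(path)
    return Sig(st.st_ino, st.st_mtime_ns, st.st_size)


def classify(old, new):
    """Apply the rule of thumb: mtime/size first, then inode."""
    if old.mtime_ns != new.mtime_ns or old.size != new.size:
        return "contents changed"
    if old.inode != new.inode:
        return "renamed or replaced"
    # Still only a presumption: a tool that resets mtime defeats it.
    return "presumed unchanged"
```

The last branch is the one under debate in this thread: it is a shortcut, not a proof.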

In fact, renames on arch-managed files are very hard to detect without
looking at inodes, since all of them are created within a few seconds
of each other.

Looking at the ,,inode-sigs files, you can easily find all the files with
the same mtime and size, so maybe you then do something clever, like "assume
there's no swap" (which means that as long as there's no file added or
missing from the list of files with same mtime and same size, you know
there has been no change).

What I mean by arch-managed files is "revision libraries and pristines". Sorry for the ambiguity.

All of the "changes" and "file diffs" will produce faulty output if the
basis for comparison is corrupt.

Sure.  Corruption can and does happen without changing any inode number,
mtime, or size.  We had better start by deciding from which kind of corruption
we want to protect ourselves, otherwise "corruption" is a moot argument
since you'd then have to constantly check everything (including tla, the
libc you're linking against, locking everything to avoid race conditions
(the revlib could get corrupted after you checked the inode but before you
read the file)).

Corruption of the revlib is not uncommon if you use hardlinked trees. It must be detected. This kind of corruption can be detected by checking inode sigs, since it usually changes mtime and size. Failure to check them will lead to corrupt archives.
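
A small Python sketch of the hardlink case, as I understand it. The file names and the `signature` and `demo` helpers are hypothetical, and the signature is reduced to (mtime, size). An in-place edit through the tree's hardlink also modifies the revlib copy, and the recorded signature no longer matches.

```python
import os
import tempfile


def signature(path):
    """Hypothetical reduced signature: (mtime, size) from one stat()."""
    st = os.stat(path)
    return (st.st_mtime_ns, st.st_size)


def demo():
    """Return True if the in-place edit is detected via the signature."""
    with tempfile.TemporaryDirectory() as d:
        revlib_file = os.path.join(d, "revlib_copy")
        tree_file = os.path.join(d, "tree_copy")
        with open(revlib_file, "w") as f:
            f.write("original\n")
        recorded = signature(revlib_file)

        # The project tree shares the inode with the revlib copy.
        os.link(revlib_file, tree_file)

        # An editor that writes in place (instead of replacing the file)
        # changes the revlib copy too.
        with open(tree_file, "a") as f:
            f.write("edited in place\n")

        return signature(revlib_file) != recorded
```

Here the size changes, so the check catches it. A tool that rewrites the file at the same length and restores mtime would slip through.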

It's even less essential to detect corruption of file BAR when I do
`tla file-diffs FOO'.
I haven't looked at the code, but I imagine we only look at the reference
version of FOO.

I doubt it.  On a small project, `tla file-diffs' is essentially
instantaneous, whereas on an Emacs tree, it takes several seconds.

Okay, I'm wrong there.  Perhaps file-diffs can be more fine-grained.

Performing an MD5sum is much slower than a stat().  It would make "tla
changes" orders of magnitude slower.

Why would it be?  If you're not going to read the file, why would you check
its `stat' for corruption?  You only need to compute the MD5 on files that
you actually read, so it shouldn't cost much.

To compute a changeset using "changes", you need to compare every file against something. Typically, we check against the inode signatures as a shortcut, but if that doesn't work, we fall back to checking against the revision library or pristine. In that case (i.e. YOUR case), we need to check every file in the tree.
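
Roughly, the shortcut and the fallback look like the Python sketch below. This is an illustration under assumptions, not what tla does internally: `changed_files` and its parameters are invented, the recorded signatures are taken to be a dict of relative path to (mtime, size), and deleted or added files are not handled.

```python
import filecmp
import os


def changed_files(tree, reference, recorded_sigs):
    """List files in `tree` that differ from `reference`.

    recorded_sigs maps relative path -> (mtime_ns, size), or is None
    when no trusted signatures exist.
    """
    changed = []
    for root, _dirs, files in os.walk(tree):
        for name in files:
            path = os.path.join(root, name)
            rel = os.path.relpath(path, tree)
            st = os.stat(path)

            # Shortcut: one stat() and no reads if the signature matches.
            if recorded_sigs is not None and \
                    recorded_sigs.get(rel) == (st.st_mtime_ns, st.st_size):
                continue

            # Fallback: read both files and compare the contents.
            ref = os.path.join(reference, rel)
            if not os.path.exists(ref) or \
                    not filecmp.cmp(path, ref, shallow=False):
                changed.append(rel)
    return sorted(changed)
```

With `recorded_sigs=None`, every file in the tree is read in full, which is where the cost comes from on a large tree.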

Aaron




