monotone-devel

Re: [Monotone-devel] Hash collisions resiliency


From: Jon Bright
Subject: Re: [Monotone-devel] Hash collisions resiliency
Date: Wed, 13 Apr 2005 21:20:57 +0200
User-agent: Mozilla Thunderbird 1.0 (Windows/20041206)

J C Lawrence wrote:

>> If you do, then you hopefully also realise that there's not very much
>> point in considering the answers to these questions...
>
> No, I don't understand that there's not much point.  Just because
> something is improbable does not mean it will not happen.

But if something's so incredibly improbable that a hardware failure whilst also winning the lottery is much, much more likely, it's not worth thinking about. Or are you proposing that Monotone also try to detect hardware failures in a way other than throwing invariant failures showing their effects?

Let's say the chance of a hardware failure which invisibly and silently corrupts your monotone DB is 1 in 2^40 (it's much more likely, but let's go with that). A nearby lottery webpage tells me that the chance of getting the top prize is 1 in 139,838,160, or a little more than 1 in 2^27, so let's go with 1 in 2^28. The chance of both of these events happening together is therefore 1 in 2^68. That makes getting an invisibly-corrupting hardware failure *and* winning the lottery, even after making both seem less likely than they actually are, still 2^12 or 4096 times more likely than getting an SHA hash collision, taken here as a 1 in 2^80 event.
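For anyone who wants to check the arithmetic, here is a rough sketch in Python. The 1 in 2^40 hardware figure and the lottery odds are the ones used above; the 1 in 2^80 collision figure is an assumption, chosen because it is the value that yields the factor of 4096. All variable names are made up for illustration.

```python
from math import log2

# Odds are written as "1 in 2^n", so only the exponents are tracked.
HARDWARE_EXP = 40             # assumed: silent DB corruption is 1 in 2^40
LOTTERY_ODDS = 139_838_160    # top prize odds quoted from the lottery page
LOTTERY_EXP = 28              # rounded up from log2(LOTTERY_ODDS), about 27.06
COLLISION_EXP = 80            # assumption: SHA collision taken as 1 in 2^80

# Independent events multiply, so their exponents add.
both_exp = HARDWARE_EXP + LOTTERY_EXP

# How many times more likely "failure and lottery win" is than a collision.
ratio = 2 ** (COLLISION_EXP - both_exp)

print(f"log2 of lottery odds: {log2(LOTTERY_ODDS):.2f}")
print(f"both events together: 1 in 2^{both_exp}")
print(f"ratio versus a collision: {ratio}")
```

Rounding the lottery odds up to 2^28 only makes the combined event look rarer, so the comparison errs in favour of the collision.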

> Very large chunks of the historical source bases for HP-UX and IRIX are
> no longer recoverable due to silent NFS corruption of RCS ,v files.  It
> wasn't detected due to RCS' use of reverse diffs.  (Cue SCCS forward
> diff evangelism) I'd like an SCM system which tells me in unequivocal
> terms when something critical goes really bad, even if it is vanishingly
> improbable.

You could have monotone try to check for SHA collisions when adding files or revisions. This would slow things down quite a bit, since it'd be necessary to check file contents against one another rather than just checking whether the SHA hashes match. You could instead check file lengths against one another; this wouldn't slow things down that much.
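To make the two options concrete, here is a minimal sketch in Python. It is not monotone's code: `ContentStore`, `CollisionError`, and the `check` and `hash_fn` parameters are all hypothetical names invented for this example. The store keys blobs by hash and, when a hash is already present, compares either the lengths or the full contents.

```python
import hashlib


class CollisionError(Exception):
    """Raised when two different blobs map to the same hash."""


class ContentStore:
    """Toy content-addressed store (hypothetical, not monotone's design)."""

    def __init__(self, hash_fn=None):
        # hash_fn is injectable so a deliberately weak hash can be used
        # to exercise the collision path; the default is SHA-1.
        self._hash = hash_fn or (lambda data: hashlib.sha1(data).hexdigest())
        self._blobs = {}

    def add(self, data: bytes, check: str = "length") -> str:
        key = self._hash(data)
        existing = self._blobs.get(key)
        if existing is None:
            self._blobs[key] = data
            return key
        if check == "length":
            # Cheap check: only compares sizes.
            if len(existing) != len(data):
                raise CollisionError(f"length mismatch for hash {key}")
        elif check == "full":
            # Expensive check: compares every byte.
            if existing != data:
                raise CollisionError(f"content mismatch for hash {key}")
        else:
            raise ValueError(f"unknown check mode: {check}")
        return key


# Usage with a weak hash that maps everything to one key.
store = ContentStore(hash_fn=lambda data: "same")
store.add(b"abc")
try:
    store.add(b"abcd", check="length")
except CollisionError as err:
    print("detected:", err)
```

Note the trade-off: the length check is cheap but misses a collision between two files of the same length, whereas the full comparison catches any difference at the cost of reading the stored contents.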

> To be specific: I or someone on the team would have to notice these
> facts through simple manual observation of some sort of unexpected
> behaviour, or would there be the equivalent of Bubba the Neanderthal
> whacking me upside the head with a clue-by-four and yelling, "HEY BOZO,
> YOU HAVE A HASH COLLISION!"?

I can't say which, but you'd probably hit some invariant failures of some sort at some point pretty soon afterward. These wouldn't say "hash collision", but they would say you have something pretty broken in your DB.

--
Jon Bright
Silicon Circus Ltd.
http://www.siliconcircus.com



