gnu-arch-users

Re: [Gnu-arch-users] the state of the union


From: Greg Hudson
Subject: Re: [Gnu-arch-users] the state of the union
Date: Thu, 19 Aug 2004 03:42:54 -0400

On Wed, 2004-08-18 at 18:23, Tom Lord wrote:
>     >> Does it require a fancy delta-compressed version-transactioned
>     >> filesystem (Subversion)?
[...]
>     >> The answer is that the Subversion approach can deliver lower
>     >> command-line latency for some operations but that Arch delivers
>     >> lower administration costs, better scalability, and higher
>     >> throughput.
[I wrote about FSFS.]
> I was comparing svn's BDB-FS to arch's combination of [...]
> I'm comparing them in a broad engineering sense, considering [...]

Perhaps that's what you were comparing in your head, but what you wrote
down was that brute-force techniques can intrinsically yield better
scalability, higher throughput, and lower administration costs than a
fancy delta-compressed version-transacted filesystem.  I think FSFS
proves that claim wrong.

If your real point is that Subversion took a wrong turn by using BDB, I
won't argue with you.  (In fact, in "undiagnosing", I very explicitly
agreed with you on that point.)  If your real point is that Subversion's
FS design is harder to implement, it's possible that you're right.  But
you didn't appear to be talking about implementation details.

> and avoiding local filesystem things like `flock'

"flock" isn't at all restricted to local filesystems, although there
certainly exist remote filesystems which can't swing it.  But this is
definitely within the realm of implementation details.

> One thing I noticed while skimming the FSFS design document
> ("structure") is that some of the files in your back-end are
> indefinitely mutable (the one that caught my eye was something about
> "revision properties", I believe).

> Mutable files like that complicate replication, backups, and integrity
> checking, at least.

The rev-prop files are the only indefinitely mutable component.  There's
no requirement that rev-props be mutable for Subversion to work (in
fact, by default Subversion prohibits changing rev-props after a
commit); the only cost of disallowing it is that you can't do things
like fix mistakes in a commit log after the fact.  Rev-prop files are
tiny, so they don't really complicate replication and backups.  Since
arch doesn't have an
equivalent concept, the rev-prop files can be ignored for the purposes
of this discussion.

> One virtue of arch's approach is that the core archive is, in essence,
> a (partially ordered) transaction journal and nothing more.   Each
> commit-like operation bundles up the parameters of its transaction,
> stores that bundle in the archive --- and that's it, the commit is
> done.

In FSFS as well, a commit is finalized by bundling up the transaction
directory into a file and storing that in the revs directory, after
which the file never changes.
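
To make that concrete, here is a rough sketch of the write-once pattern
in Python.  It is not the FSFS code and does not use the real FSFS file
format: the function name, the "name length" header lines, and the
temp-file prefix are all made up for illustration.  The only point is
that the bundle is assembled off to the side and then published under
its final name in one atomic step, after which nothing writes to it.

```python
import os
import tempfile


def finalize_commit(txn_dir, revs_dir, rev_number):
    """Bundle the files in txn_dir into the single file revs_dir/<rev_number>.

    Hypothetical helper: the bundle layout (a "name length" header line
    followed by the raw bytes) is invented for this sketch.
    """
    final_path = os.path.join(revs_dir, str(rev_number))

    # Build the bundle under a temporary name in the same directory, so
    # that publishing it stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=revs_dir, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as out:
            for name in sorted(os.listdir(txn_dir)):
                with open(os.path.join(txn_dir, name), "rb") as src:
                    data = src.read()
                out.write(f"{name} {len(data)}\n".encode())
                out.write(data)
            out.flush()
            os.fsync(out.fileno())

        # Publish atomically.  os.link raises FileExistsError if the
        # revision file is already there, so an existing revision is
        # never overwritten.
        os.link(tmp_path, final_path)
    finally:
        os.unlink(tmp_path)

    # Mark the published revision read-only; it is never modified again.
    os.chmod(final_path, 0o444)
    return final_path
```

Because a revision file never changes once it has appeared, a backup or
mirror only has to copy the files it doesn't already have.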

> Client-side caches and memos are a flexible solution that scales
> arbitrarily with the number of clients.

Perhaps.

Here's something I do often: I find a bug in one of the dozens of
upstream programs I maintain builds of, and I narrow it down to a
particular source file.  My first question is "is there an upstream
fix?"  So I ask the upstream CVS repository for a log (or in some cases
a blame annotation) of the upstream file.  I'm not going to have a
well-populated client-side cache for the given program.  The upstream
repository would probably rather not serve me the project's entire
version history, and I certainly would rather not have my client pore
through that entire history.

Perhaps that sort of thing isn't common enough to be a concern, but I
think that's still up for debate.  Especially when, as far as I know, no
modern open-source version control system has been adopted by a big and
highly visible project.  (Where "big" has to include depth and
complexity of history as well as raw size.  gcc qualifies; a Linux
distribution generally does not.)

[In a sub-thread about the advantages and disadvantages of dedicated
servers versus dumb file transports:]
> Worst of all, though, how are your users supposed to _recognize_ when
> a Subversion archive has been corrupted?

Uh, does that have *anything* to do with the debate at hand?  The answer
to that would seem to depend on whether developers are signing their
commits, not whether the archive is accessed through a dedicated server
or a dumb file transport.




