gnu-arch-users
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Gnu-arch-users] Re: in-tree pristines fatally wounded (merge-fest e


From: David Allouche
Subject: Re: [Gnu-arch-users] Re: in-tree pristines fatally wounded (merge-fest etc)
Date: Fri, 5 Dec 2003 13:24:25 +0100
User-agent: Mutt/1.5.4i

On Thu, Dec 04, 2003 at 11:54:37AM +0100, David Allouche wrote:
> On Thu, Dec 04, 2003 at 07:07:15AM +1100, Robert Collins wrote:
> > there is a trivial heuristic:
> > if any file in a revision's sources in a library has a link count of 1,
> > then that revision hasn't been checked out. (modulo the slow rot that
> > occurs from update and replay breaking links).
> 
> Unless of course the next revision is only renames, has zero change
> (e.g. a version-0 revision) or is a continuation (i.e. a base-0). But
> this might be considered a negligible space overhead.

Rah... said something wrong again...

Except in pathological cases, for a revision, at least on file will be
either shared with the previous revision (in the library) or the next
revision (in the library).

There are a few discrete case where you can have files with a link count
of only 1 (all cases assume there is no hardlinked working tree).

  A. File changed in a revision, when that revision is the last of its
  development line (in the library).

  B. File changed in the next revision (in the library), when the
  current revision is the first of its development line (in the
  library).

  C. File changed in a revision and the next revision (in the library).

By applying this heuristic to detect unused library revisions, deleting
those revisions, and starting over again, you _might_ (that would need
some work to tell if it is true of false) get the same result as my
more complicated heuristic.

Well... maybe that can be useful. If implemented in a clever enough
manner, you may get a link counting cost which is linear to the number
of deleted library.

The algorithm could be something like:

  0. Get the development lines forest.

  1. On every root, count links (case B), delete if appropriate.

  2. On every leaf, count links (case A), delete if appropriate.

  3. Repeat 1 and 2 until the library is stable. We can optimize by
  never counting the links in the same revision twice. After counting the
  revision is either deleted immediately or cannot be deleted.

  4. Count links in all revisions which have not been counted (case C)
  and delete if appropriate.

  5. Count links the parent revision of every revision which have just
  been deleted. Delete if appropriate.

  6. Repeat 5 until the library is stable.

Link counting can be improved heuristically by not counting links in
Arch metadata (ids, patchlogs, etc).

What is remaining, is proving this algorithm stands w.r.t. to the other
(more complicated) heuristic.

-- 
                                                            -- ddaa




reply via email to

[Prev in Thread] Current Thread [Next in Thread]