[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Monotone-devel] [patch] heights
From: |
Thomas Moschny |
Subject: |
[Monotone-devel] [patch] heights |
Date: |
Fri, 29 Sep 2006 10:36:54 +0200 |
User-agent: |
KMail/1.9.4 |
Hi all,
The nvm.heights branch should be landed on mainline anytime now, so I am going
to show it here. It's a mid-sized patch: 18 files changed, 844 insertions,
179 deletions.
The said branch lets monotone add one more cached piece of data per revision
to the database, called height. I sent an email to the list describing the
basic idea of heights some time ago, but meanwhile the idea has a bit
evolved. There's a page in monotone's wiki that describes the concept:
http://venge.net/monotone/wiki/RevisionNumbering . In short, heights are
caching the result of doing a toposort of all revisions, thus making a
subsequent topological sort much cheaper. Note that heights are not ment to
be replacing revision ids in any way. They are purely local cached data used
to carry out some optimizations.
So what does the patch do besides adding a rev_height class? Unfortunately, we
have to modify the db schema again, and a good amount of changes has to do
with that. Then, it renames regenerate_rosters to regenerate_caches, because
that cmd not only regenerates the cached rosters, but now also the cached
heights, and in the future might update other cached data, too. Existing
tests have been adjusted, and all tests are passed.
In order to show the benefits heights may have, the log cmd is changed in two
steps. First, it now always shows revisions in toposorted order (which was
not the case before). Because we use complex heights, the logged revisions
are clustered, ie. a linear chain of revisions is kept together as long as
possible. Second, having a fast way of doing a repeated toposort for a set of
revisions, we can exploit the markings present in the rosters to heavily
speed-up backwards restricted log. The idea is to only look at interesting
revisions, ie. marked ones and their parents. This way we can omit a lot of
roster parsing.
Additionally, the toposort command now uses heights, of course, resulting in a
speed-up for small to mid-size sets of revisions (and no decrease in speed
for sorting big sets). A method get_cached_heights() to get all heights at
once (similar to get_cached_ancestry()) might be useful for big sets, to
minimize communication with the database.
What needs to be done? Well, there should be a test in db check for everything
we put in the database, this is missing for heights. Probably, some testsuite
tests should be added, too.
Comments are welcome!
- Thomas
heights.diff.bz2
Description: BZip2 compressed data
- [Monotone-devel] [patch] heights,
Thomas Moschny <=