Re: [Monotone-devel] CVS sync works (for me)


From: Nathaniel Smith
Subject: Re: [Monotone-devel] CVS sync works (for me)
Date: Mon, 21 Feb 2005 10:36:40 -0800
User-agent: Mutt/1.5.6+20040907i

On Mon, Feb 21, 2005 at 03:08:53PM +0100, Christof Petig wrote:
> you see that for larger projects only a few of the files change per
> revision (edge). But such a cert can easily grow to 1817 lines and 31.5k
> bytes. If you multiply that with the amount of edges (3300 for this
> project) you get about 90MB! So I decided to store only the changing
> files like:
> 
> cvs.midgard.berlios.de:/cvsroot/midgard/midgard   (repository)
> +ebf337072571135affe49b5da42b7342ddba0852         (last revision)
> - dir/deleted
> 1.5 dir/changed
> 1.1 dir/added
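
For reference, a minimal sketch of reading a delta cert in the format quoted above. It assumes the parenthesised "(repository)" and "(last revision)" labels are annotations for this mail rather than part of the cert, and the names `CvsDeltaCert` and `parse_delta_cert` are made up for illustration; this is not the actual sync code.

```python
from dataclasses import dataclass, field


@dataclass
class CvsDeltaCert:
    repository: str = ""
    last_revision: str = ""  # id of the revision this delta is relative to
    deleted: list = field(default_factory=list)    # paths removed on this edge
    revisions: dict = field(default_factory=dict)  # path -> CVS revision


def parse_delta_cert(text):
    """Parse the line-oriented delta format: repository, +parent, - deleted, 'rev path'."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    cert = CvsDeltaCert(repository=lines[0])
    for line in lines[1:]:
        if line.startswith("+"):
            cert.last_revision = line[1:].strip()
        elif line.startswith("-"):
            cert.deleted.append(line[1:].strip())
        else:
            # Changed and added files look the same: "<cvs revision> <path>".
            rev, path = line.split(None, 1)
            cert.revisions[path] = rev
    return cert


example = """cvs.midgard.berlios.de:/cvsroot/midgard/midgard
+ebf337072571135affe49b5da42b7342ddba0852
- dir/deleted
1.5 dir/changed
1.1 dir/added
"""
cert = parse_delta_cert(example)
```

Rebuilding the full file list for a revision would then mean walking the `last_revision` chain back and applying each delta in turn.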

Okay, I think this is on the same track as what I meant.  What I was
trying to point out was that if you know just one change that happened
in a single commit -- like, dir/changed went from 1.4 to 1.5 -- then
that should be enough to identify that entire commit.  (After all,
dir/changed only goes 1.4 -> 1.5 once, ever.)  Basically, you look up
when that happened, note the time/changelog/etc., and then use that to
assemble the other changes since then.
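
A rough sketch of that lookup, under some assumptions: the per-file CVS history is already available as flat records, and file revisions belong to the same commit when they share author and changelog and fall within a small time window. The `FileRev` record, the `find_commit` helper, and the 300-second window are all hypothetical, chosen only to illustrate the idea.

```python
from collections import namedtuple

# One record per file revision, as might be extracted from the CVS logs.
FileRev = namedtuple("FileRev", "path rev time author changelog")


def find_commit(history, path, rev, window=300):
    """Return all file revisions that appear to belong to the commit
    in which `path` reached `rev`."""
    # A given file reaches a given revision exactly once, so this is unique.
    anchor = next(r for r in history if r.path == path and r.rev == rev)
    members = [
        r for r in history
        if r.author == anchor.author
        and r.changelog == anchor.changelog
        and abs(r.time - anchor.time) <= window
    ]
    return sorted(members, key=lambda r: r.path)


history = [
    FileRev("dir/changed", "1.4", 1000, "alice", "fix typo"),
    FileRev("dir/changed", "1.5", 5000, "bob", "add feature"),
    FileRev("dir/added", "1.1", 5002, "bob", "add feature"),
    FileRev("dir/other", "1.9", 5001, "carol", "unrelated"),
    FileRev("dir/old", "1.2", 90000, "bob", "add feature"),  # same log, much later
]

commit = find_commit(history, "dir/changed", "1.5")
paths = [(r.path, r.rev) for r in commit]
```

With this in place, the cert would only need to record one (path, revision) pair per commit, and the rest could be recomputed from the CVS side.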

> This should get the size down to a reasonable amount and is readable
> enough to be able to verify by sight. [That's actually the reason I
> refrained from reverse diffing (store the last cert in full length and
> recode older ones as time-backwards-diffs)]. I have to read and process
> all the certs anyway.

Hmm, that doesn't sound very scalable.  Why do you have to do that?

> The reason I need older certs as well is to enable the correct rooting
> of branches (once supported).

Right.

> PS: Ever thought about putting an index on revision_certs.id? Perhaps
> this speeds up correctness verification (just guessing) and since data
> retrieval is more likely than data modification I cannot see drawbacks.
> Similar might apply to other large (number of rows) tables as well.
> [e.g. manifest_deltas.id, file_deltas.id]

Hmm, might be a good idea -- do you have some test case where it
matters?  Maybe "log" or something?

If sqlite is like other RDBMSs, I'd think that we already have
indices on {manifest,file}_deltas.id, because usually unique
constraints generate implicit indices.  Could be wrong, though.
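
One way to check, sketched with Python's sqlite3 module. The table and column names below are stand-ins, not monotone's actual schema. SQLite does create an automatic index for a UNIQUE constraint, and it shows up in `PRAGMA index_list`:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-in for a deltas table with a unique constraint.
    CREATE TABLE deltas_demo (
        id TEXT NOT NULL,
        base TEXT NOT NULL,
        delta TEXT,
        UNIQUE (id, base)
    );
    -- Hypothetical stand-in for a certs table with no constraint covering id.
    CREATE TABLE certs_demo (id TEXT NOT NULL, name TEXT, value TEXT);
""")


def index_names(table):
    """List the names of all indices SQLite knows about for `table`."""
    return [row[1] for row in conn.execute(f"PRAGMA index_list({table})")]


auto = index_names("deltas_demo")     # implicit index from UNIQUE (id, base)
before = index_names("certs_demo")    # nothing indexed yet

conn.execute("CREATE INDEX certs_demo_id ON certs_demo (id)")
after = index_names("certs_demo")     # explicit index now present
```

Since id is the leftmost column of the unique constraint in this sketch, the implicit index can also serve lookups on id alone; a table with no constraint covering id would need the explicit index.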

-- Nathaniel

-- 
"On arrival in my ward I was immediately served with lunch. `This is
what you ordered yesterday.' I pointed out that I had just arrived,
only to be told: `This is what your bed ordered.'"
  -- Letter to the Editor, The Times, September 2000



