
Re: [Monotone-devel] Re: How new-style codeville merge works


From: Zack Weinberg
Subject: Re: [Monotone-devel] Re: How new-style codeville merge works
Date: Sun, 08 May 2005 11:00:51 -0700
User-agent: Gnus/5.11 (Gnus v5.11) Emacs/22.0.50 (gnu/linux)

Nathan Myers <address@hidden> writes:

> On Sat, May 07, 2005 at 09:29:17PM -0700, Zack Weinberg wrote:
>> I think it would be possible to make [the second half of an s.file]
>> append-only too, but that would probably hurt checkout performance.
>
> ... unless it started with the file position of the start of (the
> rest of the) metadata, or with just the fixed-size part of the 
> metadata, or with just the metadata that was already known the 
> last time the file had to be rewritten from the beginning, and
> the rest appended incrementally.  

I don't understand what you are trying to say here.

The second half of an s.file - the database of lines - generally has
to be rewritten from scratch whenever the file content changes.  This
would remain true even if it were separated from the metadata.  It
happens because the second half of an s.file looks something like this:

version 1 of line 1
version 2 of line 1
version 1 of line 2
version 2 of line 2
...

so whenever a line is inserted, it shoves everything after that point
down one.  I wish there were a good academic paper on weave format to
refer you to, but there isn't.  (The Rochkind paper on SCCS fails to
explain it very well.)
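
To make the shifting concrete, here is a rough Python sketch.  It is
not SCCS's or monotone's actual format: the (line_slot, version, text)
record layout and the insert_record helper are made up for
illustration, and a real weave also carries insert/delete control
information that is ignored here.

```python
import bisect

# Hypothetical record layout: (line_slot, version, text), kept sorted by
# line_slot first and version second, as in the listing above.
def insert_record(weave, record):
    """Insert record in (line_slot, version) order.

    Returns how many existing records had to move down to make room,
    i.e. how much of the on-disk database would have to be rewritten.
    """
    pos = bisect.bisect_left(weave, record)
    weave.insert(pos, record)
    return len(weave) - 1 - pos

weave = [
    (1, 1, "version 1 of line 1"),
    (1, 2, "version 2 of line 1"),
    (2, 1, "version 1 of line 2"),
    (2, 2, "version 2 of line 2"),
]

# A third version of line 1 lands in the middle of the database, so
# every record for line 2 onward is shoved down by one.
moved = insert_record(weave, (1, 3, "version 3 of line 1"))
```

In a real file, a change near the top means nearly the whole second
half moves, which is why it ends up being rewritten from scratch.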

My suspicion is that you could re-sort the on-disk weave by version
number, like so:

version 1 of line 1
version 1 of line 2
version 1 of line 3
...
version 2 of line 2
version 2 of line 4
...
version 3 of line 12

making it append-only, without having to add much information to the
metadata half of the file.  However, operating on this would be more
complicated, hence slower.  It's possible that it would be okay
performance-wise to read the whole file, rearrange it into the
conventional format in memory, then operate.  It's also possible that
we don't mind rewriting this half of the file from scratch every time,
e.g. because it's stored under compression and therefore gets
rewritten from scratch every time anyway.
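
A minimal sketch of the "read it all, rearrange in memory, then
operate" option, again in Python.  Everything here is an assumption
made for illustration: the (version, line_slot, text) record layout,
the three helper names, and the simplification that line slots are
stable identifiers.  Deletions, and where a brand-new line sits
relative to its neighbours, are left out; those are what the extra
metadata would have to cover.

```python
def append_version(log, version, changed):
    """Append the lines touched by one version; nothing already in the
    log is moved or rewritten.  `changed` maps line_slot -> text."""
    for slot in sorted(changed):
        log.append((version, slot, changed[slot]))

def to_conventional(log):
    """Rearrange the version-sorted log into the conventional
    (line_slot, version) order, entirely in memory."""
    return sorted((slot, ver, text) for ver, slot, text in log)

def checkout(log, version):
    """Return the newest text of each line slot as of `version`."""
    latest = {}
    for slot, ver, text in to_conventional(log):
        if ver <= version:
            latest[slot] = text  # later versions overwrite earlier ones
    return [latest[slot] for slot in sorted(latest)]

log = []
append_version(log, 1, {1: "a1", 2: "b1", 3: "c1"})
before = list(log)
append_version(log, 2, {2: "b2"})

unchanged_prefix = log[:len(before)] == before
v1 = checkout(log, 1)
v2 = checkout(log, 2)
```

The cost shows up in checkout: every operation pays for a full read
and a sort before it can do anything, which is the performance
question raised above.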

zw



