
Re: [Monotone-devel] [ANNOUNCE] monotone 0.19


From: Nathan Myers
Subject: Re: [Monotone-devel] [ANNOUNCE] monotone 0.19
Date: Fri, 6 May 2005 19:52:04 -0700
User-agent: Mutt/1.3.28i

On Fri, May 06, 2005 at 10:10:04AM -0700, Nathaniel Smith wrote:
> On Fri, May 06, 2005 at 09:13:14AM -0700, Nathan Myers wrote:
> > On Fri, May 06, 2005 at 01:15:29PM +0200, Nico -telmich- Schottelius wrote:
> > > 
> > > - You speak about compression/decompression, are there in general
> > >   processes, which can be optimized? I don't mean algorithms
> > >   or something like that.
> 
> ... there is both delta storage and
> compression going on, and they interact somewhat differently in
> netsync and the database; plus, as you note, hash checking.  At the
> moment, IIRC, the sequence on the server is:
>   -- server constructs plaintext of requested version, either by
>      finding it in its plaintext cache, by uncompressing a compressed
>      full version in the db, or by uncompressing some sequence of
>      deltas and applying them to another plaintext.
>   -- server hashes this plaintext, to make sure it hasn't been
>      corrupted
>   -- depending on what the client wanted, the server either:
>      -- compresses this plaintext and sends it
>      -- does an xdelta between this plaintext and the other plaintext
>         the client mentioned, compresses the resulting xdelta, and
>         sends it
> I don't recall exactly what the sequence on the client is, but it's
> similarly convoluted -- there might be some opportunities to
> streamline and avoid a passing compress/uncompress, but it's hard to
> get _too_ far, because of the different needs of local storage and
> network usage.  Keeping a safe design is also a bit of a factor; e.g.,
> one might say "pff, the server can skip checking the hash, let the
> client do it", but it's a bit tricky, because the data verification
> logic is very low level.  It has no idea what the data will be used
> for.  In the broader context of trusting monotone to work right and
> never accidentally give bad data, that's a good thing...
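
To make sure I am reading that sequence right, here is a rough sketch of
the server side in Python. Everything in it is an assumption made for
illustration: the db layout, the function names, zlib as the compressor,
SHA-1 as the hash, and toy_delta, which is only a trivial stand-in for
xdelta. It is not monotone's actual code.

```python
import hashlib
import zlib

def toy_delta(old: bytes, new: bytes) -> bytes:
    """Hypothetical stand-in for xdelta: common-prefix length plus the rest."""
    n = 0
    while n < min(len(old), len(new)) and old[n] == new[n]:
        n += 1
    return n.to_bytes(4, "big") + new[n:]

def toy_patch(old: bytes, delta: bytes) -> bytes:
    """Inverse of toy_delta."""
    n = int.from_bytes(delta[:4], "big")
    return old[:n] + delta[4:]

def plaintext_of(db, cache, ident):
    """Get plaintext from the cache, a compressed full text, or a delta chain.

    db maps ident -> (kind, compressed_payload, base_ident_or_None).
    """
    if ident in cache:
        return cache[ident]
    kind, payload, base = db[ident]
    data = zlib.decompress(payload)
    if kind == "delta":
        data = toy_patch(plaintext_of(db, cache, base), data)
    cache[ident] = data
    return data

def serve(db, cache, ident, client_base=None):
    """Build, hash-check, then send either a full text or a delta, compressed."""
    text = plaintext_of(db, cache, ident)
    if hashlib.sha1(text).hexdigest() != ident:
        raise ValueError("corrupt data for " + ident)
    if client_base is None:
        return zlib.compress(text)
    base_text = plaintext_of(db, cache, client_base)
    return zlib.compress(toy_delta(base_text, text))
```

Note that every request pays for at least one decompress, one hash, and
one compress, even when the bytes in the db were already compressed.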

It looks to me like the netsync protocol needs a way to ask for much 
lower-level constructs: basically, "Tell me the hashes of all the 
blobs you have that I will need to construct these versions; I'll 
tell you which of those to send.  Don't bother hashing, because I 
have to do that anyway.  Don't bother with any plaintext, just give 
me everything raw and compressed, straight from the database."
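
A minimal sketch of that exchange, in Python for brevity. The names
(blobs_needed, choose_missing, send_raw) and the db layout are made up
for this example, and I am assuming each stored blob records the id of
the blob it is a delta against, if any. Nothing here is an existing
netsync message.

```python
def blobs_needed(db, wanted):
    """Server: list ids of every stored blob needed to rebuild `wanted`.

    db maps blob_id -> (compressed_bytes, base_blob_id_or_None).
    """
    needed = []
    seen = set()
    for ident in wanted:
        # Walk the delta chain back to a full text.
        while ident is not None and ident not in seen:
            seen.add(ident)
            needed.append(ident)
            ident = db[ident][1]
    return needed

def choose_missing(offered, have):
    """Client: keep only the offered ids it does not already hold."""
    return [ident for ident in offered if ident not in have]

def send_raw(db, requested):
    """Server: hand over stored records untouched (no hashing, no plaintext)."""
    return {ident: db[ident] for ident in requested}
```

The server never decompresses or hashes anything here; it only follows
the base pointers and copies rows out of its database.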

On receipt, the client inflates each blob and verifies its hash, but 
doesn't bother constructing any final plaintext versions.  It just 
stuffs the verified, compressed blobs into its own database just as
they came over the wire.  When a user _asks_ for one of those versions
it just got the pieces for, it can construct plaintext, check hashes
again, and whatnot, as usual.  Probably they will never ask for most 
of the intermediate versions it picked up, so there's no point in 
fooling with them during netsync.
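
The receiving side might look roughly like this. It is a sketch under
assumptions: a blob's id is taken to be the SHA-1 of its decompressed
bytes, zlib stands in for whatever compression the database uses, and
receive_blobs and store are hypothetical names.

```python
import hashlib
import zlib

def receive_blobs(store, incoming):
    """Verify each blob's hash, then store the compressed bytes unchanged.

    incoming maps blob_id -> compressed_bytes, as received off the wire.
    store is the client's own blob_id -> compressed_bytes table.
    """
    for ident, compressed in incoming.items():
        raw = zlib.decompress(compressed)  # decompress only to check the hash
        if hashlib.sha1(raw).hexdigest() != ident:
            raise ValueError("hash mismatch for blob " + ident)
        store[ident] = compressed  # no recompression, no plaintext version built
    return store
```

Applying deltas and checking the hash of a finished version would then
happen only when somebody actually asks for that version.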

Nathan Myers