monotone-devel

Re: [Monotone-devel] [ANNOUNCE] monotone 0.19


From: Nathaniel Smith
Subject: Re: [Monotone-devel] [ANNOUNCE] monotone 0.19
Date: Fri, 6 May 2005 10:10:04 -0700
User-agent: Mutt/1.5.9i

On Fri, May 06, 2005 at 09:13:14AM -0700, Nathan Myers wrote:
> On Fri, May 06, 2005 at 01:15:29PM +0200, Nico -telmich- Schottelius wrote:
> > 
> > - You speak about compression/decompression; are there processes in
> >   general that can be optimized? I don't mean algorithms or anything
> >   like that.
> 
> In particular, must it be compressed/decompressed at all?  Can't
> the server send it compressed?  If it must be decompressed to check 

The server does send it compressed, FWIW.

> the hash, might the compressed image not be retained, to push into 
> the database without need to compress it again?  

It's not quite that simple, because there is both delta storage and
compression going on, and they interact somewhat differently in
netsync and the database; plus, as you note, hash checking.  At the
moment, IIRC, the sequence on the server is:
  -- server constructs plaintext of requested version, either by
     finding it in its plaintext cache, by uncompressing a compressed
     full version in the db, or by uncompressing some sequence of
     deltas and applying them to another plaintext.
  -- server hashes this plaintext, to make sure it hasn't been
     corrupted
  -- depending on what the client wanted, the server either:
     -- compresses this plaintext and sends it
     -- does an xdelta between this plaintext and the other plaintext
        the client mentioned, compresses the resulting xdelta, and
        sends it
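The server-side sequence above can be sketched roughly as follows. This is a toy illustration, not monotone's actual code: all function and table names are invented, the "delta" functions are trivial stand-ins for xdelta, and monotone's real storage is SQLite-backed.

```python
import hashlib
import zlib

def make_delta(base, target):
    # Stand-in for xdelta: a real implementation emits copy/insert ops
    # against the base, rather than shipping the whole target.
    return target

def apply_delta(base, delta):
    # Trivial inverse of the stand-in above.
    return delta

def reconstruct_plaintext(db, ident):
    """Build the full text of a version: plaintext cache hit, stored
    compressed full version, or a chain of compressed deltas."""
    if ident in db["plaintext_cache"]:
        return db["plaintext_cache"][ident]
    if ident in db["fulls"]:
        return zlib.decompress(db["fulls"][ident])
    base_ident, zdelta = db["deltas"][ident]
    base = reconstruct_plaintext(db, base_ident)
    return apply_delta(base, zlib.decompress(zdelta))

def serve_version(db, ident, client_base=None):
    plaintext = reconstruct_plaintext(db, ident)
    # Hash check: never serve data that doesn't match its identifier.
    if hashlib.sha1(plaintext).hexdigest() != ident:
        raise RuntimeError("stored data is corrupt: " + ident)
    if client_base is None:
        return zlib.compress(plaintext)                # full, compressed
    base = reconstruct_plaintext(db, client_base)
    return zlib.compress(make_delta(base, plaintext))  # delta, compressed
```

Note how compression and delta storage interleave: the plaintext has to be fully reconstructed just to check the hash, even when the client only wants a delta.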
I don't recall exactly what the sequence on the client is, but it's
similarly convoluted -- there might be some opportunities to
streamline things and avoid an extra compress/uncompress pass, but it's
hard to get _too_ far, because of the different needs of local storage and
network usage.  Keeping a safe design is also a bit of a factor; e.g.,
one might say "pff, the server can skip checking the hash, let the
client do it", but it's a bit tricky, because the data verification
logic is very low level.  It has no idea what the data will be used
for.  In the broader context of trusting monotone to work right and
never accidentally give bad data, that's a good thing...
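The "verification is very low level" point can be illustrated with a minimal sketch (class and method names are mine, not monotone's): every read path checks the hash, with no knowledge of what the caller intends to do with the data.

```python
import hashlib

class VerifyingStore:
    """Content-addressed store that verifies on every read."""

    def __init__(self):
        self._blobs = {}

    def put(self, data):
        ident = hashlib.sha1(data).hexdigest()
        self._blobs[ident] = data
        return ident

    def get(self, ident):
        data = self._blobs[ident]
        # Checked on every access -- even a caller that could in
        # principle delegate checking (say, to the client) still pays
        # for it, because this layer doesn't know who's asking or why.
        if hashlib.sha1(data).hexdigest() != ident:
            raise RuntimeError("hash mismatch: " + ident)
        return data
```

The cost is redundant hashing; the benefit is that no code path, however obscure, can hand out corrupted data.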

> Anyway, is there any point to compressing, these days?  Maybe it 
> should be a local repository option.  Netsync would adapt to either.

There's definitely a point to compressing on the wire.  In the db it
might be a noticeable win (especially for high-traffic servers), but
then again, it might not -- sanity checking has dominated basic db
traffic in CPU usage for a while now, though that might change as we
make sanity checking faster.

If you're worried about db read/write speed, not doing delta
compression might be the first thing to try; you could do this right
now without any changes to the database format or the version fetching
logic -- you'd just need to tweak the version storage logic not to make
the deltas in the first place.
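In terms of the earlier sketch, the tweak would live entirely in the storage path (again, names are illustrative, not monotone's API): always write a compressed full version and never a delta, leaving the fetch logic untouched.

```python
import hashlib
import zlib

def store_version(db, data):
    """Store a version as a compressed full text, never as a delta."""
    ident = hashlib.sha1(data).hexdigest()
    # The delta-compressing variant would instead pick a suitable base
    # and record (base_ident, compressed_delta) in db["deltas"].
    # Skipping that trades disk space for cheaper reads and writes,
    # with no change to the database format or the fetch logic.
    db["fulls"][ident] = zlib.compress(data)
    return ident
```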

-- Nathaniel

-- 
Eternity is very long, especially towards the end.
  -- Woody Allen
