gzz-dev

[Gzz] 14th


From: Benja Fallenstein
Subject: [Gzz] 14th
Date: Thu, 15 Aug 2002 12:12:58 +0200
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.0) Gecko/20020615 Debian/1.0.0-3

Got some interesting things finished yesterday.

As mentioned in my other mail, I've implemented a YAMLVersionFormat for SliceVersions, in Jython, using a Python YAML library for parsing. While this may be slowish, I do think that with the experience gained there and in the header parsing util, I can easily rewrite the thing in Java, making it much more efficient (operating directly on input/output streams, which the Jython version currently doesn't do).

With the current saving infrastructure, there are three components working together to save a slice (or other document):
- A Version implementation (SliceVersion in the case of slices)
- A VersionFormat implementation (YAMLVersionFormat or SerializedMediaVersionFormat currently)
- A Filer implementation (MediaserverFiler currently)

The Versions and corresponding Version.Diffs represent the data in a document; the VersionFormat represents a way to serialize a subset of all Version implementations (YAMLVersionFormat can serialize only SliceVersions, SerializedMediaVersionFormat can serialize only java-serializable Versions); the Filer stores serialized representations somehow.
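
To make the division of labour concrete, here is a minimal sketch of the three roles in Python. None of this is the actual Gzz code: ToySliceVersion, JsonSliceFormat, InMemoryFiler, and the save/load helpers are hypothetical names, the method signatures are assumptions, and JSON stands in for YAML so that the example needs only the standard library.

```python
import hashlib
import json


class ToySliceVersion:
    """Plays the Version role: holds the data of a document."""

    def __init__(self, cells):
        self.cells = dict(cells)


class JsonSliceFormat:
    """Plays the VersionFormat role: serializes only one kind of Version."""

    def write(self, version):
        if not isinstance(version, ToySliceVersion):
            raise TypeError("this format can serialize only ToySliceVersion")
        # Sorted keys give the same bytes for the same data.
        return json.dumps(version.cells, sort_keys=True).encode("utf-8")

    def read(self, data):
        return ToySliceVersion(json.loads(data.decode("utf-8")))


class InMemoryFiler:
    """Plays the Filer role: stores serialized bytes somehow (here, a dict)."""

    def __init__(self):
        self._blocks = {}

    def store(self, data):
        key = hashlib.sha1(data).hexdigest()  # hash choice is an assumption
        self._blocks[key] = data
        return key

    def fetch(self, key):
        return self._blocks[key]


def save(version, fmt, filer):
    """The format turns the version into bytes; the filer keeps the bytes."""
    return filer.store(fmt.write(version))


def load(key, fmt, filer):
    return fmt.read(filer.fetch(key))
```

The point of the split is that any of the three parts can be swapped without touching the other two, e.g. a different format with the same filer.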

The other big thing that got finished yesterday is a new way of storing diffs:
- Assumption: If we have a Version, it is always serialized to the same canonical stream of bytes.
- This stream of bytes can be made into a block and given a mediaserver id.
- The mediaserver pointers always point not to a diff, but to a version('s id).
- The diffs know the versions they are between ("This is a diff between version X and Y").
- There is functionality in mediaserver to index and find "all diffs from/to version X".
- To load a version, we first try to find that block in mediaserver. If it's not there, we try to load a diff to that version and the version the diff is from, and apply the diff to the version, recreating the version and hashing it to make sure it is the correct one.
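
The loading procedure can be sketched roughly as follows. This is a toy model under stated assumptions, not the mediaserver API: ToyStore, canonical_bytes, and block_id are hypothetical names, versions are plain dicts, canonical JSON stands in for the real serialization, SHA-1 stands in for whatever id scheme mediaserver uses, and the diff format (keys to set, keys to delete) is invented for the example.

```python
import hashlib
import json


def canonical_bytes(version):
    """Serialize a version to one canonical byte stream (sorted keys)."""
    return json.dumps(version, sort_keys=True, separators=(",", ":")).encode("utf-8")


def block_id(data):
    """Content-derived id; the hash function is an assumption."""
    return hashlib.sha1(data).hexdigest()


class ToyStore:
    """Hypothetical stand-in for mediaserver: blocks plus a diff index."""

    def __init__(self):
        self.blocks = {}    # version id -> canonical bytes
        self.diffs_to = {}  # target version id -> [(source version id, diff)]

    def put_version(self, version):
        data = canonical_bytes(version)
        vid = block_id(data)
        self.blocks[vid] = data
        return vid

    def put_diff(self, old, new):
        """Record a diff that knows which two versions it lies between."""
        diff = {
            "set": {k: v for k, v in new.items() if k not in old or old[k] != v},
            "delete": [k for k in old if k not in new],
        }
        src = block_id(canonical_bytes(old))
        dst = block_id(canonical_bytes(new))
        self.diffs_to.setdefault(dst, []).append((src, diff))

    def load(self, vid, _seen=None):
        # First try to find the version's own block.
        if vid in self.blocks:
            return json.loads(self.blocks[vid].decode("utf-8"))
        seen = (_seen or set()) | {vid}  # guards against cycles of diffs
        # Otherwise look for a diff *to* this version and load its source.
        for src, diff in self.diffs_to.get(vid, []):
            if src in seen:
                continue
            try:
                base = self.load(src, seen)
            except KeyError:
                continue
            rebuilt = {k: v for k, v in base.items() if k not in diff["delete"]}
            rebuilt.update(diff["set"])
            # Hash the recreated version to make sure it is the right one.
            if block_id(canonical_bytes(rebuilt)) == vid:
                return rebuilt
        raise KeyError(vid)


# Usage: keep only the newest version plus a backward diff to the old one.
v1 = {"title": "draft", "cells": ["a"]}
v2 = {"title": "final", "cells": ["a", "b"]}
store = ToyStore()
id1 = store.put_version(v1)
id2 = store.put_version(v2)
store.put_diff(v2, v1)   # backward diff: from v2 to v1
del store.blocks[id1]    # the old version's block is deleted
assert store.load(id1) == v1
```

Because pointers name versions rather than diffs, deleting the block for v1 leaves every reference to it valid; it just becomes more expensive to load.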

This way, we can e.g. delete old versions without breaking the chain of diffs. It may not be the most efficient approach currently, but it makes backward diffing easy, which should help a great deal.

Of course, there should also be a more conventional diff-saving Filer (I'll be doing that soon as a proof-of-concept), but I think this holds more promise.

-b.




