monotone-devel


[Monotone-devel] git fast-export


From: Derek Scherger
Subject: [Monotone-devel] git fast-export
Date: Sun, 4 Jan 2009 23:09:35 -0700

I've spent a bit of holiday hacking time working on a git_export command for monotone, more as an experiment than anything else. I've committed the result to net.venge.monotone.fast-export for people to have a look at. There's probably not much preventing this from landing on mainline other than some documentation and possibly tests, although I'm not really sure how we would want to go about testing it beyond what I've already done. The fun part about a command like this is that I expect most users of it would be their own testers, verifying their own conversions.

This successfully (I think) converts the entire monotone database with 276 branches (more or less what you get when you pull '*' from monotone.ca) to a git repository. Here are some details on the conversion:

exported monotone database
- 174MB in size
- 276 branches
- 127 tags (with one duplicate name, monotone-viz-1.0.1-1)
- export time 83m42.134s (on a 2.0GHz pentium-m laptop)
- export file size 2.9GB
- 15245 revisions exported

imported git repository
- 719MB in size (before being repacked)
- import time 23m15.463s
- repack -adf time 3m14.385s
- packed repository size 60MB
- 277 branches (the extra one is "master")
- 126 tags (missing the duplicate above)

Three exported branch names, "net.prjek:tester", "net.prjet:tester/drop-for-propagate" and "prjek.net:tester", were changed (with sed) during the import process because git does not allow colons (and various other characters) in branch/ref names. I simply changed ":" and "/" in these names to ".". The "/" should have worked, but it did cause an error of some sort.
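For anyone who would rather script that renaming than do it with sed, here is a minimal sketch in Python. The function name sanitize_ref_name is made up for illustration, and it only applies the same ":" and "/" to "." substitution described above; it makes no attempt to cover the rest of git's ref name rules. It also checks that two branch names do not collapse into the same result.

```python
import re


def sanitize_ref_name(name: str) -> str:
    """Replace ':' and '/' with '.', mirroring the manual sed substitution."""
    return re.sub(r"[:/]", ".", name)


# The three branch names mentioned above.
names = [
    "net.prjek:tester",
    "net.prjet:tester/drop-for-propagate",
    "prjek.net:tester",
]

renamed = {name: sanitize_ref_name(name) for name in names}

# A substitution like this can map two different names to the same result,
# so make sure the sanitized names are still unique.
if len(set(renamed.values())) != len(renamed):
    raise ValueError("sanitized branch names collide")

for old, new in renamed.items():
    print(f"{old} -> {new}")
```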

The conversion was verified by checking out each of the 276 branches and 126 tags from both git and mtn and comparing the resulting workspaces. The script I used to do this verification was a bit dumb and failed to check out a few revisions, so these weren't compared. Using only the branch name failed in some cases because there were multiple heads, and using only a tag name failed in some cases because the tagged revisions had no branch certs. All of the branches and tags that did check out were identical according to diff -qr, so I'm reasonably confident that the new exporter basically works.
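This is not the script I used, but the comparison step can be sketched roughly as follows in Python. It walks two checked-out workspaces and reports files that exist on only one side or whose contents differ, much as diff -qr would. The function names are made up for illustration, and skipping the ".git" and "_MTN" bookkeeping directories is an assumption about what should be left out of the comparison.

```python
import filecmp
import os
import tempfile

# Assumption: bookkeeping directories that should not be compared.
SKIP_DIRS = {".git", "_MTN"}


def list_files(root):
    """Return the set of file paths under root, relative to root."""
    files = set()
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune bookkeeping directories so os.walk does not descend into them.
        dirnames[:] = [d for d in dirnames if d not in SKIP_DIRS]
        for filename in filenames:
            full = os.path.join(dirpath, filename)
            files.add(os.path.relpath(full, root))
    return files


def compare_trees(left, right):
    """Return (only_in_left, only_in_right, differing) as sorted lists."""
    left_files, right_files = list_files(left), list_files(right)
    differing = sorted(
        path
        for path in left_files & right_files
        # shallow=False forces a byte-by-byte content comparison.
        if not filecmp.cmp(
            os.path.join(left, path), os.path.join(right, path), shallow=False
        )
    )
    return sorted(left_files - right_files), sorted(right_files - left_files), differing


# Small demonstration with two throwaway workspaces.
with tempfile.TemporaryDirectory() as git_ws, tempfile.TemporaryDirectory() as mtn_ws:
    for ws in (git_ws, mtn_ws):
        with open(os.path.join(ws, "README"), "w") as f:
            f.write("same contents\n")
    os.mkdir(os.path.join(git_ws, ".git"))
    with open(os.path.join(git_ws, ".git", "HEAD"), "w") as f:
        f.write("ref: refs/heads/master\n")
    os.mkdir(os.path.join(mtn_ws, "_MTN"))
    result = compare_trees(git_ws, mtn_ws)

print(result)
```

Three empty lists mean the two workspaces matched. Anything else names the paths to look at.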

I suspect that the various other git fast-import conversion scripts that exist for monotone are probably slower and less robust than this implementation (unless they work similarly from rosters), which uses the monotone internals to do the work. I spent a bit of time initially trying to export revisions using the revision data structures, but this didn't work very well. Git only deals with files, and trying to correctly order a mix of directory and file renames from monotone revisions was difficult. Ultimately I didn't use the revision data structures at all but built up a similar files-only revision representation by comparing rosters, much like what is done for make_cset but ignoring directories and producing only file deletions, renames and additions. This works much better and correctly handles pivot_root and a few other odd things that proved difficult when working with revisions.
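To illustrate the idea (this is a toy model in Python, not the actual code on the branch), assume a roster is reduced to a mapping from a file's node id to its full path, with directories left out entirely. Comparing two such mappings yields file deletions, renames and additions directly, and a directory rename simply shows up as a rename of every file under it. The function name and the dict representation are made up for illustration.

```python
def diff_file_rosters(old, new):
    """Compare two {node_id: path} mappings that contain files only.

    Returns (deletes, renames, adds), where deletes and adds are sorted
    lists of paths and renames is a sorted list of (old_path, new_path).
    """
    deletes = sorted(old[node] for node in old.keys() - new.keys())
    adds = sorted(new[node] for node in new.keys() - old.keys())
    renames = sorted(
        (old[node], new[node])
        for node in old.keys() & new.keys()
        if old[node] != new[node]
    )
    return deletes, renames, adds


# Hypothetical rosters: directory "src" was renamed to "lib", README was
# dropped and NEWS was added. No directory entries appear in either mapping.
old_roster = {1: "src/main.c", 2: "src/util.c", 3: "README"}
new_roster = {1: "lib/main.c", 2: "lib/util.c", 4: "NEWS"}

deletes, renames, adds = diff_file_rosters(old_roster, new_roster)
print("deletes:", deletes)
print("renames:", renames)
print("adds:", adds)
```

Content changes to files that keep their path are left out of this sketch to keep it short.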

This exporter does not (yet) handle all rename ordering issues that are possible. For example, <rename a b> followed by <rename b c> will probably fail on import unless it is executed as <rename b c> followed by <rename a b>. Similarly, <rename a b> followed by <rename b a>, which is indeed possible, will probably fail on import and requires the introduction of a third, temporary file. These problems can be fixed in the exporter, and can also be fixed in the exported data by re-ordering renames as required.
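One possible way to do that re-ordering, sketched in Python under some assumptions: the renames of a single revision are given as a source to destination mapping, no two renames share a destination, and the temporary name "__rename_tmp_N" is a made-up placeholder that does not collide with any real path. The exporter does not do this today. A rename is emitted once its destination is no longer waiting to be moved out of the way, and a cycle is broken by parking one file under a temporary name.

```python
def order_renames(renames):
    """Order simultaneous renames so they can be applied one at a time.

    renames maps source path -> destination path. Returns a list of
    (source, destination) pairs that is safe to execute sequentially.
    """
    pending = dict(renames)
    ordered = []
    temp_count = 0
    while pending:
        # A rename is ready when nothing still pending needs to move
        # out of its destination first.
        ready = [src for src in sorted(pending) if pending[src] not in pending]
        if ready:
            src = ready[0]
            ordered.append((src, pending.pop(src)))
        else:
            # Everything left is part of a cycle: park one file under a
            # temporary name (hypothetical) to break it.
            src = sorted(pending)[0]
            tmp = f"__rename_tmp_{temp_count}"
            temp_count += 1
            ordered.append((src, tmp))
            pending[tmp] = pending.pop(src)
    return ordered


# The chain from the example above: b must move to c before a moves to b.
chain = order_renames({"a": "b", "b": "c"})
print(chain)

# The swap from the example above: needs a third, temporary file.
swap = order_renames({"a": "b", "b": "a"})
print(swap)
```

For the chain this gives <rename b c> then <rename a b>, and for the swap it gives three renames that go through the temporary name.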

WARNING: Please don't bet your life on this implementation! If you do use it to convert a repository you must do careful verification of the converted results. WORKSFORME is the only assurance I can make.

This feels a bit like throwing in the proverbial towel and I hope this doesn't elicit any ill-will from the current monotone crowd. I'm not really planning on converting my personal stuff from monotone any time soon but knowing it can be done without losing information is nice. I'm still happy to contribute to monotone but with 2 small kids my free/hacking time is pretty limited.

Cheers,
Derek

