From Paul Davis <>
Subject Re: git conversion (was: Re: dev Digest of: thread.17379)
Date Mon, 08 Aug 2011 19:23:37 GMT
On Mon, Aug 8, 2011 at 1:57 PM, Dustin Sallings <> wrote:
>        [sorry for screwing up the subject]
> On Aug 8, 2011, at 11:22 AM, Paul Davis wrote:
>> Nice! Any hints you have about validating SVN->Git conversions or
>> tooling would be greatly appreciated. I don't really have much other
>> than the obvious Graphviz plotting tool. Beyond that I don't have
>> anything other than getting each TLP to verify their own history.
>        I wrote a tool that would take two git trees that had no common history, but
> were expected to converge on the same tree state and show a graphical diff as part of the
> memcached conversion.  It produces output that looks like this:
>        As it is, it won't help you much if you're planning to move source around
> *during* the conversion, but it does a good job of verifying that you didn't break anything
> in rebasing, updating commit messages, changing authors, committers, etc...
>        Basically, you just need two refs in a single git repo (old-branch vs. rewritten-stuff)
> and run "git tree-converge old-branch rewritten-stuff" and you get tons of html spewed at
>> I'm also not sure if it makes a difference, but the ASF SVN repo is
>> one huge monolithic thing, so it's a lot of project histories
>> intertwined which I'm looking forward to finding awesome conversion
>> bugs with.
>        The biggest problem I've had with such things is actually having svn be willing
> to give up the history.  As long as you can get it out in any way at all, we can fix it.
> The worst case would be doing a complete reproduction of the svn history in a monolithic
> git repo.  I can work with that.  It's likely unnecessary.
>        My experience with svn has never been good and I wouldn't call myself an expert
> there, but if we can get the content out successfully, I can help you do all kinds of junk
> with it.
> --
> dustin sallings

I'm mostly looking for tools to let people look at a Git history and
verify that it matches their SVN history. CouchDB's SVN to Git
migration is basically a test case for all of the ASF. Assuming it
goes well, there will be other projects wanting to switch, so I'm
trying to think ahead to what they might want to see.
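
As a rough sketch of the kind of check this could mean (this is not
Dustin's tool, and the function names are made up for illustration):
Git gives identical content an identical tree hash, so two refs with no
shared history can be compared tree by tree. The sketch assumes both
refs live in one repo, that only first-parent history matters, and that
the two histories line up commit for commit, which a real conversion
may not guarantee.

```python
import subprocess
from itertools import zip_longest


def tree_sequence(ref, repo="."):
    """Return [(commit_sha, tree_sha), ...] for ref, oldest first.

    Hypothetical helper: walks first-parent history only.
    """
    out = subprocess.run(
        ["git", "-C", repo, "log", "--first-parent", "--reverse",
         "--format=%H %T", ref],
        check=True, capture_output=True, text=True,
    ).stdout
    return [tuple(line.split()) for line in out.splitlines() if line]


def first_divergence(old, new):
    """Return the index of the first step whose tree hashes differ.

    Commit hashes are ignored on purpose: rewriting authors or messages
    changes them, but leaves the tree hash alone. A length mismatch
    counts as a divergence. Returns None if the sequences agree.
    """
    for i, (a, b) in enumerate(zip_longest(old, new)):
        if a is None or b is None or a[1] != b[1]:
            return i
    return None
```

Calling first_divergence(tree_sequence("old-branch"),
tree_sequence("rewritten-stuff")) would then return None when every step
of the rewritten history has the same content as the original.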

It's always possible to get the data out, but the thing to realize is
that the ASF SVN repo is over 65 GiB with over 1.1M commits. Brute
forcing the conversion is probably not the most sane approach if it
can be avoided.

Thanks for the input.

