www-infrastructure-dev mailing list archives

From Santiago Gala <santiago.g...@gmail.com>
Subject Re: git for other open source projects
Date Thu, 28 Feb 2008 18:07:07 GMT

On Thu, 28-02-2008 at 12:52 +0200, Jukka Zitting wrote:
> Hi,
> 
> On Thu, Feb 28, 2008 at 12:36 PM, Santiago Gala <santiago.gala@gmail.com> wrote:
> >  On Thu, 28-02-2008 at 01:51 +0200, Jukka Zitting wrote:
> >  > What are the practical situations where git is better than
> >  > the alternatives?
> >
> >  It is damned fast. This is a substantial part of it.
> 
> Good point and IMHO an important non-functional feature.
> 
> I guess the main speedup comes from having a local copy of the
> repository. This will probably not be feasible on the whole ASF scale,
> but having an easy way to mirror the full revision history of a
> project would be really cool.
> 

It is not only because the repository is local. Both mercurial and git
are heavily optimized: mercurial in terms of data structures and access
patterns, git more in terms of low-level design and a simple data model.
I mean, bzr, monotone and darcs all have local repositories too, but they
are way slower, even slower than subversion for big repositories.

git only packs repositories on demand (git repack, nowadays usually run
through git gc), and it packs them a lot. Having a high "information
density" keeps its memory footprint very small, which makes good use of
the cache and adds to the speed. Packing can often shrink a repository
10x. After importing the shindig repository it is about 10 megs; after gc
it shrinks to 1.1 megs (the working copy is 2.6 megs, repo+wc is 3.9
megs, a subversion checkout is 7.5 megs). I guess the number of small
files inside the .svn directories adds up, while the packed structure
makes it way smaller.

In theory a svn checkout should be about wc_size*2 (.svn keeps a pristine
copy of each file), but here it is much more than that. I suspect the
unused tails of disk clusters in lots of small files are the culprit (I'm
using du -sh to measure).
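
One rough way to check the cluster-slack suspicion is to compare the bytes
the files actually contain with the space du says is allocated for them.
This is only a sketch: the function names are made up for illustration,
and it assumes a POSIX shell with find, cat, wc, du and awk available.

```shell
#!/bin/sh
# Hypothetical helpers, not part of git or subversion.

# Total bytes of file content under a directory (ignores block rounding).
apparent_bytes() {
    find "$1" -type f -exec cat {} + | wc -c | tr -d ' '
}

# Space actually allocated on disk, in KiB, as reported by du.
allocated_kib() {
    du -sk "$1" | awk '{print $1}'
}

# Print both figures for a directory, defaulting to the current one.
report_slack() {
    dir=${1:-.}
    printf '%s: %s bytes of content, %s KiB allocated\n' \
        "$dir" "$(apparent_bytes "$dir")" "$(allocated_kib "$dir")"
}
```

Running report_slack on the subversion checkout and on the git-svn
directory should show whether the allocated figure is far above the
content figure, which is what lots of small files would produce.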

$ (cd ~/newcode/shindig && find . -print | wc -l) #subversion checkout
1173
$ (cd ~/newcode/git-shindig && find . -print | wc -l) #git-svn repo+wc
296

So about 4 times more files. shindig doesn't have that much history yet;
it would be interesting to run some larger-scale experiments, especially
with deeper histories, and see how it performs.
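
To see how much of each count comes from the metadata directories, the
same find-based counting can be split up. Again just a sketch with made-up
function names, assuming a find that supports -path:

```shell
#!/bin/sh
# Hypothetical helpers for comparing checkouts.

# Count every entry under a directory, like `find . -print | wc -l`.
count_entries() {
    find "$1" -print | wc -l | tr -d ' '
}

# Count entries that are, or live below, a directory named $2
# (for example .svn or .git).
count_metadata_entries() {
    find "$1" \( -path "*/$2" -o -path "*/$2/*" \) -print | wc -l | tr -d ' '
}
```

For example, count_metadata_entries ~/newcode/shindig .svn against
count_metadata_entries ~/newcode/git-shindig .git would show how many of
the entries are bookkeeping rather than working files.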


> I guess git-svn is closest to making that happen and it might be
> useful to spend time looking at the problems you mentioned.
> 
> I also recall seeing something similar (checkout including a local
> copy of relevant version history) being discussed on the Subversion
> mailing list some while ago. I'll see if I can dig it up, or perhaps
> someone closer to Subversion knows better.
> 

You probably mean svk; it uses part of the subversion code to provide a
mixed centralized-decentralized model. I don't know it that well, but I
found it more difficult to use than git. Re: ease of use as standalone
tools, I find bazaar the easiest, then mercurial a close second, then
git, svk and darcs (for different reasons). Re: subversion integration,
I'm finding git more or less usable; hg is slow and doesn't fold trunk/,
branches/ and tags/ like git-svn does. I haven't yet got the theoretical
subversion integration of bzr to work.

> BR,
> 
> Jukka Zitting
-- 
Santiago Gala
http://memojo.com/~sgala/blog/

