subversion-users mailing list archives

From Bryon Winger <>
Subject Re: Splitting out project from repo
Date Tue, 02 Apr 2013 21:32:05 GMT
I am going through a similar process myself and have some questions about
your concerns. I'm not trying to rock the boat, just looking for clarity on
a few points. For perspective, I am working with around 300 individual
projects in a 70+ GB repository containing over 300k revisions.

> If I understand correctly, you manually retrieve each version where 
> the given path/project has changed in any way to afterwards dump those 
> revisions. Why is this better/faster than using svndumpfilter with 
> specifying an include path, but without the need to post process the 
> dump files? 

I personally don't see the advantage to waiting around for svnadmin dump
to process every unrelated revision. For one project, I am only concerned
with about 200 revisions, spread out over 210k unrelated revisions.


# This example took around 8 hours:

svnadmin dump /path/to/master | svndumpfilter --drop-empty-revs \
--re-number-revs include $PROJECT > $PROJECT.dump

# However, when I run this on the same project:

for rev in `svn log -r0:HEAD file:///path/to/master/$PROJECT | egrep \
    "^r[0-9]+ \|" | cut -d " " -f1`; do
   svnadmin dump --incremental -r ${rev:1} /path/to/master | svndumpfilter \
                                             include $PROJECT >>

… I can have a usable dump file in under 30 seconds. I realize this will
take longer for larger projects, but I think it makes my point. ‘svnadmin
dump’ is still creating a full dump stream for each revision before
svndumpfilter examines that revision to decide whether to keep it or not.
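The per-revision approach above could be wrapped up roughly as follows. This is only a sketch: the function names changed_revs and dump_project are made up for illustration, and the output file argument is an assumption, since the redirect target in the loop above did not survive in the archived message.

```shell
#!/bin/sh
# Sketch only. changed_revs, dump_project and their arguments are
# hypothetical names, not part of Subversion.

# Read `svn log` output on stdin and print the revision numbers of the
# header lines ("r123 | author | date ..."), without the leading "r".
changed_revs() {
    grep -E '^r[0-9]+ \|' | cut -d ' ' -f1 | cut -c2-
}

# $1 = absolute path of the repository on disk
# $2 = project path inside the repository (no leading slash)
# $3 = output dump file (assumed name, chosen by the caller)
dump_project() {
    repo=$1 project=$2 out=$3
    : > "$out"    # start from an empty file so reruns do not double up
    for rev in $(svn log -q -r0:HEAD "file://$repo/$project" | changed_revs); do
        # Dump one revision incrementally, keep only the project's paths,
        # and append the result to the output file.
        svnadmin dump --quiet --incremental -r "$rev" "$repo" |
            svndumpfilter --quiet include "$project" >> "$out"
    done
}
```

A call might look like `dump_project /path/to/master "$PROJECT" "$PROJECT.dump"`. Only the revisions that svn log reports for the project are ever handed to svnadmin dump, which is where the time saving comes from.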


> Are you sure your approach doesn't need other paths 
> from the repo, e.g. other source paths from copy operations for 
> projects or stuff like that? 

I absolutely agree with checking for this. You can’t successfully pull
a single path using svnadmin dump / svndumpfilter if there are copies from
a location outside of whatever you are filtering for.
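One way to check for such copies before filtering is to scan the verbose log for copy sources that fall outside the project. A minimal sketch, assuming the usual `svn log -v` layout where copied paths are shown as "A /dest (from /source:rev)"; the function name outside_copy_sources is hypothetical:

```shell
#!/bin/sh
# Sketch only. outside_copy_sources is a hypothetical helper name.

# Read `svn log -v` output on stdin and print every copy source that is
# not the given repository path ($1, with leading slash) or below it.
outside_copy_sources() {
    prefix=$1
    # Pull the source path out of changed-path lines that record a copy.
    sed -n 's|^   [A-Z] .* (from \(/.*\):[0-9][0-9]*)$|\1|p' |
        # Keep only sources that are neither the prefix nor under it.
        awk -v p="$prefix" 'index($0, p "/") != 1 && $0 != p' |
        LC_ALL=C sort -u
}
```

For example, `svn log -v -r0:HEAD file:///path/to/master/$PROJECT | outside_copy_sources "/$PROJECT"` would list the outside locations, if any. An empty result suggests an include filter on the project alone should be enough; anything listed would need to be included as well or handled some other way.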


I did notice that using svnrdump pointing to url/project seems to get
around the outside-copy-sources issue, but I think that’s another
discussion altogether.


> > svnadmin dump $repo --quiet -r $rev --incremental >> $project.$rev.bak 
> Adding to revision files with >> should be impossible in your 
> approach. 

Are you saying that appending to an existing dump file in general is a
problem or just with all of his node-path processing? I have had no
trouble appending to existing dump files.
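For anyone who wants a quick sanity check on an appended dump file, counting its revision records and comparing against the number of revisions dumped is cheap. A rough sketch; count_dump_revs is a made-up name, and `grep -a` (treat binary input as text) is a GNU/BSD extension rather than POSIX:

```shell
#!/bin/sh
# Sketch only. count_dump_revs is a hypothetical helper name.

# Print how many "Revision-number: N" record headers the dump file $1
# contains. File contents that happen to include such a line would be
# counted too, so treat the number as a sanity check, not a proof.
count_dump_revs() {
    grep -a -c '^Revision-number: [0-9][0-9]*$' "$1"
}
```

Running `count_dump_revs $PROJECT.dump` after the loop should print the same number as the count of revisions that were fed into it.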



Bryon Winger
