subversion-users mailing list archives

From Nico Kadel-Garcia <>
Subject Re: Problem with svndumpfilter
Date Thu, 07 Jun 2018 12:05:10 GMT
On Thu, Jun 7, 2018 at 3:04 AM Stefan Sperling <> wrote:
> On Wed, Jun 06, 2018 at 03:12:20PM -0400, Alfred von Campe wrote:
> > I’m trying to remove two sensitive directories from a repo so we can have a 3rd
> > party work on it.  I first dumped the entire repo, and now I’m trying to remove two directories
> > from one particular branch.  But svndumpfilter keeps failing as follows:
> >
> > $ svndumpfilter exclude branches/develop/dir1 branches/develop/dir2 < repo.dump > repo-nodir12.dump
> > svndumpfilter: E200003: Invalid copy source path '/branches/develop/dir2'
> >
> > I’ve tried this both from a full incremental dump of the repo as well as a non-incremental
> > dump of the repo starting from the revision that branches/develop was created.  It always
> > fails after the exact same revision.
> >
> > Is there anything I can do to work around this issue?
> >
> > Alfred
> Yes, you can update to 1.10 and use svnadmin dump --exclude
> instead of using svndumpfilter.
> See
> An alternative that works with earlier releases is to set up svnsync
> replication and configure authz access rules for the sync user which
> forbid read access to the paths you want to exclude. svnsync will deal
> with missing copy sources by translating copies into additions.
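
For reference, a minimal sketch of the first suggestion above, assuming
Subversion 1.10 or later is installed on the machine that holds the
repository. The function name and its two arguments are made up for
illustration; the two excluded paths are the ones from the original
report.

```shell
# Hypothetical helper: dump a repository while leaving out the two
# directories from the original report. Needs svnadmin from 1.10+.
dump_without_dirs() {
    repo_path=$1   # filesystem path to the repository (assumed argument)
    out_file=$2    # where to write the filtered dump (assumed argument)
    svnadmin dump "$repo_path" \
        --exclude /branches/develop/dir1 \
        --exclude /branches/develop/dir2 \
        > "$out_file"
}
```

It would be called as "dump_without_dirs /path/to/repo repo-nodir12.dump",
and the result loaded into an empty repository with "svnadmin load".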

There is also a fairly nasty and somewhat hazardous trick I've used
effectively a few times to clean up a historically messy SVN layout.
Import it to git with git svn, trim debris branches and tags and
out-of-band content ruthlessly, use "git gc --aggressive" to flush
loose objects or branches *from the history*, then export that with
git svn into a new Subversion repository.  There are risks: git
doesn't handle keywords the same way Subversion does, for example, so
the transfer needs to be reviewed carefully for svn:keywords,
svn:ignore, and svn:eol-style handling. But when you have a messy Subversion
layout where people dumped oddly named branches or parts of branches
in weird locations, or embedded bulky binary files accidentally and
left copies scattered around the history, it can be an invaluable
cleanup tool. It also doesn't require access to the Subversion server
to run "svnadmin dump", and it can be updated from the current running
Subversion master.
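
A rough sketch of the import-and-trim half of that trick, under some
assumptions: the repository uses the standard trunk/branches/tags
layout, git svn names its remote-tracking refs with the "origin/"
prefix (the default in recent git releases), and the function names
and arguments are hypothetical. The export back into a new Subversion
repository is left out on purpose, since the exact steps depend on how
the new repository is laid out.

```shell
# Hypothetical helper: import a Subversion repository with git svn,
# then delete the remote-tracking refs named on the command line.
# The body runs in a subshell so the caller's directory is unchanged.
import_and_trim() (
    src_url=$1     # URL of the existing Subversion repository
    workdir=$2     # directory for the new git working copy
    shift 2        # remaining arguments: refs to drop
    git svn clone --stdlayout "$src_url" "$workdir"
    cd "$workdir" || exit 1
    for ref in "$@"; do
        git branch -r -d "$ref"
    done
)

# Hypothetical helper: list the properties worth reviewing by hand in a
# Subversion working copy, since git does not carry them over as-is.
show_svn_props() {
    wc_path=$1
    for prop in svn:keywords svn:ignore svn:eol-style; do
        echo "== $prop =="
        svn propget -R "$prop" "$wc_path"
    done
}
```

For example, "import_and_trim URL work origin/old-experiment" would
clone into ./work and drop one debris branch; run "git branch -r" in
the clone first to see what the refs are actually called.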

Part of the key is using "git gc --aggressive" to flush the
history of pruned content. Yes, this flushes history, and is
considered a sin, Sin, ***SIN*** by those who consider a complete and
pristine history of the entire source tree the whole point of a source
control system. But in practice, most branches and tags are
pointless after long enough, and it only takes a few accidental
commits of bulky binaries or of inappropriately imported content to
clutter and even legally encumber a source control system. Like
pruning any history, it needs to be done cautiously or important
material can be lost.
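
One caveat worth illustrating: git only discards objects that nothing
refers to any more, and reflog entries count as references, so in my
understanding the gc usually needs to be paired with a reflog expiry
and an immediate prune. The script below is a self-contained
demonstration in a throwaway repository; the branch name, file names,
and identity settings are all invented for the example.

```shell
#!/bin/sh
# Demonstration only: everything happens in a fresh temporary directory.
set -eu
demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.name "Example"
git config user.email "example@example.invalid"

echo "keep" > README
git add README
git commit -q -m "base"
main=$(git symbolic-ref --short HEAD)   # name of the default branch

# A hypothetical "debris" branch carrying an unwanted file.
git checkout -q -b debris
echo "unwanted payload" > blob.bin
git add blob.bin
git commit -q -m "accidental import"
blob=$(git rev-parse HEAD:blob.bin)     # object id of the unwanted file

# Drop the branch; its commits are now reachable only through reflogs.
git checkout -q "$main"
git branch -q -D debris

# Expire the reflogs, then let gc prune unreachable objects right away.
git reflog expire --expire=now --all
git gc -q --aggressive --prune=now

# Check whether the unwanted blob still exists in the object store.
if git cat-file -e "$blob" 2>/dev/null; then
    result="still present"
else
    result="pruned"
fi
echo "$result"
```

On a real import, try this on a copy first: once the objects are
pruned there is no getting them back from that clone.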
