Subject: Re: Problem Loading Huge Repository
From: Bruno Antunes <bema@dei.uc.pt>
Date: Fri, 17 Jun 2011 01:08:20 +0100
To: Geoff Hoffman
Cc: users@subversion.apache.org

On Jun 17, 2011, at 24:59 , Geoff Hoffman wrote:



> On Thu, Jun 16, 2011 at 4:05 PM, Bruno Antunes <bema@dei.uc.pt> wrote:
>> Hi,
>>
>> As part of the work of my PhD thesis I need to load the ASF Subversion repository into my own local repository in order to mine and extract information from the repository without overloading the ASF servers.
>>
>> I have downloaded the repository dump and started loading it into my own repository. But the repository is huge (~45GB), and loading it using 'svnadmin load' will take me days (~15).
>>
>> I tried 'svndumpfilter' to filter out some projects but I get the error 'svndumpfilter: Unsupported dumpfile version: 3'. I'm using 'svndumpfilter' version 1.6.12. Is there any way to overcome this error?
>>
>> Do you know any faster way to load the dump file or to filter out some projects/revisions so I can speed up the process?
>>
>> Thank you in advance.
>>
>> Best regards,
>> Bruno Antunes
>
> Just a thought... Do you need the revision history or only the current (head) revision?
>
> Guessing if you do not need the revision history then it will be much smaller and faster to svn export their-stuff -r HEAD

I need the entire revision history, because I need to extract historical information from the repository.
But I could use the revision history from specific projects only; I don't need the entire repository, which contains all the ASF projects. The problem is I can't filter these projects in the dump file.
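
For reference, the version number in the 'svndumpfilter' error comes from the first line of the dump file, which has the form 'SVN-fs-dump-format-version: N'. As far as I know, version 3 is what 'svnadmin dump --deltas' produces. Below is a minimal Python sketch for checking which version a dump declares before trying to filter it. The function name and the sample bytes are only illustrative, and it assumes the stream starts with that standard header line:

```python
import io


def dump_format_version(stream):
    """Return the format version declared on the first line of a dump stream.

    `stream` is any binary file-like object. Only the first line is read,
    so this stays cheap even for a ~45GB dump.
    """
    first = stream.readline().decode("ascii", errors="replace").strip()
    key, sep, value = first.partition(":")
    if key != "SVN-fs-dump-format-version" or not sep:
        raise ValueError("not a Subversion dump stream: %r" % first)
    return int(value)


# Illustrative in-memory sample; for a real dump use open(path, "rb").
sample = io.BytesIO(b"SVN-fs-dump-format-version: 3\n\n")
print(dump_format_version(sample))
```

This only reports the declared version; it does not convert or filter the dump.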

Best regards,
Bruno Antunes