Date: Wed, 9 Apr 2014 20:42:24 +0200
From: Henrik Carlqvist
To: Philip Martin
Cc: hc528@poolhem.se, users@subversion.apache.org
Subject: Re: svn2cvsgraph, how to best handle merges?

On Wed, 26 Mar 2014 19:41:38 +0000
Philip Martin wrote:

> Henrik Carlqvist writes:
> > Would people hosting public svn repositories think that it would be
> > nice if some people using my tool would make one svn connection for
> > each revision in the repository?
>
> It's a user problem as well since making a request per revision doesn't
> scale well and will be very slow for large projects.

As the merge information you get from "svn log -g" is somewhat recursive,
the time seems to grow exponentially with the number of revisions (or
maybe rather with the number of merges).

However, my own test version of svn2cvsgraph, which calls svn once for
each revision, does a pclose on the svn call after reading the first log
entry and the second log entry (which might be a merge). With such a
solution the time grows linearly with the number of revisions, but svn
older than 1.7 will write some "svn: Write error: Broken pipe" messages
to stderr.

I did a benchmark comparing a box running Slackware 14.1 with svn 1.7.16
and another box running Slackware 13.1 with svn 1.6.16. On these machines
I tested three versions of svn2cvsgraph:

svn2cvsgraph 1.2: makes a single call to "svn log -q -g" on the
subversion repository root.
svn2cvsgraph 2.0: makes one call to "svn log -q -g" for each branch (and
trunk).

svn2cvsgraph 2.1beta: makes one call to "svn log -q -g" for each
revision; the call is aborted with pclose to avoid wasting time on
redundant information.

The benchmarks were run on a test subversion repository which was loaded
from a 2.9 GB subversion dump file of an actual project repository. The
repository contains 13570 revisions and 160 branches. 206 merges have
been logged in the repository since it was upgraded to version 1.5 of
subversion. The test repository was accessed as file:/// on an NFS
server. Times were measured with the /usr/bin/time command. These are
the results:

subversion  svn2cvsgraph  time                      result
1.7.16      1.2           6:13.70elapsed 17%CPU     No merges found
1.7.16      2.1beta       7:20.73elapsed 55%CPU     All merges found
1.7.16      2.0           13:49.48elapsed 45%CPU    23 merges lost
1.6.16      2.1beta       52:53.63elapsed 81%CPU    All merges found
1.6.16      1.2           134:55:22elapsed 41%CPU   All merges found
1.6.16      2.0           135:14:04elapsed 41%CPU   All merges found

Subversion 1.7.16 seems a lot faster than 1.6.16. Even though the tests
were run on different machines, and the Slackware 14.1 machine is
slightly faster than the Slackware 13.1 machine, I think that most of
the difference is because 1.7.16 gives less recursive merge information
to wade through.

No merges are found when only doing "svn log -q -g" on the repository
root with version 1.7.16. This is expected, as the behavior of
"svn log -g" changed with version 1.6.17.

23 merges were lost with "svn log -q -g" on every branch with 1.7.16;
this is most likely because of issue 4477.

Doing "svn log -q -g" for each revision and aborting the output with
pclose is the fastest way to get correct results for both version 1.6.16
and 1.7.16. However, this assumes that the repository is accessed with
file://.
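The per-revision approach with an early abort can be sketched roughly as below. This is only an illustration in Python, not the code in svn2cvsgraph: the function names, the use of "-r" to select one revision, and the assumption that a merged entry carries a "Merged via:" line in "svn log -q -g" output are all mine. Closing the pipe and waiting for the child is meant as the rough equivalent of pclose.

```python
import subprocess

# Assumption: svn prints a line of 72 dashes between log entries.
SEPARATOR = "-" * 72


def first_entries(lines, limit=2):
    """Collect at most `limit` log entries, then stop reading.

    `lines` is any iterable of text lines, for example the stdout pipe of
    a running "svn log -q -g" process. Each entry is a list of its lines.
    """
    entries = []
    current = []
    for raw in lines:
        line = raw.rstrip("\n")
        if line == SEPARATOR:
            if current:
                entries.append(current)
                current = []
                if len(entries) == limit:
                    break  # enough seen, leave the rest unread
        elif line:
            current.append(line)
    return entries


def merged_revision(entries):
    """Return the revision number of the second entry if it is marked
    as merged (assumed "Merged via:" line), otherwise None."""
    if len(entries) < 2:
        return None
    header, *rest = entries[1]
    if not any(item.startswith("Merged via:") for item in rest):
        return None
    # Header is assumed to look like "r4 | author | date".
    return int(header.split("|")[0].strip().lstrip("r"))


def merge_source(url, revision, svn="svn"):
    """Run one svn call for a single revision and abort it early.

    Hypothetical helper: `url` and `revision` are supplied by the caller.
    """
    cmd = [svn, "log", "-q", "-g", "-r", str(revision), url]
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True)
    try:
        entries = first_entries(proc.stdout)
    finally:
        proc.stdout.close()  # svn may report a broken pipe on stderr here
        proc.wait()
    return merged_revision(entries)
```

A caller would loop over the revision numbers and call merge_source(url, n) for each, so the cost per revision stays bounded by the first two log entries instead of the whole recursive merge history.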
Previously I used svn+ssh:// with svn 1.6.16 instead. With one call for
each branch, or a single call for the repository root, that takes about
24 hours (compared with about 135 hours above). However, using
svn+ssh:// instead of file:// when doing one call for each revision
would be a lot slower.

regards Henrik