subversion-commits mailing list archives

From Julian Foad <julian.f...@wandisco.com>
Subject Re: svn commit: r1040663 - in /subversion/trunk/subversion/libsvn_wc: wc-queries.sql wc_db.c
Date Wed, 01 Dec 2010 13:06:04 GMT
Daniel Shahaf wrote:
> So we loop over the remaining sha1's and remove each of them...
> I wonder if there is room for further optimization here?  e.g., does
> this prepare/reset the statement just once, or once per iteration?

Each iteration of this loop prepares, uses and resets a SQL statement,
and also removes a pristine file from disk.  So yes, there is room for
further optimization of the SQL part of that.
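
As a rough illustration of the "one statement, many bindings" idea raised
above, here is a minimal Python sketch using the standard sqlite3 module.
It is not the code in wc_db.c: the single-column "pristine" table and the
remove_pristines() helper are hypothetical simplifications, and removal of
the pristine files from disk is left out.

```python
import sqlite3


def remove_pristines(conn, sha1s):
    """Delete the given pristine rows using one statement bound repeatedly.

    Hypothetical helper: executemany() is handed a single SQL string and a
    sequence of parameter tuples, rather than a new statement per checksum.
    """
    conn.executemany(
        "DELETE FROM pristine WHERE checksum = ?",
        [(sha1,) for sha1 in sha1s],
    )
    conn.commit()


# Tiny in-memory store with an assumed, simplified schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pristine (checksum TEXT PRIMARY KEY)")
conn.executemany("INSERT INTO pristine VALUES (?)", [("a",), ("b",), ("c",)])

remove_pristines(conn, ["a", "c"])
remaining = [
    row[0]
    for row in conn.execute("SELECT checksum FROM pristine ORDER BY checksum")
]
```

In the C SQLite API the analogous pattern would be to prepare the statement
once outside the loop and only bind, step and reset it inside the loop.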

The main concern I was addressing was that the previous method was
*quadratic* in the total number of pristines in the store, because for
each one in the store it would scan the NODES and ACTUAL_NODE tables
looking for a reference to it.  I had noticed that even a no-op cleanup
took a very long time on a large WC.  It will help if I show some real
timings.
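
To make the difference in shape concrete, here is a small Python sketch
contrasting the two approaches.  It is an assumption-laden illustration, not
the actual queries from wc-queries.sql: the "pristine" and "nodes" tables are
cut down to the bare minimum, ACTUAL_NODE is omitted, and both function names
are made up for this example.

```python
import sqlite3


def make_store(n):
    """Build an in-memory store with n pristines, the last two unreferenced."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE pristine (checksum TEXT PRIMARY KEY)")
    # No index on nodes.checksum, so a lookup by checksum scans the table.
    conn.execute(
        "CREATE TABLE nodes (local_relpath TEXT PRIMARY KEY, checksum TEXT)"
    )
    conn.executemany(
        "INSERT INTO pristine VALUES (?)",
        [(f"sha1-{i}",) for i in range(n)],
    )
    conn.executemany(
        "INSERT INTO nodes VALUES (?, ?)",
        [(f"file-{i}", f"sha1-{i}") for i in range(n - 2)],
    )
    return conn


def unreferenced_per_pristine(conn):
    """For each pristine, search nodes for a reference (scan per pristine)."""
    result = []
    checksums = conn.execute("SELECT checksum FROM pristine").fetchall()
    for (checksum,) in checksums:
        hit = conn.execute(
            "SELECT 1 FROM nodes WHERE checksum = ? LIMIT 1", (checksum,)
        ).fetchone()
        if hit is None:
            result.append(checksum)
    return sorted(result)


def unreferenced_single_query(conn):
    """Let one query compute the set difference instead of looping."""
    rows = conn.execute(
        "SELECT checksum FROM pristine "
        "EXCEPT "
        "SELECT checksum FROM nodes WHERE checksum IS NOT NULL"
    )
    return sorted(row[0] for row in rows)


conn = make_store(5)
slow = unreferenced_per_pristine(conn)
fast = unreferenced_single_query(conn)
```

Both functions return the same list of unreferenced checksums; the first
does work proportional to (pristines x nodes), while the second reads each
table roughly once and leaves the matching to the database.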

Wall clock times for "svn cleanup" on a clean checkout of
^/subversion/branches@1040943 on my Linux system.

  r1040662 build: first time = 15 minutes, second = 14.8 minutes.

  r1040663 build: first time = 4.4s, best of many repetitions = 0.7s.

Now the algorithm runs in linear time, which is a *huge* win.  A
'cleanup' operation doesn't need to be blisteringly fast, so I don't
think it needs more optimisation.

I've edited the log message to clarify the main point, and to point out
the big-WC timing improvement.

- Julian


# r1040662 build
$ time ~/build/subversion-c/subversion/svn/svn cleanup branches/
real	15m4.962s
user	9m0.306s
sys	6m3.967s

# r1040663 build
$ time ~/build/subversion-c/subversion/svn/svn cleanup branches/
real	0m0.708s
user	0m0.436s
sys	0m0.212s
