subversion-dev mailing list archives

From Joe Schaefer <joe_schae...@yahoo.com>
Subject Re: eliminating sequential bottlenecks for huge commit and merge ops
Date Thu, 05 Jan 2012 00:20:40 GMT
>________________________________
> From: Greg Stein <gstein@gmail.com>
>To: Joe Schaefer <joe_schaefer@yahoo.com> 
>Cc: dev@subversion.apache.org 
>Sent: Wednesday, January 4, 2012 7:08 PM
>Subject: Re: eliminating sequential bottlenecks for huge commit and merge ops
> 
>
>
>On Jan 4, 2012 1:34 PM, "Joe Schaefer" <joe_schaefer@yahoo.com> wrote:
>>
>> As Daniel mentioned to me on irc, subversion doesn't use threading
>> internally, so things like client side commit processing and merge
>> operations are done one file at a time IIUC.
>>
>> Over in the openoffice podling we have a use-case for a 9GB working copy
>> that regularly sees churn on each file in the tree.  commit and merge
>> operations for such changes take upwards of 20min, and I'm wondering
>> if there's anything we could do here to reduce that processing time
>> by 2x or better by threading the per-dir processing somehow.
>>
>> Thoughts?
> We've always taken the position that the amount of effort or size of
> delta/data is proportional to the size of the change. If you change all
> of a 9GB working copy, then you should expect svn to take a good chunk
> of time and space.
> IOW, stop doing that :-) 
> That said, even if we were desirous of "fixing" this(*), we would have
> a hard time doing it using threads. The Subversion client is pretty solidly
> single-threaded. We take no precautions for operation in a multi-threaded app.
>
> Cheers,
> -g
> (*) I'd be interested in what they are doing. Is this a use case we might see
> elsewhere? Or is this something silly they are doing, that would not be seen elsewhere?


They're using the ASF CMS to manage the www.openoffice.org website, which is full
of 10 years' worth of accumulated legacy spanning 50 or so different natural languages.
The CMS is "too slow" during commits to template files and the like, which change
the generated HTML content of virtually every file on the site.

There are two ways I could mitigate this issue with them if Subversion isn't interested
in working on this use case:

1) convert the templating system to use SSI, which would eliminate most of the
sledgehammer-type commits.

2) deploy the CMS on an SSD-backed system.

FWIW (2) is scheduled to happen in the not-too-distant future anyway, and I personally
don't want to encourage the use of SSI with the CMS even for oddball situations
like this one.

