jackrabbit-oak-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Michael Dürig <mdue...@apache.org>
Subject Re: Write performance for large child node lists
Date Tue, 22 May 2012 10:29:25 GMT


On 22.5.12 8:48, Jukka Zitting wrote:
> Hi,
>
> On Tue, May 22, 2012 at 12:00 AM, Michael Dürig<mduerig@apache.org>  wrote:
>> During my work on OAK-102 and OAK-106 I was eventually convinced by Jukka to
>> switch from using a change log to accumulate transient changes to an
>> approach where the change log is calculated from the resulting states
>> instead. Now this seems to bite already. Witing large child node lists seems
>> to be trouble some wrt. performance: since the comparison needs to iterate
>> over ever more child nodes, writing child nodes does not scale in constant
>> time any more but rather in time linear to the total number of child nodes.
>
> That's just because we're using the default diff algorithm from
> AbstractNodeStore. A different algorithm that leverages information
> about the actual data structure used to store the child node list
> should be able to do the diff in O(k) time where k is the number of
> changes.
>
> Do you have a test case for this? I can take a look at fixing the
> issue later today.

Have a look at RootImplTest.largeChildList() which adds 10000 child 
nodes flat. The test case is easily extended to demonstrate that adding 
child nodes leads to linear performance drop.

Michael

>
> BR,
>
> Jukka Zitting

Mime
View raw message