jackrabbit-users mailing list archives

From Enrique Medina Montenegro <e.medin...@gmail.com>
Subject Re: Jackrabbit & Performance
Date Thu, 21 Nov 2013 10:58:09 GMT
Some more Bundle Cache statistics so far:

11:53:29,878  INFO cachename=iptoolBundleCache[ConcurrentCache@4ecf2758],
elements=1936, usedmemorykb=137540, maxmemorykb=524288, access=7747,
miss=1936

11:55:51,190  INFO cachename=iptoolBundleCache[ConcurrentCache@4ecf2758],
elements=1936, usedmemorykb=138177, maxmemorykb=524288, access=15367,
miss=1936

11:57:17,141  INFO cachename=iptoolBundleCache[ConcurrentCache@4ecf2758],
elements=1936, usedmemorykb=138495, maxmemorykb=524288, access=16256,
miss=1936

I presume the "miss" count is just cumulative, since it stays at 1936 while "access" keeps growing...
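
For anyone who wants to turn these statistics lines into a hit ratio, here is a minimal sketch in plain Java. It assumes that "access" and "miss" are cumulative counters since the cache was created, which the log suggests (miss stays equal to elements) but does not confirm. The class and method names are made up for illustration and are not part of Jackrabbit, and each log entry is assumed to be unwrapped onto a single line.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Jackrabbit: reads the access/miss counters
// from a bundle cache statistics line and derives a hit ratio from them.
class BundleCacheStats {

    // Matches the two counters at the end of one (unwrapped) statistics line.
    private static final Pattern COUNTERS =
            Pattern.compile("access=(\\d+),\\s*miss=(\\d+)");

    /** Returns {access, miss} parsed from a statistics line. */
    static long[] parse(String line) {
        Matcher m = COUNTERS.matcher(line);
        if (!m.find()) {
            throw new IllegalArgumentException("no counters in: " + line);
        }
        return new long[] { Long.parseLong(m.group(1)), Long.parseLong(m.group(2)) };
    }

    /** Fraction of accesses served from the cache, assuming cumulative counters. */
    static double hitRatio(long access, long miss) {
        return access == 0 ? 0.0 : (double) (access - miss) / access;
    }

    /** Misses added between two samples; 0 means nothing new had to be loaded. */
    static long newMisses(long[] previous, long[] current) {
        return current[1] - previous[1];
    }

    public static void main(String[] args) {
        // The three samples from the log above, unwrapped to one line each.
        String[] samples = {
            "elements=1936, usedmemorykb=137540, maxmemorykb=524288, access=7747, miss=1936",
            "elements=1936, usedmemorykb=138177, maxmemorykb=524288, access=15367, miss=1936",
            "elements=1936, usedmemorykb=138495, maxmemorykb=524288, access=16256, miss=1936",
        };
        long[] previous = null;
        for (String sample : samples) {
            long[] current = parse(sample);
            System.out.printf("access=%d miss=%d hitRatio=%.3f",
                    current[0], current[1], hitRatio(current[0], current[1]));
            if (previous != null) {
                System.out.printf(" newMisses=%d", newMisses(previous, current));
            }
            System.out.println();
            previous = current;
        }
    }
}
```

If the counters really are cumulative, a newMisses value of 0 between samples would mean every bundle touched in that interval was already in the cache.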

On Thu, Nov 21, 2013 at 11:33 AM, Enrique Medina Montenegro <
e.medina.m@gmail.com> wrote:

> First, thanks to everyone for such helpful hints.
>
> Now it looks like I'm progressing further with the segment of nodes; I can
> see that my bottleneck is actually in "session.save()" when writing
> against a DB, because writing against the file system seems to be
> quite fast.
>
> And theoretically the Bundle Cache doesn't get exhausted even with the
> default 8 MB (maxmemorykb=8192):
>
> 11:24:46,695  INFO cachename=iptoolBundleCache[ConcurrentCache@70911adf],
> elements=93, usedmemorykb=552, maxmemorykb=8192, access=254, miss=93
>
> I'll continue testing and share my final results with you.
>
> On Thu, Nov 21, 2013 at 12:16 AM, Ron Wheeler <
> rwheeler@artifact-software.com> wrote:
>
>> Have you sorted the marks?
>> This way you should only be switching top nodes every 1000 records and
>> sitting at /marks/XXX and adding a thousand nodes here before moving to
>> /marks/XXX+1 and adding a thousand there.
>>
>> Ron
>>
>>
>>
>>
>>
>>  On 20/11/2013 2:39 PM, Enrique Medina Montenegro wrote:
>>
>>> Bertrand,
>>>
>>> Your algorithm is exactly the approach I followed, but I noticed a decrease
>>> in performance as the import progressed, with response times just to
>>> look up the exact path (i.e. session.getNode("/marks/XXX/YYY")) above 2
>>> seconds, even when calling Session.save() every 1000, 500, or 100
>>> records...
>>>
>>> Using Jackrabbit 2.7.0 btw, because it's the only one working with Spring
>>> Modules for JCR 0.8b
>>>
>>> Salu2,
>>> Quique.
>>>
>>>
>>> On Wed, Nov 20, 2013 at 8:34 PM, Bertrand Delacretaz
>>> <bdelacretaz@apache.org> wrote:
>>>
>>>> Hi,
>>>>
>>>> On Wed, Nov 20, 2013 at 7:39 PM, Enrique Medina Montenegro
>>>> <e.medina.m@gmail.com> wrote:
>>>>
>>>>> ...at the practical level,
>>>>> when I dump the 1M marks from the DB into JCR, for each and every "mark" it
>>>>> has to look up the path in the tree where to ultimately store the "mark",
>>>>> and this lookup starts to take on the order of seconds as the tree
>>>>> structure grows, making the full extraction process from the DB too slow
>>>>> for our requirements....
>>>>>
>>>> If you import according to the following scenario, the performance should
>>>> be linear:
>>>>
>>>> for each DB record
>>>>    compute path of JCR node
>>>>    for each level of that path (below storage root)
>>>>      create node if not created yet
>>>>      set properties if on the data node at the end of the path
>>>>
>>>> and you probably want to call Session.save() every N records (N=1000
>>>> maybe)
>>>>
>>>> -Bertrand
>>>>
>>>>
>>
>> --
>> Ron Wheeler
>> President
>> Artifact Software Inc
>> email: rwheeler@artifact-software.com
>> skype: ronaldmwheeler
>> phone: 866-970-2435, ext 102
>>
>>
>
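
The import loop Bertrand outlines in the quoted thread, combined with Ron's suggestion to sort the marks first, can be sketched roughly as follows. This is only an illustration under stated assumptions: TreeNode and FakeSession are in-memory stand-ins invented here so the example runs without a repository, and the layout (numeric mark ids, 1000 marks per /marks/XXX node) is hypothetical because the thread does not say how XXX and YYY are derived. Against a real repository, the corresponding JCR calls would be Node.hasNode(), Node.getNode(), Node.addNode(), Node.setProperty() and Session.save().

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class BatchedImportSketch {

    /** Hypothetical in-memory stand-in for a JCR node. */
    static class TreeNode {
        final Map<String, TreeNode> children = new LinkedHashMap<>();
        final Map<String, String> properties = new LinkedHashMap<>();

        // "Create node if not created yet". With the JCR API this would be
        // Node.hasNode(name), then Node.getNode(name) or Node.addNode(name).
        TreeNode getOrCreate(String name) {
            return children.computeIfAbsent(name, n -> new TreeNode());
        }
    }

    /** Hypothetical stand-in for the session; it only counts save() calls. */
    static class FakeSession {
        int saves = 0;

        void save() {
            saves++;
        }
    }

    // Assumed layout: 1000 marks under each /marks/XXX node.
    static final int BUCKET_SIZE = 1000;

    static void importMarks(TreeNode root, FakeSession session,
                            List<Integer> markIds, int batchSize) {
        // Sort first so that all marks of one bucket are written back to back.
        List<Integer> sorted = new ArrayList<>(markIds);
        Collections.sort(sorted);

        TreeNode marks = root.getOrCreate("marks");
        TreeNode bucketNode = null;
        int currentBucket = 0;
        int pending = 0;

        for (int id : sorted) {
            int bucket = id / BUCKET_SIZE;
            // Resolve the parent only when the bucket changes, not per record.
            if (bucketNode == null || bucket != currentBucket) {
                bucketNode = marks.getOrCreate(String.valueOf(bucket));
                currentBucket = bucket;
            }
            TreeNode mark = bucketNode.getOrCreate(String.valueOf(id));
            mark.properties.put("id", String.valueOf(id));

            // Persist every batchSize records instead of once per record.
            if (++pending == batchSize) {
                session.save();
                pending = 0;
            }
        }
        if (pending > 0) {
            session.save(); // flush the last, partially filled batch
        }
    }

    public static void main(String[] args) {
        List<Integer> ids = new ArrayList<>();
        for (int i = 2499; i >= 0; i--) {
            ids.add(i); // deliberately unsorted input
        }
        TreeNode root = new TreeNode();
        FakeSession session = new FakeSession();
        importMarks(root, session, ids, 1000);
        System.out.println("buckets=" + root.children.get("marks").children.size()
                + " saves=" + session.saves);
    }
}
```

Holding on to the current bucket node means the parent is looked up once per bucket rather than once per mark. Whether that removes the slowdown seen against the DB is not something this sketch can show, since the stand-in save() does no I/O.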
