jackrabbit-users mailing list archives

From "Michael Neale" <michael.ne...@gmail.com>
Subject Re: OutOfMemory - adding lots of nodes in one session
Date Fri, 01 Sep 2006 08:42:50 GMT
1:
Yeah, I use JProfiler - top of the charts with a bullet was:
org.apache.jackrabbit.util.WeakIdentityCollection$WeakRef (a ha! that would
explain the performance slowdown when GC has to kick in late in the piece).
followed by:
org.apache.derby.impl.store.raw.data.StoredRecordHeader
and of course a whole lot of byte[].

I am using default everything (which means Derby) and no blobs whatsoever
(so all in the database).

2:
If I log out, and use fresh everything, it seems to continue fine (i.e. at a
fast enough pace), but I haven't really pushed it to where I wanted to get it
(10,000 child nodes).

Responding to Alexandru's email (hi Alex, nice work on InfoQ if I remember
correctly! I am a fan), it would seem that the Session keeps most things in
memory, which I can understand.

I guess my problem is that I am trying to load up the system to test, at a
basic level, that it scales to the numbers that I know I need, but
I am having trouble getting the data in - bulk load wise. If I bump up the
memory, it certainly seems to hum along better, but if the Session is keeping
a lot around, then this will have limits - is there no way to "clear" the
session?
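
To make the "log out and use fresh everything" workaround from point 2
concrete, here is a rough sketch of loading in batches. RuleSession, addRule
and BatchLoader are hypothetical names made up for illustration; only save()
and logout() mirror real javax.jcr.Session methods. The batch size is an
assumption to tune, not a recommendation.

```java
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical stand-in for the few session operations this sketch needs. */
interface RuleSession {
    // Hypothetical helper: in real JCR code this would wrap Node.addNode(...)
    // plus Node.setProperty(...) on the top level node.
    void addRule(String name, String body);

    void save();    // mirrors javax.jcr.Session.save()
    void logout();  // mirrors javax.jcr.Session.logout()
}

final class BatchLoader {
    /**
     * Adds every rule, saving and logging out after each full batch so that
     * no single session accumulates state for more than batchSize nodes.
     * Returns the number of sessions that were opened.
     */
    static int load(List<String> ruleNames, int batchSize,
                    Supplier<RuleSession> login) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int sessionsOpened = 0;
        int pending = 0;
        RuleSession session = null;
        try {
            for (String name : ruleNames) {
                if (session == null) {
                    session = login.get();   // fresh session per batch
                    sessionsOpened++;
                }
                session.addRule(name, "rule body for " + name);
                pending++;
                if (pending == batchSize) {
                    session.save();          // persist this batch
                    session.logout();        // drop the session and its state
                    session = null;
                    pending = 0;
                }
            }
            if (session != null) {
                session.save();              // persist the final partial batch
            }
        } finally {
            if (session != null) {
                session.logout();            // also runs if save() throws
            }
        }
        return sessionsOpened;
    }
}
```

Called with 250 rule names and a batch size of 100, this opens three sessions
(100, 100 and 50 rules). Whether this is enough to keep memory flat in
Jackrabbit is exactly the thing I have not verified yet.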

Perhaps I will explain what I am using JCR for (feel free to smack me down
if this is not what JCR and Jackrabbit are ever intended for):
I am storing "atomic business rules" (which means each node is a small
single business rule). The data on each node is very small. These nodes are
stored flat as child nodes under a top level node. To give structure
(categorisation) for the users, I have references to these nodes all over
the place so people can navigate them all sorts of different ways (as there
is no one clear hierarchy at the time the rules are created). JCR gives me
most of what I need, but as these rule nodes can number in the thousands
(4000 is not uncommon for a reasonably complex business unit), I am
worried that this just can't work.

I have seen from past posts that people put nodes under different parents
(so there is no great number of child nodes under any one parent), so that is
one option, but my gut feel is that it's the WeakIdentityCollection: this
well-meaning code means that the GC has to do a huge amount of work at the
worst possible time (when under stress). I am sure most of the time this is
not an issue.
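
For the "different parents" option, one way to spread the rules would be to
hash each rule name into a fixed number of intermediate parent nodes. The
sketch below only computes the path; RuleBuckets, the "bucket-NNN" naming and
the bucket count are all hypothetical choices for illustration, not anything
Jackrabbit prescribes.

```java
import java.util.Locale;

final class RuleBuckets {
    /** Picks a stable bucket name for a rule, e.g. "bucket-007". */
    static String bucketFor(String ruleName, int bucketCount) {
        if (bucketCount <= 0) {
            throw new IllegalArgumentException("bucketCount must be positive");
        }
        // floorMod keeps the result in [0, bucketCount) even when
        // hashCode() is negative.
        int bucket = Math.floorMod(ruleName.hashCode(), bucketCount);
        return String.format(Locale.ROOT, "bucket-%03d", bucket);
    }

    /** Builds root/bucket/ruleName, e.g. under a "/rules" top level node. */
    static String pathFor(String root, String ruleName, int bucketCount) {
        return root + "/" + bucketFor(ruleName, bucketCount) + "/" + ruleName;
    }
}
```

With 4000 rules and, say, 64 buckets, each parent would hold roughly 60
children. Since JCR reference properties point at a node's UUID rather than
its path, the categorisation references should not care which bucket a rule
lands in.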

Any ideas/tips/gotchas for a newbie? I would really like to be confident
that I can scale up enough (it's modest) with JCR for this purpose.

On 8/31/06, Nicolas <ntoper@gmail.com> wrote:
>
> 2 more ideas:
>
> 1/ Did you try using a memory profiler so we can know what is wrong?
>
> 2/ What happens if you logout after say 100 updates?
>
>
> a+
> Nico
> my blog! http://www.deviant-abstraction.net !!
>
>
