jackrabbit-users mailing list archives

From Alexander Klimetschek <aklim...@day.com>
Subject Re: Scalability issues while storing large number of small properties
Date Wed, 27 May 2009 12:12:50 GMT
On Wed, May 27, 2009 at 3:40 AM, aasoj j <aas.ojj@gmail.com> wrote:
> Our application has a huge number of properties, around 1 million. These
> properties are distributed across versionable Jackrabbit nodes, each node
> having around 50 properties and 15 child nodes. Each property has a unique
> 50-character-long value. We use MySQL for persistence.
>
> While creating the tree, our application crashed. The indexes grew to more
> than 4.5 GB. Later, when we tried to remove the root node, the indexes grew
> to 15 GB and the application crashed again. As we plan to use the search
> functionality, we cannot disable indexing.

That the index can grow that much could be "ok": with a million unique
values, the Lucene index probably blows up considerably. But this is
just my assumption; I don't know the exact behaviour of the Lucene
index in that case.

I guess the OutOfMemoryError is not related to the index, but rather
to the removal of the whole tree. Removing nodes is currently
memory-bound, as the whole removal process happens in the transient
part of the session, which is kept entirely in memory. If you delete
the root node, the entire repository will be loaded into memory, with
some overhead (although I am wondering why that factor is around 150,
i.e. 100 MB -> 15 GB). As a workaround, you could remove smaller
subtrees and call save() in between, as sketched below.
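
For illustration, a minimal sketch of that workaround against the
plain JCR API (the method name, path and batch size are made up,
adjust them to your content):

    import javax.jcr.Node;
    import javax.jcr.NodeIterator;
    import javax.jcr.RepositoryException;
    import javax.jcr.Session;

    // Remove the children of a large subtree in batches, calling save()
    // in between so the transient space (and thus memory use) stays small.
    void removeInBatches(Session session, String absPath, int batchSize)
            throws RepositoryException {
        Node parent = (Node) session.getItem(absPath);
        NodeIterator children = parent.getNodes();
        int count = 0;
        while (children.hasNext()) {
            children.nextNode().remove();
            if (++count % batchSize == 0) {
                session.save(); // persist the transient removals so far
            }
        }
        parent.remove(); // the parent is empty now, remove it as well
        session.save();
    }

Note that this only helps if each child subtree is itself small enough
to fit into memory; for deeper trees you would have to recurse into the
subtrees first.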

BTW, I guess the use case for removing the whole tree is just for
development, where you want to reimport or repopulate your repository.
If you delete everything anyway, you can simply wipe your database and
the contents of the workspace directory (except for the workspace.xml
file).
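
A rough sketch of the file-system part of that cleanup (the workspace
location is hypothetical and depends on your repository configuration;
the database itself you would drop with your usual MySQL tooling):

    import java.io.File;

    // Delete everything in the workspace directory except workspace.xml.
    static void wipeWorkspace(File workspaceDir) {
        File[] entries = workspaceDir.listFiles();
        if (entries == null) {
            return; // not a directory
        }
        for (File entry : entries) {
            if (!"workspace.xml".equals(entry.getName())) {
                deleteRecursively(entry);
            }
        }
    }

    // Recursively delete a file or directory.
    static void deleteRecursively(File f) {
        if (f.isDirectory()) {
            for (File child : f.listFiles()) {
                deleteRecursively(child);
            }
        }
        f.delete();
    }

Make sure the repository is shut down before you do this.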

Regards,
Alex

-- 
Alexander Klimetschek
alexander.klimetschek@day.com
