jackrabbit-users mailing list archives

From Jukka Zitting <jukka.zitt...@gmail.com>
Subject Re: my managers are against jackrabbit
Date Mon, 13 Sep 2010 17:09:47 GMT

On Mon, Sep 13, 2010 at 6:45 PM, Seidel. Robert <Robert.Seidel@aeb.de> wrote:
> o    In our test we could store 3-4 nodes with data a second (with 40 properties
> and a 400 byte clob (with versioning))

You're probably doing something wrong. A basic performance test in the
Jackrabbit benchmark suite (see test/performance under Jackrabbit
trunk) can create 5000 new nodes per second on my mid-level desktop.

What's your persistence configuration? Another possible cause is the
time spent indexing your content; see more below.

> o    Clustering doesn't help because it doesn't scale storage performance

Clustering helps notably for read workloads, but won't help with write access.

> o    If the database crashes, then everything is lost - you have to recover from
> database backup and store the work of the day again

You can use clustering for high availability.

> o    The index size for 300000 nodes was really huge, it was about 2 gb (36 of
> the 40 properties are random Unicode strings with a size of 20 characters)
> o    The repository index was about 1,6 gb and the workspace index was
> something like 400 mb (versioning)
> Is there a way to improve storage performance/index size?

Random strings are not well suited to an inverted index like Lucene.
If you don't need the ability to search your nodes based on these
strings, you can disable indexing of those properties with a custom
indexing configuration.
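
For illustration, here is a minimal sketch of what such an indexing
configuration might look like. The node type and the property names
("title", "description") are assumptions made up for this example, not
taken from your content model, and the DTD version may differ in your
Jackrabbit release. The idea is that for nodes matching the rule only
the listed properties are indexed, so the random string properties are
left out.

```xml
<?xml version="1.0"?>
<!DOCTYPE configuration SYSTEM "http://jackrabbit.apache.org/dtd/indexing-configuration-1.0.dtd">
<!-- Sketch only: "title" and "description" are hypothetical property names. -->
<configuration xmlns:nt="http://www.jcp.org/jcr/nt/1.0">
  <!-- For nt:unstructured nodes, index only the properties listed here;
       other properties on matching nodes are left out of the index. -->
  <index-rule nodeType="nt:unstructured">
    <property>title</property>
    <property>description</property>
  </index-rule>
</configuration>
```

The file would then be referenced from the SearchIndex element of the
workspace configuration with something like
<param name="indexingConfiguration" value="${wsp.home}/indexing_configuration.xml"/>
(the file name and location here are again just an example).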


Jukka Zitting
