jackrabbit-dev mailing list archives

From Stefan Guggisberg <stefan.guggisb...@gmail.com>
Subject Re: Jackrabbit Performance
Date Mon, 26 Sep 2005 10:56:54 GMT
hi daniel,

some remarks/answers follow inline:

On 9/25/05, Daniel Hagen <dhagen@h1-software.de> wrote:
> Hi,
>
> I apologize if this is the wrong place to ask my questions but I do not know
> where else I should ask.
>
> I am currently considering the use of Jackrabbit in a future project.
> The (very) rough layout I am thinking about is Jboss as Application Server
> and Jackrabbit for content storage (equipped with a custom access manager
> and login module for authentication & authorization).
>
> But I am not sure whether Jackrabbit will be able to handle the amount of
> data we will have to deal with.
> The application might have to handle ~ 2000 - 5000 new documents/day (size
> ranging from 2 KB to 1 MB, I assume an average of ~50 KB).
> Each document will have about 5 - 10 simple text properties and the "binary"
> content of the documents (plain text/HTML/MS Word/PDF) will have to be
> indexed for a fulltext search.
> Read access to the contents will not be very frequent, I am assuming 5
> requests for the mentioned simple properties of a node per minute, 5
> concurrent users, access to binary contents will probably appear once every
> minute.
>
> In short: The application will have to be able to do a fulltext search on
> (worst case) more than 10,000,000 contents and will have to handle creation
> of new contents without stalling the server.
>
> What is your opinion, is Jackrabbit the right tool for the task?
> Which Persistence Manager would be the best choice?
> Are there any special hardware considerations I should think about (e.g.
> separating index and storage on separate discs using separate controllers
> ...)?
> Should we have OS preferences for the server (current options are Windows
> 2003 Server vs. Linux with a strong preference towards Windows 2003 Server)?

if you're using a filesystem-based pm (e.g. ObjectPersistenceManager on
LocalFileSystem) i'd definitely go for linux. the windows filesystem really
sucks with a large number of small files. with the CQFileSystem
(custom filesystem in-a-file) you can improve the performance on a windows
box considerably but it's not open source and it's only free for non-commercial
use.
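
a rough way to check the small-files behaviour on your own box is sketched
below. this is not part of jackrabbit: the class SmallFileBench and its
methods are hypothetical names, it uses only the java standard library, and it
writes everything into one flat temp directory, which is an assumption that a
real persistence manager layout will not match exactly.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

// Hypothetical micro-benchmark: times writing and reading many small files.
final class SmallFileBench {

    /** Writes count files of sizeBytes each into dir; returns elapsed nanoseconds. */
    static long writeFiles(Path dir, int count, int sizeBytes) throws IOException {
        byte[] payload = new byte[sizeBytes];
        long start = System.nanoTime();
        for (int i = 0; i < count; i++) {
            Files.write(dir.resolve("doc-" + i + ".bin"), payload);
        }
        return System.nanoTime() - start;
    }

    /** Reads every file directly under dir; returns the total number of bytes read. */
    static long readFiles(Path dir) throws IOException {
        long total = 0;
        try (Stream<Path> files = Files.list(dir)) {
            for (Path p : (Iterable<Path>) files::iterator) {
                total += Files.readAllBytes(p).length;
            }
        }
        return total;
    }

    /** Deletes root and everything below it (children before parents). */
    static void deleteTree(Path root) throws IOException {
        try (Stream<Path> walk = Files.walk(root)) {
            walk.sorted(Comparator.reverseOrder()).forEach(p -> p.toFile().delete());
        }
    }

    public static void main(String[] args) throws IOException {
        // Defaults are assumptions: 10,000 files of 2 KB each.
        int count = args.length > 0 ? Integer.parseInt(args[0]) : 10_000;
        int size = args.length > 1 ? Integer.parseInt(args[1]) : 2 * 1024;
        Path dir = Files.createTempDirectory("smallfiles");
        try {
            long writeNs = writeFiles(dir, count, size);
            long start = System.nanoTime();
            long bytes = readFiles(dir);
            long readNs = System.nanoTime() - start;
            System.out.printf("wrote %d files in %d ms, read %d bytes in %d ms%n",
                    count, writeNs / 1_000_000, bytes, readNs / 1_000_000);
        } finally {
            deleteTree(dir);
        }
    }
}
```

run it with the same file count and size on both candidate machines. the reads
will mostly hit the os cache right after the writes, so treat the read figure
as a lower bound only.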

ObjectPersistenceManager w/LocalFileSystem on a linux box provides imo
decent performance; its major flaw is that it is non-transactional.

there's also a jdbc-based pm in the contrib directory (contrib/db-persistence).
it is transactional and, depending on the type of database, provides a very
decent performance (e.g. mysql).

i suggest you set up your own performance/scalability tests.
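
to size such tests, the figures from your mail (2000 - 5000 documents/day,
~50 KB average, 10,000,000 documents worst case) can be turned into target
volumes. the sketch below is plain arithmetic; the class SizingEstimate is a
hypothetical name, and it assumes 1 KB = 1024 bytes and ignores the space
needed for properties and the search index.

```java
// Hypothetical back-of-envelope sizing based on the numbers in the question.
final class SizingEstimate {

    /** Raw content written per day, in bytes. */
    static long dailyVolumeBytes(long docsPerDay, long avgDocBytes) {
        return docsPerDay * avgDocBytes;
    }

    /** Number of days (rounded up) until targetDocs documents exist. */
    static long daysToReach(long targetDocs, long docsPerDay) {
        return (targetDocs + docsPerDay - 1) / docsPerDay;
    }

    public static void main(String[] args) {
        long avgDocBytes = 50L * 1024;      // ~50 KB average, from the question
        long targetDocs = 10_000_000L;      // worst-case repository size
        for (long docsPerDay : new long[] {2000, 5000}) {
            double mibPerDay = dailyVolumeBytes(docsPerDay, avgDocBytes) / (1024.0 * 1024.0);
            System.out.printf("%d docs/day: %.1f MiB/day, %d days to reach %d docs%n",
                    docsPerDay, mibPerDay, daysToReach(targetDocs, docsPerDay), targetDocs);
        }
        double totalGib = targetDocs * (double) avgDocBytes / (1024.0 * 1024.0 * 1024.0);
        System.out.printf("raw content at %d docs: %.1f GiB%n", targetDocs, totalGib);
    }
}
```

at 5000 documents/day it takes 2000 days to reach 10,000,000 documents, and at
~50 KB each that is roughly 500 GB of raw content, so a realistic test needs a
pre-filled repository rather than an empty one.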

cheers
stefan

>
> I know that not all of my questions are directly related to Jackrabbit
> Development and some will probably not be answered due to a lack of existing
> data, but any clues/hints will be greatly appreciated.
>
> Thank you for your help!
>
> Daniel
>
>
