jackrabbit-dev mailing list archives

From "Kevin Wiggen" <kwig...@xythos.com>
Subject RE: Scalability/Clustering
Date Wed, 06 Jul 2005 23:01:30 GMT

Sorry for the spam to the list; it was not intended.

Kevin

-----Original Message-----
From: Kevin Wiggen 
Sent: Wednesday, July 06, 2005 4:00 PM
To: jackrabbit-dev@incubator.apache.org
Subject: RE: Scalability/Clustering


Walter,

I hate spam on mailing lists, so I apologize if this message is not what
you are looking for.  That said, my company, Xythos Software Inc,
has invested heavily in being a back end for many e-learning projects
(including the BlackBoard Content Management System).  We have a number
of institutions that are larger than the numbers you suggest here, and
we support scalability, clustering, point-in-time recovery,
replication, etc.

If this is something you would like to discuss in more detail, please
let me know.  Otherwise I am sorry for the unwarranted spam.

Kevin Wiggen
CTO
Xythos Software Inc

-----Original Message-----
From: Walter Raboch [mailto:wraboch@ingen.at] 
Sent: Wednesday, July 06, 2005 1:15 PM
To: jackrabbit-dev@incubator.apache.org
Subject: Scalability/Clustering

Hi all,

We are currently planning to use Jackrabbit in an e-learning project
with a few hundred concurrent users, so I am a little concerned about
scalability.

Some figures we forecast for the first expansion stage:
 1.000.000 Nodes
10.000.000 Properties (around 10 properties/node)
     3.000 Named Users (about 10% concurrent)

We are thinking of an n-tier architecture with a web and application
layer, a repository layer and a database layer, with 2 or more nodes for
each layer. Both Java and .NET applications access the content in the
repository, so we are also planning to implement a .NET client for
JSR 170.

What would be the best deployment model for such a situation in your
opinion?

Are there any efforts to cluster Jackrabbit for a load-sharing
scenario (no session failover at the repository layer)?

After reading a lot of code, I think the following changes should do it:

- extending ObservationManager to send and receive Events to
   and from other nodes

- implementing/extending an ORM layer (Hibernate with shared caching for
   performance). The persistence implementation should be aware of the
   node types and allow a type-specific mapping to tables, so that we can
   map node types with many instances to their own tables while
   maintaining flexibility for new "simple" node types.

- extending LockManager to sync locks with other nodes

- Lucene should be independent on each node but be aware of new nodes
   and changes -> Events from ObservationManager

- Config - the cluster should have a central place for config management

- some intelligence in the JCR-RMI client to find a content repository
   node from the cluster depending on node state (load, shutdown, ...)
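
To make the event propagation idea from the list above more concrete,
here is a minimal Java sketch. It is only an illustration under stated
assumptions: ChangeEvent, ClusterBus and RepositoryNode are hypothetical
names, not Jackrabbit or JSR 170 classes, and the bus is a simple
in-process stand-in for whatever network transport would really be used.
Each node keeps its own index and updates it both on local writes and on
events received from other nodes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

public class ClusterEventSketch {

    /** Hypothetical change notification; not a javax.jcr.observation type. */
    static final class ChangeEvent {
        final String originId;
        final String itemPath;

        ChangeEvent(String originId, String itemPath) {
            this.originId = originId;
            this.itemPath = itemPath;
        }
    }

    /** In-process stand-in for a real network transport (assumption). */
    static final class ClusterBus {
        private final List<RepositoryNode> members = new CopyOnWriteArrayList<>();

        void join(RepositoryNode node) {
            members.add(node);
        }

        /** Delivers the event to every member except the one that produced it. */
        void broadcast(ChangeEvent event) {
            for (RepositoryNode node : members) {
                if (!node.id.equals(event.originId)) {
                    node.onRemoteEvent(event);
                }
            }
        }
    }

    /** One repository instance in the cluster, with its own local index. */
    static final class RepositoryNode {
        final String id;
        final List<String> indexedPaths = new ArrayList<>();
        private final ClusterBus bus;

        RepositoryNode(String id, ClusterBus bus) {
            this.id = id;
            this.bus = bus;
        }

        /** Local write: update the local index, then notify the other nodes. */
        void save(String itemPath) {
            indexedPaths.add(itemPath);
            bus.broadcast(new ChangeEvent(id, itemPath));
        }

        /** Remote write: only the local index needs to catch up. */
        void onRemoteEvent(ChangeEvent event) {
            indexedPaths.add(event.itemPath);
        }
    }

    public static void main(String[] args) {
        ClusterBus bus = new ClusterBus();
        RepositoryNode a = new RepositoryNode("A", bus);
        RepositoryNode b = new RepositoryNode("B", bus);
        bus.join(a);
        bus.join(b);

        a.save("/courses/math101");
        System.out.println("A index: " + a.indexedPaths);
        System.out.println("B index: " + b.indexedPaths);
    }
}
```

The same broadcast path could presumably carry lock changes as well, but
ordering, delivery guarantees and node failure are deliberately left out
of this sketch.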

What else should be synchronized between the nodes?
Did I overlook something?

I would be happy to receive any suggestions, even if you discourage us
from using Jackrabbit. Of course we would release some of these
developments to the community, if someone is interested.

thx in advance,

cheers
Walter





