jackrabbit-dev mailing list archives

From "David Nuescheler" <david.nuesche...@gmail.com>
Subject Re: Scalability concerns, Alfresco performance tests
Date Mon, 04 Dec 2006 13:22:46 GMT
Hi Andreas,

> Now, a news message [1] on TheServerSide about benchmarks provided
> by Alfresco to prove the superiority
ermhh.... let's say "state" not "prove" ;)

> ...of their JCR implementation raises some concerns.
I guess that may have been exactly the intention ;)

Also, the term "JCR implementation" may not be technically
accurate; maybe someone could point me to an updated
version of this:

> A post in the thread claims that Jackrabbit isn't suited for
> large-scale scenarios and faces some problems in the transactional
> handling of some 100.000 nodes (Kev Smith, [2]):
While Kev possibly has reasons to believe that, I don't.
(Unless he is talking about some 100k nodes in a single transaction
with a given memory size.)

> "From what we've seen, Alfresco is comparable to JackRabbit for small
> case scenarios - but Alfresco is much more scalable [...]"
> Do you agree to this statement? If yes - are these problems related
> to the persistence manager abstraction? Is this a known issue, and
> will it be addressed?
I do not even remotely agree with this statement.
Jackrabbit has been built to scale freely in size.

I have a hard time understanding this argument since both Jackrabbit
and Alfresco can use the same RDBMS as the persistence layer, so
at least on the persistence layer there should not be a substantial
difference. Thoughts?

> "We tried to load up JackRabbit with millions of nodes but always ran
> into blocker issues after about 2 million or so objects. Also when
> loading up JackRabbit, the load needed to be carefully performed in
> small chunks e.g. trying to load in 100,000 nodes at a time would cause
> PermGenSpace errors (even with a HUGE permgenspace!) and potentially
> place the repo into a non-recoverable state."
> I'm not sure if this will really be an issue for our usage
> scenario (except maybe from restoring backups), but I'm very
> interested in your opinions.
That's true: the size of the non-binary portion of a commit is
"currently" memory constrained.
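Since each Session.save() commits one transient change set, the usual workaround is to persist in small batches rather than building up hundreds of thousands of nodes in memory. A minimal sketch of that chunking pattern follows; the FakeSession class is just an illustrative stand-in for javax.jcr.Session (in real code you would call session.getRootNode().addNode(...) and session.save()), and the names and batch size are assumptions, not Jackrabbit API:

```java
import java.util.ArrayList;
import java.util.List;

// Hedged sketch of the chunked-import pattern: instead of creating all
// nodes in one transient change set and calling save() once at the end,
// persist every BATCH_SIZE nodes so the in-memory (non-binary) change
// set stays bounded.
public class ChunkedImport {
    static final int BATCH_SIZE = 1000; // illustrative; tune to heap size

    // Minimal stand-in for a JCR session's transient space (not real JCR).
    static class FakeSession {
        final List<String> pending = new ArrayList<>();
        int saves = 0;      // number of commits performed
        int persisted = 0;  // nodes persisted so far
        void addNode(String name) { pending.add(name); }
        void save() { persisted += pending.size(); pending.clear(); saves++; }
    }

    static void importNodes(FakeSession session, int total) {
        for (int i = 0; i < total; i++) {
            session.addNode("node" + i);
            if ((i + 1) % BATCH_SIZE == 0) {
                session.save(); // commit this chunk, freeing transient state
            }
        }
        if (!session.pending.isEmpty()) {
            session.save(); // commit any remainder
        }
    }

    public static void main(String[] args) {
        FakeSession s = new FakeSession();
        importNodes(s, 100_000);
        System.out.println(s.persisted + " nodes in " + s.saves + " saves");
    }
}
```

With 100k nodes and a batch size of 1000, no more than 1000 nodes are ever pending in the transient layer at once, which is the point of loading "in small chunks" as described above.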
"Backup/Restore" operations in my experience usually happen on the
persistence layer, which means that a restore operation (obviously) does
not go through the normal user API. I would actually go as far as saying
that going through the transient layer to restore an entire content
repository comes close to abusing the API.
We are currently working on a solution for that, but since nobody had
a pressing need, it has had a relatively low priority. If this is a pressing
issue for your project, feel free to file a JIRA issue.
