jackrabbit-dev mailing list archives

From Thomas Mueller <muel...@adobe.com>
Subject Re: [jr3 microkernel] Write skew
Date Thu, 01 Dec 2011 17:13:00 GMT
Hi,

>>> But I don't think we should try to increase concurrency of write
>>> operations within the *same* repository because that's not a problem at
>>> all.
>>
>> i beg to differ ;)
>>
>> in jr2 saves are serialized. IMO that's a *real* problem, especially
>> when saving large change sets. this problem can be addressed e.g. with
>> an MVCC based model.

The problem with Jackrabbit isn't that write concurrency is low; it's
that throughput is low. That is the main problem. Increasing concurrency
in the save operation will not improve throughput in a meaningful way
(most likely it will decrease it, because of the added coordination
overhead).

I'm not aware of a big problem with large change sets. In any case,
large change sets should be split up into smaller sets, for example by
saving in batches as sketched below.
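
A minimal sketch of what I mean, using the plain JCR API (the batch
size, node names, and the BatchedImport class are made up for
illustration, not a proposal):

    import javax.jcr.Node;
    import javax.jcr.RepositoryException;
    import javax.jcr.Session;

    public class BatchedImport {

        // Arbitrary illustrative value; a real batch size would need
        // to be measured, not guessed.
        private static final int BATCH_SIZE = 200;

        public static void importNodes(Session session, Node parent,
                int total) throws RepositoryException {
            for (int i = 0; i < total; i++) {
                parent.addNode("node-" + i, "nt:unstructured");
                if ((i + 1) % BATCH_SIZE == 0) {
                    // Persist a small, bounded change set instead of
                    // one huge one at the end.
                    session.save();
                }
            }
            session.save(); // persist the remainder
        }
    }

Each save then holds the (serialized) write path for a short, bounded
time instead of one long stretch.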

For me, increasing throughput is a lot more important than increasing
concurrency.



> Yes, I agree. It's something I've seen many times in the field
> (consider saving a large PDF in a CMS).

Large PDFs are stored in the data store. Large binaries are streamed
there well before the save operation, so writing them is not part of the
save operation at all. Increasing concurrency in the save operation
doesn't affect that in any way.
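
To illustrate (a sketch using the standard JCR 2.0 binary API; the node
names and file name are made up): the PDF content is consumed when the
Binary value is created, so by the time save() runs, typically only a
small reference to the data store record has to be persisted.

    import java.io.FileInputStream;
    import java.io.InputStream;
    import javax.jcr.Binary;
    import javax.jcr.Node;
    import javax.jcr.Session;

    public class StoreLargePdf {

        public static void store(Session session, Node fileNode)
                throws Exception {
            Node content = fileNode.addNode("jcr:content", "nt:resource");
            try (InputStream in = new FileInputStream("large.pdf")) {
                // With a data store configured, the stream is usually
                // consumed here, not during save().
                Binary binary = session.getValueFactory().createBinary(in);
                content.setProperty("jcr:data", binary);
            }
            content.setProperty("jcr:mimeType", "application/pdf");
            session.save(); // persists the reference, not the bytes
        }
    }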

> you can't scale out the writes in a
> cluster since all writes are serialized for the whole cluster.

Yes, this is a big problem, and we need to solve it. One idea is to
synchronize cluster nodes asynchronously, and to better support splitting
data across multiple repositories (sharding), for example using virtual
repositories that can be linked together; see the sketch below.
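
As a very rough sketch of the sharding part (everything here is
hypothetical: the ShardedRepository class and the routing rule are only
meant to illustrate the idea, not a proposed API): writes are routed to
one of several independent repositories by a stable hash of the path, so
each repository only serializes its own share of the writes.

    import javax.jcr.Repository;

    public class ShardedRepository {

        private final Repository[] shards; // independent repositories

        public ShardedRepository(Repository[] shards) {
            this.shards = shards;
        }

        // Route by the first path segment, so "/users/..." and
        // "/products/..." can land on different shards and be written
        // concurrently.
        public Repository shardFor(String path) {
            String[] segments = path.split("/");
            String key = segments.length > 1 ? segments[1] : "";
            int index = (key.hashCode() & 0x7fffffff) % shards.length;
            return shards[index];
        }
    }

The hard part is everything that crosses shards (moves, references,
queries), which is where the linked "virtual repositories" would have
to come in.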

Regards,
Thomas

