jackrabbit-dev mailing list archives

From Michael Dürig <mdue...@apache.org>
Subject Re: [jr3 trade consistency for availability]
Date Tue, 28 Feb 2012 19:15:25 GMT
>>>> What are the consistency assumptions a JCR client should be allowed to
>>>> make?
>>>>
>>>> An approach where temporary inconsistencies are tolerated (i.e. eventual
>>>> consistency) increases availability and throughput. In such a case
>>>> do/can/should we tolerate temporary violations of:
>>>>
>>>> - Node type constraints?
>>>
>>> so far we seem to have only discussed edge cases where node type
>>> constraints could be violated. I think, they are not too relevant in
>>> a real life system. I'd be OK to make some compromises in this area.
>>
>> With the current Microkernel whether these cases (i.e. write skew) [1]
>> are edge cases or not depends on the degree of write concurrency we
>> anticipate. If we fully synchronize all writes, these cases won't occur
>> at all. If OTOH we aim for highly concurrent writes, we will see such
>> cases possibly more often than we like.
>
> I think most applications that have highly concurrent writes usually
> distribute the writes across many nodes. e.g. you have lots of users
> working with the system, but each of them is working with his/her
> own dataset.

This is correct as long as we exclude collaborative workspace use cases 
where users typically work on the same document concurrently.

[...]

> To me the example on the wiki page is a reason to drop support
> for setPrimaryType() for jr3. The specification says:

Agreed. Note however, that the same problem also occurs for mixins.

[...]

> Do we have other examples where we know consistency from a
> JCR perspective is at risk?

Referential integrity for mix:referenceable nodes might break in the 
same way.

The problem occurs wherever parts of the data in a save depend in
some way on other parts of that save, for example when two properties of
a node need to obey a certain condition.
This might also make it hard to implement things like versioning, since
the implementation must then encode dependent JCR properties into the
same JSON value of the underlying Microkernel in order to circumvent this
problem.

As discussed in an earlier thread, the problem is easily fixed for 
direct clients of the Microkernel API if we add some testAndSet 
functionality to the Microkernel.
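
To make the two paragraphs above concrete, here is a minimal sketch of
write skew on two dependent properties, and of how a conditional commit
would reject the second writer. Everything in it is hypothetical: Store,
commit() and testAndSet() are illustrative names for an in-memory toy,
not the Microkernel API, and the condition "a and b must not both be 0"
just stands in for any constraint that spans two properties.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class TestAndSetSketch {

    /** Hypothetical in-memory store. NOT the Microkernel API. */
    static class Store {
        private final Map<String, Integer> data = new HashMap<>();
        // path -> revision in which the path was last written
        private final Map<String, Long> modified = new HashMap<>();
        private long revision = 0;

        synchronized long head() { return revision; }

        synchronized int read(String path) { return data.getOrDefault(path, 0); }

        /** Unconditional commit: always succeeds and creates a new revision. */
        synchronized void commit(Map<String, Integer> changes) {
            revision++;
            for (Map.Entry<String, Integer> e : changes.entrySet()) {
                data.put(e.getKey(), e.getValue());
                modified.put(e.getKey(), revision);
            }
        }

        /**
         * Conditional commit (assumed semantics): applies the changes only
         * if none of the paths the caller read was written after baseRevision.
         */
        synchronized boolean testAndSet(long baseRevision, List<String> readPaths,
                                        Map<String, Integer> changes) {
            for (String path : readPaths) {
                if (modified.getOrDefault(path, 0L) > baseRevision) {
                    return false; // the caller's check is based on stale data
                }
            }
            commit(changes);
            return true;
        }
    }

    /** Both sessions check the condition on the same snapshot, then write blindly. */
    static int blindWrites() {
        Store store = new Store();
        store.commit(Map.of("a", 1, "b", 1));
        int b1 = store.read("b"); // session 1 snapshot
        int a2 = store.read("a"); // session 2 snapshot
        if (b1 >= 1) store.commit(Map.of("a", 0)); // valid on session 1's snapshot
        if (a2 >= 1) store.commit(Map.of("b", 0)); // valid on session 2's snapshot
        return store.read("a") + store.read("b");
    }

    /** Same interleaving, but each commit is conditional on the paths it read. */
    static int guardedWrites() {
        Store store = new Store();
        store.commit(Map.of("a", 1, "b", 1));
        long base = store.head(); // both sessions start from this revision
        List<String> readPaths = List.of("a", "b");
        store.testAndSet(base, readPaths, Map.of("a", 0)); // succeeds
        store.testAndSet(base, readPaths, Map.of("b", 0)); // rejected: "a" changed
        return store.read("a") + store.read("b");
    }

    public static void main(String[] args) {
        System.out.println("blind:   a + b = " + blindWrites());   // 0, condition violated
        System.out.println("guarded: a + b = " + guardedWrites()); // 1, condition holds
    }
}
```

In the guarded variant the rejected session would have to re-read and
re-check before retrying. That is the availability cost we would be
paying for consistency.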

Michael



>
>> [1]
>> http://wiki.apache.org/jackrabbit/Transactional%20model%20of%20the%20Microkernel%20based%20Jackrabbit%20prototype
>>
>>>> - Access control rights?
>>>
>>> I don't think any violations are acceptable here.
>>
>> Me neither. But again we need to be aware of the write skew issue here:
>> an ACL implementation must be very careful about its consistency
>> assumptions or it will eventually fail.
>>
>>>> - Lock enforcement?
>>>
>>> that's definitely a tough one because it depends on repository
>>> wide state.
>>
>> This is an area where Apache Zookeeper might help out.
>>
>>>> - Query index consistency?
>>>
>>> I think consistency is a prerequisite here, otherwise it's quite
>>> difficult to implement the query functionality. I'd rather
>>> make compromises for availability. eg. terminate a long query
>>> execution with an exception because the snapshot it was
>>> working on is not available anymore.
>>
>> I was more thinking of the other direction: would it be tolerable to
>> have the query index not up to date yet? (i.e. after a possibly large
>> save.) Again, this could either result in incomplete query results, an
>> exception or the query to be deferred until the index is up to date.
>> Maybe we could even let the client choose through 'query hints'.
>
> I like the query hint idea.
>
> alternatively we could also deny access to the most recent revision
> until the index is updated (possibly asynchronously). this way
> reads and writes are fast at the cost of consistency. reads would
> be eventually consistent (once index is updated).
>
> regards
>   marcel
>
>> Michael
>>
>>>
>>>> - Atomicity of save operations?
>>>
>>> what does a temporary violation of atomic saves look like?
>>> are you thinking of partially visible changes?
>>>
>>> regards
>>>    marcel
>>>
>>>> - ...?
>>>>
>>>> Should we offer alternatives in some of these cases? That is, give the
>>>> client the ability to choose between consistency and availability.
>>>>
>>>> Michael
>>>>
>>>>
>>>> [1]
>>>> http://wiki.apache.org/jackrabbit/Goals%20and%20non%20goals%20for%20Jackrabbit%203
