jackrabbit-oak-dev mailing list archives

From Jukka Zitting <jukka.zitt...@gmail.com>
Subject Re: SegmentNodeStore merge operations
Date Thu, 07 Mar 2013 09:42:15 GMT
Hi,

On Thu, Mar 7, 2013 at 11:16 AM, Thomas Mueller <mueller@adobe.com> wrote:
>>So, apart from problem a (which also affects the new MongoMK), the
>>current mechanism works fine (i.e. fully parallel writes) as long as
>>the changes are non-conflicting, but runs into trouble when there are
>>conflicts.
>
> Sorry I don't understand, how does SegmentNodeStore merge affect the new
> MongoMK?

I was referring to problem a, which is about validators and other
commit hooks not being a part of the underlying MK-level merge
operation and thus for example not always catching things like
duplicate UUIDs being introduced or hard references being broken (i.e.
repository invariants that span more than one node). This issue also
affects the MongoMK implementation, though it's not yet clear how
important addressing it is in practice. For some deployments it may
well be a hard requirement.
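
To make problem a concrete, here is a minimal sketch (plain Java, not
Oak code) of how two changes that each pass validation on their own
can merge cleanly at the node level and still break a repository-wide
invariant. The class name, the path-to-UUID map and the uuidsUnique()
and merge() helpers are all hypothetical and only stand in for a real
commit hook and a real MK-level merge.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class MergeInvariantSketch {

    // Hypothetical validator: true if no two paths share the same UUID.
    public static boolean uuidsUnique(Map<String, String> uuidByPath) {
        Set<String> seen = new HashSet<>();
        for (String uuid : uuidByPath.values()) {
            if (!seen.add(uuid)) {
                return false; // this UUID was already used by another path
            }
        }
        return true;
    }

    // Hypothetical node-level merge: the two branches touch different
    // paths, so it sees no conflict and simply combines them.
    public static Map<String, String> merge(Map<String, String> base,
                                            Map<String, String> a,
                                            Map<String, String> b) {
        Map<String, String> merged = new HashMap<>(base);
        merged.putAll(a);
        merged.putAll(b);
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> base = new HashMap<>();
        base.put("/content", "uuid-0");

        // Two concurrent branches, each adding a node with the same UUID.
        Map<String, String> a = new HashMap<>(base);
        a.put("/a", "uuid-1");
        Map<String, String> b = new HashMap<>(base);
        b.put("/b", "uuid-1");

        System.out.println(uuidsUnique(a));                 // true
        System.out.println(uuidsUnique(b));                 // true
        System.out.println(uuidsUnique(merge(base, a, b))); // false
    }
}
```

Each branch is valid in isolation; the duplicate only shows up if the
validator is run again on the merged state.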

>>* Use a more aggressive merge algorithm that automatically resolves
>>all conflicts by throwing away (or storing somewhere else) "less
>>important" changes when needed. Addresses problems b and c, problem a
>>still an issue.
>
> I'm worried our customers won't like this. It's very different from the
> behaviour of regular databases (be it relational databases, or NoSQL
> databases such as MongoDB). If it's a configurable for a certain subtree,
> for improved performance, then it's acceptable in my view, but even then
> I'm worried about the added complexity on the user/customer/developer
> side. And I'm worried that if we need to enable it to get a scalable
> solution, then it would turn people away.

Yes, that's a valid and open question. In my experience, the
high-volume scenarios where the write bottleneck has been a problem
that couldn't already be addressed with the optimistic locking
approach used by the SegmentMK are cases that track user actions, log
other events or keep track of comments, likes, etc. In all such
scenarios each individual change is pretty small, seldom conflicting,
and not too important (i.e. the harm of losing one is minimal), so I'm
not too concerned about this being a problem as long as the system can
provide harder guarantees for "more important" updates.  But we'll
need some better benchmark scenarios and real-world experience to tell
how well that works in practice.
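
As a rough illustration of the trade-off, the following sketch (plain
Java, not SegmentMK code) combines an optimistic compare-and-set
commit loop with a policy that drops "less important" changes after a
bounded number of attempts, while "more important" ones keep retrying
until they succeed. The class, the commit() signature, the important
flag and the maxAttempts parameter are assumptions made for the
example; the repository head is reduced to a single long counter.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongUnaryOperator;

public class OptimisticCommitSketch {

    // Stand-in for the repository head state (e.g. a "likes" counter).
    private final AtomicLong head = new AtomicLong();

    /**
     * Tries to apply a change on top of the current head.
     * Important changes retry until they succeed; other changes give up
     * after maxAttempts failed compare-and-set attempts.
     *
     * @return true if the change was committed, false if it was dropped
     */
    public boolean commit(LongUnaryOperator change, boolean important,
                          int maxAttempts) {
        for (int attempt = 1; important || attempt <= maxAttempts; attempt++) {
            long base = head.get();               // read the current head
            long next = change.applyAsLong(base); // rebase the change on it
            if (head.compareAndSet(base, next)) { // succeeds only if head is unchanged
                return true;
            }
            // Another writer got in first; loop and rebase on the new head.
        }
        return false; // less important change dropped
    }

    public long current() {
        return head.get();
    }
}
```

With no contention, commit(x -> x + 1, false, 3) succeeds on the first
attempt. Under heavy contention only the changes not marked important
can be lost, which is the weaker guarantee discussed above.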

BR,

Jukka Zitting
