jackrabbit-oak-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Davide Giannella <giannella.dav...@gmail.com>
Subject Re: Avoiding conflicts (OAK-1717)
Date Tue, 29 Apr 2014 15:06:11 GMT
On 29/04/2014 14:27, Michael Dürig wrote:
> Hi Davide,
>
> Sounds like an interesting idea. In fact, Tom and me discussed
> something very similar during lunch.
>
> In case of a "conflict" there would be multiple :next-xxx properties.
> When the index is used and those are traversed, the "right" :next-xxx
> property would need to be followed. At the same time the index could
> be marked as "dirty" so an async process could pick it up for cleaning
> at its convenience. We need to make sure this works for all types of
> conflicts. I.e. traversal in the face of multiple :next pointers
> doesn't accidentally drop a node. I don't think I have the full
> picture here yet.
Current situation

:index : {
    :start : { :next=n0 },
    :n0 : { :next=n2 },
    :n1 : { :next=NIL },
    :n2 : { :next=n1 }
}

What we're thinking around, let's say we have 2 cluster nodes

0) ContentStrategy (CS) on node A starts and elect a new ID as
previously said: `:next-1-1`

0.1) CS on node B starts and elect a new ID: `:next-2-2`

1) performing the operations we'll have something like

:index : {
    :start : {
        :next=,
        :next-1-1=,
        :next-2-2=
    }
    ...
    :nN : {
        :next=,
        :next-1-1=,
        :next-2-2=
    }
}

where we could have nodes with all the `:next*` as well as only the
`:next`; the one that should be full skiplist compliant.

2) OrderedIndexMerger (OIM) kicks in and starting from :start identify
all the `:next*` properties and in a way "mark" them for merging.

2.1) CS sees it marked and generate a new ID

3) OIM put the proper linking in place and delete the old property once
done.

As a way to "deliver the messages" around and giving the cluster the
chance to sync I was thinking:

2) OIM add a node under a proper structure

:index : {
    :tomerge : {
        1-1,
        2-2,
    }
}

2.1) before doing any linking operation each node check if the current
id is not under `:tomerge`. cluster sync and A sees 1-1 under :tomerge
mark for ACK and generate a new ID

:index : {
    :tomerge : {
        1-1 : { ack=true },
        2-2
    }
}

3) cluster sync, OIM as its duty inspect under `:tomerge` sees 1-1 with
ack and start putting it as final `:next`.

It's a bit complex but should work out any conflicts. Open point is:
what if a node is shutdown and could not ACK a merge? Probably working
out on a timeout. If it doesn't ACK in 3 cluster sync probably is dead
for us.

Serving searches could be done either by the `:next` stream or by the
`:next` and then merged on the fly with the current `:next-ID` in the
iterator while returning the resultset.

Any thoughts on the above?
> An alternative would be to change the linked list / skip list approach
> and use an ordered tree instead. This should also be free of conflicts
> and would also allow for skipping during traversal. As such trees
> might become unbalanced we'd also need some background process, which
> rebalances these trees. Rebalancing could occur on an as need basis:
> whenever an unbalanced tree is discovered when using an index, a
> balance operation would be scheduled.
I'm giving a look around at ordered/balanced trees (B-Tree in
particular) but I can't grasp on how we could have it working in our
situation without having to either link or sort in memory for siblings
and not risking a move of big sub-trees during insert/rebalancing. Plus
during rebalancing we risk collisions if we use pointers in the same way.

Any rough example on what you were thinking? Some quick stages like
empty, arrive 1, arrive 3, arrive 2, ... :)

D.


Mime
View raw message