jackrabbit-oak-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Jukka Zitting <jukka.zitt...@gmail.com>
Subject Re: Handling copies and moves with tree diffs
Date Tue, 27 Nov 2012 14:43:41 GMT
Hi,

On Tue, Nov 27, 2012 at 12:56 PM, Jukka Zitting <jukka.zitting@gmail.com> wrote:
> I did some work along these lines already earlier with the
> CopyAndMoveAwareJsopDiff class you can find inside KernelRootBuilder,
> but couldn't make it work properly. Based on discussions with Michael
> last week I think we should be able to come up with an algorithm that
> works pretty well for this purpose. More on that in follow-ups.

Here's one simple algorithm that should be able to correctly handle up
to a single copy or move per commit. First the handling of changes in
KernelNodeBuilder:

A1) In KernelRootBuilder we keep track of the path of the *target* of
a copy or move operation. Until such an operation occurs, that path
remains null.

A2) In KernelNodeBuilder.setNode() we check if the new child NodeState
is based on the same MK revision as the parent but has a different
path than the identified new child (and is not one of its
descendants).

A21) If not, we use normal tree diff to find out which basic
add/set/remove operations to update the builder subtree to match that
NodeState.

A22) If yes, we update the target path in KernelRootBuilder to point
to the new child node and use the given NodeState as the base of that
builder subtree.

If A22 is  reached when the target path in KernelRootBuilder is
already set, all changes are automatically purged to a branch and the
KernelRootBuilder reset to the new purged state before continuing the
A22 step. Changes are also purged whenever a predefined purge limit is
reached or commit is requested from a higher level. The JSOP diff for
that purge commit is constructed as follows:

B1) If the target path in KernelRootBuilder is set (and the
copied/moved node at that path still exists), we first construct the
relevant copy or move operation:

B11) If the target bath already existed in the previous revision,
we're dealing with a replacement. Emit a JSOP delete instruction for
that path.

B12) If the original path of that subtree still exists, we're dealing
with a copy operation. Emit  the JSOP copy instruction.

B13) If the original path no longer exists, we're dealing with a move
operation. Emit the JSOP move operation and keep track of the original
removed path.

B2) Process the rest of the changes using the normal JsopDiff
mechanism with the following modifications:

B21) When encountering the target of the copy/move operation, instead
of processing it normally as an added or modified subtree do a
separate JsopDiff against the base state from the original source path
to capture any post-copy/move changes to that subtree.

B22) When encountering the removed path from B13, suppress the normal
JSOP delete instruction that would have been generated by the standard
JsopDiff class.

It should be straightforward to generalize this algorithm to cover any
number of *unconflicting* copies/moves in a single commit (conflicting
changes would still need to be handled as a sequence of purged branch
commits). Most notably this approach should be able to accurately map
all JCR-level direct-to-workspace copy/move operations (including more
complex things like checkin) to single MK commits.

BR,

Jukka Zitting

Mime
View raw message