Return-Path: X-Original-To: apmail-jackrabbit-dev-archive@www.apache.org Delivered-To: apmail-jackrabbit-dev-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id A0B3091B6 for ; Mon, 6 Feb 2012 12:12:43 +0000 (UTC) Received: (qmail 6250 invoked by uid 500); 6 Feb 2012 12:12:43 -0000 Delivered-To: apmail-jackrabbit-dev-archive@jackrabbit.apache.org Received: (qmail 6187 invoked by uid 500); 6 Feb 2012 12:12:42 -0000 Mailing-List: contact dev-help@jackrabbit.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@jackrabbit.apache.org Delivered-To: mailing list dev@jackrabbit.apache.org Received: (qmail 6180 invoked by uid 99); 6 Feb 2012 12:12:42 -0000 Received: from athena.apache.org (HELO athena.apache.org) (140.211.11.136) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 06 Feb 2012 12:12:42 +0000 X-ASF-Spam-Status: No, hits=-1.3 required=5.0 tests=FRT_ADOBE2,RCVD_IN_DNSWL_MED,SPF_PASS X-Spam-Check-By: apache.org Received-SPF: pass (athena.apache.org: domain of aklimets@adobe.com designates 64.18.1.39 as permitted sender) Received: from [64.18.1.39] (HELO exprod6og117.obsmtp.com) (64.18.1.39) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 06 Feb 2012 12:12:35 +0000 Received: from outbound-smtp-1.corp.adobe.com ([192.150.11.134]) by exprod6ob117.postini.com ([64.18.5.12]) with SMTP ID DSNKTy/DnrDbPoU5Uctvry1jH+RNZ6MJcCRv@postini.com; Mon, 06 Feb 2012 04:12:14 PST Received: from inner-relay-4.eur.adobe.com (inner-relay-4.adobe.com [193.104.215.14]) by outbound-smtp-1.corp.adobe.com (8.12.10/8.12.10) with ESMTP id q16CAI0Y018098 for ; Mon, 6 Feb 2012 04:10:19 -0800 (PST) Received: from nahub01.corp.adobe.com (nahub01.corp.adobe.com [10.8.189.97]) by inner-relay-4.eur.adobe.com (8.12.10/8.12.9) with ESMTP id q16CCBPl000264 for ; Mon, 6 Feb 2012 04:12:12 -0800 (PST) Received: from eurcas01.eur.adobe.com (10.128.4.27) by nahub01.corp.adobe.com (10.8.189.97) with Microsoft SMTP Server (TLS) id 8.3.192.1; Mon, 6 Feb 2012 04:12:11 -0800 Received: from eurmbx01.eur.adobe.com ([10.128.4.32]) by eurcas01.eur.adobe.com ([10.128.4.27]) with mapi; Mon, 6 Feb 2012 12:12:09 +0000 From: Alexander Klimetschek To: "dev@jackrabbit.apache.org" Date: Mon, 6 Feb 2012 12:11:50 +0000 Subject: Re: [jr3 microkernel] Change log consolidation Thread-Topic: [jr3 microkernel] Change log consolidation Thread-Index: AczkyIwqf++7W06BSXS3lRGfe3wjxQ== Message-ID: In-Reply-To: Accept-Language: de-DE, en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: user-agent: Microsoft-MacOutlook/14.14.0.111121 acceptlanguage: de-DE, en-US Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 >From looking at the use cases I agree with Thomas - in reality you probably don't reduce much for an individual change list, since these are typically adding one or few nodes + a bunch of properties, or removing a few, but not necessarily mixing multiple move, add & remove operations in a single session. For some edge cases (as the mentioned move + move back) it is obviously useful and including it should not hurt, if the algorithm is proven. The more interesting problem is the conflict handling against a committed changelist, or a concurrently applied changelist (thinking of a distributed cluster). Maybe the same approach using operations to merge concurrent changes (instead of operations vs. persisted state) can be used here. If you don't have to look at the state (i.e. possibly the entire repository), things might be faster or less memory intensive. This sounds similar to the Operational transformation (OT) approach [0] used for concurrent text editors. However, OT turns out to be quite difficult to formally prove even for insert/remove actions in a text, when concurrency comes into play. With hierarchy operations, this might even be more complex. But maybe worth a thought at least (and more Mathematicians like Michi :-) on the team). [0] http://en.wikipedia.org/wiki/Operational_transformation Cheers, Alex On 06.02.12 12:30, "Thomas Mueller" wrote: >Hi, > >>An alternative that AFAICT achieves the same effect in perhaps an >>easier to understand manner would be to use the state of the content >>tree instead of changes to that state as the key concept. Instead of a >>commit(changeLog, baseState) method we could have a commit(newState, >>baseState) method for persisting changes. > >That's what I implemented for my first prototype (sandbox/jackrabbit-j3). > >>PS. You might already have discussed and dismissed this idea off-list. >>If yes, what was the deal-breaker? > >I think the deal breaker is the API, which currently uses the 'diff' style >instead of the 'absolute target state' style. > >For me, it doesn't matter. Either way is fine (diff style or sending the >full state). The diff style requires a bit less data to be transferred I >guess. > >However, I personally wouldn't consolidate the change log currently. I >don't think it's necessary, and for the normal case (which is _not_ a >randomly generated change log, but a manually generated one) I don't think >it will help a lot. Also, I don't currently see that would help a lot >resolving conflicts. The conflicts it can resolve seem to be weird cases >(my opinion, from what I know so far), which are not that important to be >resolved. Why would somebody move a node twice? If he does, it's his >problem (the applications problem). And why would we want to avoid a >conflict if somebody deletes the intermediate node in the meantime? What I >mean is, failure to merge such a case would be just fine. Maybe there are >other, more important cases that I currently don't know about. > >I would probably not consolidate the change log, because it simpler not >to. Unless it turns out to be a problem in practice (not just in theory). >Still, it is an interesting idea. > >Regards, >Thomas > > --=20 Alexander Klimetschek Developer // Adobe (Day) // Berlin - Basel