From: Unico Hommes <unico.apache@gmail.com>
To: dev@jackrabbit.apache.org
Date: Thu, 22 Mar 2012 09:19:16 +0100
Subject: Data corruption knowledge sharing

Hi all,

As you may have noticed, I have been working on improving the consistency checker. There are quite a few improvements I want to make, and I will try as much as possible to create separate issues for them to maximise the transparency of what we want to achieve in this area. Until now we have had our own checker that is faster, checks for more kinds of inconsistency, and fixes more of them than the current checker in Jackrabbit. We'd like to donate and incorporate the knowledge and experience reflected in our own checker into the JR checker.

In parallel to that work, I'd like to initiate a knowledge-sharing discussion about the possible causes of data corruption in Jackrabbit. At Hippo we experience data corruption on a regular basis.
Although we have spent quite some time analysing the possible causes of this corruption, we have not yet been able to reduce it to our satisfaction.

Questions such as:

1. What are the scenarios in which data corruption is most likely to occur?
2. What are the best strategies to minimize the chance of data corruption in an application on top of JR?
3. Is the main cause of data corruption concurrent writes on multiple cluster nodes?
4. Or have you also seen significant problems in single-cluster setups?
5. Do you have ideas or suggestions on how JR itself can be improved to be more robust against data corruption?

One of the scenarios we've experienced that causes regular data corruption is concurrent add-node operations on the same folder from different cluster nodes.

For one use case in which such collisions can occur regularly, we've implemented a strategy to avoid this corruption by using separate folders for different cluster nodes. Of course this approach is not always an option, but in certain scenarios it seems to be a good solution (see the sketch in the P.S. below).

If you have similar experiences and/or know of other typical scenarios in which data corruption can occur, I'd love to hear about them.

Best regards,
Unico Hommes
Hippo BV, the Netherlands
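
P.S. To make the workaround above a bit more concrete, here is a rough sketch of what "separate folders for different cluster nodes" can look like at the JCR API level. The class name, node names, credentials, and the getClusterNodeId() helper are all hypothetical, not part of our actual code; the point is only that each cluster node adds children under a parent that no other cluster node writes to, so concurrent addNode calls never target the same parent.

    import javax.jcr.Node;
    import javax.jcr.Repository;
    import javax.jcr.RepositoryException;
    import javax.jcr.Session;
    import javax.jcr.SimpleCredentials;

    public class PerClusterNodeFolders {

        // Hypothetical helper: how a cluster node learns its own id is
        // deployment-specific (here it is simply read from a system property).
        private static String getClusterNodeId() {
            return System.getProperty("cluster.node.id", "node1");
        }

        // Adds a child node under a folder that only this cluster node writes
        // to, instead of under one shared parent used by the whole cluster.
        public static void addEntry(Repository repository, String name)
                throws RepositoryException {
            Session session = repository.login(
                    new SimpleCredentials("admin", "admin".toCharArray()));
            try {
                Node root = session.getRootNode();

                // Per-cluster-node parent instead of one shared "documents" folder.
                String folderName = "documents-" + getClusterNodeId();
                Node folder = root.hasNode(folderName)
                        ? root.getNode(folderName)
                        : root.addNode(folderName, "nt:unstructured");

                folder.addNode(name, "nt:unstructured");
                session.save();
            } finally {
                session.logout();
            }
        }
    }

Content that logically belongs in one folder can still be aggregated at read time (e.g. by querying across the per-node folders), which is why this only helps for use cases where the physical location of the child nodes does not matter.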
