From: Unico Hommes <unico.apache@gmail.com>
To: dev@jackrabbit.apache.org
Date: Thu, 22 Mar 2012 09:19:16 +0100
Subject: Data corruption knowledge sharing

Hi all,

As you may have noticed, I have been working on improving the consistency checker. There are quite a few improvements I want to make, and I will try as much as possible to create separate issues for them to maximise the transparency of what we want to achieve in this area. Until now we have had our own checker that is faster, checks for more kinds of inconsistency, and fixes more of them than the current checker in Jackrabbit. We'd like to donate and incorporate the knowledge and experience reflected in our own checker into the JR checker.

In parallel to that work, I'd like to initiate a knowledge-sharing discussion about the possible causes of data corruption in Jackrabbit. At Hippo we experience data corruption on a regular basis.
Although we have spent quite some time analysing the possible causes of this corruption, we have not yet been able to reduce it to our satisfaction.

Questions such as:

1. What are the scenarios in which data corruption is most likely to occur?
2. What are the best strategies to minimize the chance of data corruption in an application on top of JR?
3. Is the main cause of data corruption concurrent writes on multiple cluster nodes?
4. Or have you also seen significant problems in single-cluster setups?
5. Do you have ideas or suggestions on how JR itself can be improved to be more robust against data corruption?

One of the scenarios we've experienced that causes regular data corruption is concurrent add-node operations on the same folder from different cluster nodes.

For one use case in which such collisions can occur regularly, we've implemented a strategy to avoid this corruption by using separate folders for different cluster nodes. Of course this approach is not always an option, but in certain scenarios it seems to be a good solution (see the sketch in the P.S. below).

If you have similar experiences and/or know of other typical scenarios in which data corruption can occur, I'd love to hear about them.

Best regards,
Unico Hommes
Hippo BV, the Netherlands
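
P.S. To make the workaround above a bit more concrete, here is a rough sketch of what "separate folders for different cluster nodes" can look like at the JCR API level. The class name, node names, credentials, and the getClusterNodeId() helper are all hypothetical, not part of our actual code; the point is only that each cluster node adds children under a parent that no other cluster node writes to, so concurrent addNode calls never target the same parent.

    import javax.jcr.Node;
    import javax.jcr.Repository;
    import javax.jcr.RepositoryException;
    import javax.jcr.Session;
    import javax.jcr.SimpleCredentials;

    public class PerClusterNodeFolders {

        // Hypothetical helper: how a cluster node learns its own id is
        // deployment-specific (here it is simply read from a system property).
        private static String getClusterNodeId() {
            return System.getProperty("cluster.node.id", "node1");
        }

        // Adds a child node under a folder that only this cluster node writes
        // to, instead of under one shared parent used by the whole cluster.
        public static void addEntry(Repository repository, String name)
                throws RepositoryException {
            Session session = repository.login(
                    new SimpleCredentials("admin", "admin".toCharArray()));
            try {
                Node root = session.getRootNode();

                // Per-cluster-node parent instead of one shared "documents" folder.
                String folderName = "documents-" + getClusterNodeId();
                Node folder = root.hasNode(folderName)
                        ? root.getNode(folderName)
                        : root.addNode(folderName, "nt:unstructured");

                folder.addNode(name, "nt:unstructured");
                session.save();
            } finally {
                session.logout();
            }
        }
    }

Content that logically belongs in one folder can still be aggregated at read time (e.g. by querying across the per-node folders), which is why this only helps for use cases where the physical location of the child nodes does not matter.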
