jackrabbit-dev mailing list archives

From Marcel Reutegger <marcel.reuteg...@gmx.net>
Subject Re: Consistency check of indexes takes to long
Date Tue, 14 Aug 2007 13:19:43 GMT
Hi Christoph,

Christoph Kiehl wrote:
> we've got very big indexes/workspaces on our production servers which 
> have from 3,000,000 to 8,000,000 nodes and are still growing because of 
> creation of versions and adding new nodes.
> When the VM in which Jackrabbit lives crashes during 
> a write operation, Jackrabbit nicely applies the redo log on a restart, 
> which gets done quite quickly, but then starts its consistency check. This 
> check takes from 30 minutes to 2 hours depending on the repository. In 
> this time our application is offline which we would of course like to 
> avoid ;) Our system uses a bundle oracle pm which probably doesn't make 
> things better.
> I had a quick glance at the consistency check code and it seems like 
> there is nothing that could be substantially optimized in that place. I 
> thought it might be possible to just include those index segments that 
> were used while replaying the redo log, but given how the consistency 
> check works this is impossible.
> I think the only way to speed up startup is to avoid the occurrence of the 
> errors that the check is looking for in the first place. Since the redo log 
> mechanism seems quite good I'm not sure if those errors 
> (MissingAncestor, MultipleEntries, NodeDeleted, UnknownParent) can still 
> occur. Could you maybe elaborate on the situations where you expect 
> those errors to arise?

IIRC the consistency check was introduced in Jackrabbit first and the redo 
log mechanism later, which makes the consistency check kind of superfluous.

> For now I'm thinking about disabling consistency checks altogether by 
> default and running them in a maintenance window at night. Unfortunately 
> this might be a bit dangerous if parts of the application rely on 
> certain nodes to be found by queries :/

I agree with you. We could introduce a third configuration value for 
forceConsistencyCheck (in addition to 'true' and 'false'): 'disabled'. That would 
then be the default in the next released version of Jackrabbit.
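
A minimal sketch of how such a three-valued setting could be interpreted. This is not existing Jackrabbit code. The enum name ConsistencyCheckMode, its constants and both methods are hypothetical, and I'm assuming that 'true' means "check on every startup" and 'false' means "check only after the redo log had to be applied":

```java
import java.util.Locale;

// Hypothetical tri-state setting; not part of the Jackrabbit API.
enum ConsistencyCheckMode {
    ALWAYS,       // 'true': run the check on every startup
    AFTER_CRASH,  // 'false': run it only after a redo log was applied (assumed meaning)
    DISABLED;     // 'disabled': never run it on startup

    // Maps the configured string to a mode. A missing value falls back
    // to DISABLED, which is the proposed new default.
    static ConsistencyCheckMode parse(String value) {
        if (value == null) {
            return DISABLED;
        }
        switch (value.trim().toLowerCase(Locale.ROOT)) {
            case "true":
                return ALWAYS;
            case "false":
                return AFTER_CRASH;
            case "disabled":
                return DISABLED;
            default:
                throw new IllegalArgumentException(
                        "Unsupported consistency check setting: " + value);
        }
    }

    // Decides whether the check should run during startup.
    boolean shouldRun(boolean redoLogApplied) {
        switch (this) {
            case ALWAYS:
                return true;
            case AFTER_CRASH:
                return redoLogApplied;
            default:
                return false;
        }
    }
}
```

With 'disabled', startup after a crash would only replay the redo log, and the check itself could still be triggered separately in a maintenance window.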


