jackrabbit-users mailing list archives

From Marcel Reutegger <marcel.reuteg...@gmx.net>
Subject Re: Problems with re-index a huge repository
Date Thu, 31 Aug 2006 20:12:15 GMT
Some more information on how the search index currently (1.0.1) works:

KÖLL Claus wrote:
> for me its strange that during the index process lucene creates
> about 600 - 700 directories under the index folder in the workspace
> directory and the redo.log is about 25Mb and then i get a
> outofmemoryexception. at the time of initial filling of the
> repository the merge of the index folders/files works fine but now
> it seems that the merger does not work.

The index merger works and is running, but because the whole re-index
update happens within a single transaction, the indexes that become
obsolete after an index merge are not deleted right away. If the
re-index process ran to completion, the obsolete indexes would be
deleted at the end.

I've changed this behaviour because deferring the deletion only makes
sense for regular index updates, not for a re-index update. See JCR
issue: http://issues.apache.org/jira/browse/JCR-554
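
To illustrate the difference between deferring the deletion of
obsolete indexes until the end of the transaction and deleting them
right after a merge, here is a minimal in-memory sketch. It is not
Jackrabbit's code: the class DeferredIndexCleanup, its methods and the
deleteImmediately flag are hypothetical names made up for this
example, and index directories are modeled as plain strings.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical model of index directory cleanup; not a Jackrabbit class. */
public class DeferredIndexCleanup {
    // Names of the index directories that currently exist "on disk".
    private final Set<String> onDisk = new LinkedHashSet<>();
    // Directories made obsolete by a merge but not yet deleted.
    private final List<String> obsolete = new ArrayList<>();
    // Assumption: true models deleting right after a merge,
    // false models waiting until the transaction ends.
    private final boolean deleteImmediately;
    private int counter = 0;

    public DeferredIndexCleanup(boolean deleteImmediately) {
        this.deleteImmediately = deleteImmediately;
    }

    /** Simulates flushing a batch of documents into a new index directory. */
    public String addIndex() {
        String name = "_" + (counter++);
        onDisk.add(name);
        return name;
    }

    /** Merges the given directories into a new one; the sources become obsolete. */
    public String merge(List<String> sources) {
        String merged = addIndex();
        for (String source : sources) {
            if (deleteImmediately) {
                onDisk.remove(source);
            } else {
                obsolete.add(source);
            }
        }
        return merged;
    }

    /** Ends the transaction and deletes everything that was marked obsolete. */
    public void commit() {
        onDisk.removeAll(obsolete);
        obsolete.clear();
    }

    public int directoriesOnDisk() {
        return onDisk.size();
    }
}
```

With deleteImmediately set to false, the number of directories only
shrinks when commit() is called, so a long re-index that never reaches
its commit keeps accumulating directories in this model.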

> if i restart the repository after the exception occurs the index
> folders/files will be merged into about 20-30 folders but the
> repository is not indexed whole.

This is because of a bug in the search index. When a re-indexing
process is interrupted and the repository is restarted afterwards, the
search index does not undo the partial re-index. This bug has also
been fixed with the changes for issue JCR-554. If a re-indexing
process is now interrupted, all index updates are undone and
re-indexing is started again.
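
The restart behaviour described above can be modeled roughly as
follows. This is only a sketch under assumptions: the class
ReindexRecovery, the reindexInProgress marker and the failAfter
parameter are hypothetical and exist only to simulate an interruption;
the actual changes made for JCR-554 may work differently.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of undoing a partial re-index; not a Jackrabbit class. */
public class ReindexRecovery {
    // Nodes that have been written to the (simulated) index.
    private final List<String> indexedNodes = new ArrayList<>();
    // Assumption: a marker that survives a restart and records an unfinished re-index.
    private boolean reindexInProgress = false;

    /**
     * Re-indexes all nodes. A non-negative failAfter simulates an
     * interruption (such as an OutOfMemoryError) after that many nodes;
     * a negative value means the run is not interrupted.
     */
    public void reindex(List<String> allNodes, int failAfter) {
        reindexInProgress = true;
        indexedNodes.clear();
        for (int i = 0; i < allNodes.size(); i++) {
            if (i == failAfter) {
                throw new IllegalStateException("simulated interruption");
            }
            indexedNodes.add(allNodes.get(i));
        }
        reindexInProgress = false;
    }

    /** On restart, undo a partial re-index and start it again from scratch. */
    public void startup(List<String> allNodes) {
        if (reindexInProgress) {
            indexedNodes.clear();
            reindex(allNodes, -1);
        }
    }

    public int indexedCount() {
        return indexedNodes.size();
    }
}
```

Without the check in startup(), the model would keep the partially
filled index after a restart, which corresponds to the symptom of a
repository that is not indexed completely.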

regards
  marcel
