jackrabbit-users mailing list archives

From KÖLL Claus <C.KO...@TIROL.GV.AT>
Subject AW: problems with re-indexing the workspace
Date Mon, 28 Aug 2006 12:11:35 GMT
Hi Jukka,

In my case (as described in another mail: a repository with 2 million documents) I use these
parameters for the Lucene indexer when re-indexing the workspace:

<param name="minMergeDocs" value="1000"/>
<param name="maxMergeDocs" value="1000000"/>
<param name="mergeFactor" value="10"/> 

So in my opinion the merger should work like this: when there are 10 index folders with
1000 nodes each, they are merged into a single index folder with 1000*10 nodes. The same
happens when there are 10 index folders with 1000*10 nodes each, and so on until we reach
the maxMergeDocs size.
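
The cascade described above can be modelled roughly as follows. This is only a sketch of the idealized behaviour, not Jackrabbit's or Lucene's actual merger code; the function name `simulate_merges` and the assumption that every flush produces a segment of exactly minMergeDocs nodes are made up for illustration.

```python
from typing import List, Tuple


def simulate_merges(total_docs: int,
                    min_merge_docs: int = 1000,
                    max_merge_docs: int = 1_000_000,
                    merge_factor: int = 10) -> Tuple[List[int], int]:
    """Idealized model of the merge cascade (hypothetical helper).

    Returns the final segment sizes (oldest first) and the largest number
    of segments seen after any flush-and-merge step.
    """
    segments: List[int] = []
    peak = 0
    full_flushes, remainder = divmod(total_docs, min_merge_docs)

    for _ in range(full_flushes):
        # Assumption: each flush writes one segment of min_merge_docs nodes.
        segments.append(min_merge_docs)
        level = min_merge_docs
        # Merge while the newest merge_factor segments share the same size
        # and the merged segment would not exceed max_merge_docs.
        while level * merge_factor <= max_merge_docs:
            tail = segments[-merge_factor:]
            if len(tail) == merge_factor and all(s == level for s in tail):
                del segments[-merge_factor:]
                level *= merge_factor
                segments.append(level)
            else:
                break
        peak = max(peak, len(segments))

    if remainder:
        # Leftover documents form one smaller segment at the end.
        segments.append(remainder)
        peak = max(peak, len(segments))
    return segments, peak


final, peak = simulate_merges(2_000_000)
print(final, peak)  # → [1000000, 1000000] 28
```

Under these assumptions the index should never hold more than a few dozen folders at once, which is far below the roughly 700 folders seen during the re-index.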

When I re-index the repository, it runs for about 6-7 hours and then I get the OutOfMemoryError,
with about 700 index folders on disk.
I cannot understand why there are so many index folders.
After the error occurs I restart the repository without deleting the index folders, and
they get merged into about 40-90 folders.

Maybe there is a bug in the merger, so that it does not work properly while the filters scan
the documents during a re-index.
If I put some documents into the repository again, the merger works fine when I call the save
method on the session.

I don't know how to get a useful debug trace from the re-index process, because after about
7 hours the log file is very large if I enable the debug level in log4j.
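
One way to cope with a very large debug log is to keep only the lines leading up to the first error. The sketch below streams the log and returns the last few hundred lines before the first line containing `OutOfMemoryError`. The function name `context_before_error`, the marker string, and the window size are assumptions for illustration; the actual log layout depends on the log4j configuration.

```python
from collections import deque
from typing import Iterable, List


def context_before_error(lines: Iterable[str],
                         marker: str = "OutOfMemoryError",
                         keep: int = 200) -> List[str]:
    """Return the last `keep` lines up to and including the first line
    that contains `marker`, or an empty list if the marker never appears.

    Hypothetical helper: only a bounded window is held in memory, so the
    size of the log file does not matter.
    """
    window: deque = deque(maxlen=keep)
    for line in lines:
        window.append(line)
        if marker in line:
            return list(window)
    return []
```

It could be called with an open file object, for example `context_before_error(open("jackrabbit.log"))`, where the file name is only a placeholder.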

claus
-----Original Message-----
From: Jukka Zitting [mailto:jukka.zitting@gmail.com]
Sent: Monday, 28 August 2006 13:51
To: users@jackrabbit.apache.org
Subject: Re: problems with re-indexing the workspace

Hi,

On 8/28/06, Christian Zanata <christian.zanata@wavegroup.it> wrote:
> [ERROR] 20060825 17:06:40
> (org.apache.jackrabbit.core.observation.ObservationManagerFactory) -
> Synchronous EventConsumer threw exception. java.lang.OutOfMemoryError
>
> This error seems to happen when the repository tries to re-index the
> workspace, but we don't have more stack traces.
> [...]
> could anybody help us to understand what's happening?

There are two likely causes for that; either Lucene is running out of
memory while merging the index segments, or one of the index filters
runs out of memory trying to parse one of the binary documents in the
repository. Without a complete stack trace it is difficult to
determine the exact cause of the problems.

You might want to try modifying the Lucene parameters in the
SearchIndex configuration. See the Lucene documentation for options
that affect memory usage.

BR,

Jukka Zitting

-- 
Yukatan - http://yukatan.fi/ - info@yukatan.fi
Software craftsmanship, JCR consulting, and Java development
