jackrabbit-users mailing list archives

From "Cech. Ulrich" <Ulrich.C...@aeb.de>
Subject Indexer behaves differently while starting the repository?
Date Wed, 16 Mar 2011 10:14:02 GMT
Hello,

I have a "funny" problem with the Jackrabbit indexer mechanism. We store text files with
a file size between 2 and 170 MB. Xmx is set to 850, and the indexer works without problems
when storing the stream in the Jackrabbit data store (FileDataStore). Up to this point
everything is OK.
But if I delete the workspace index directory so that Jackrabbit rebuilds it on the next
startup, the indexer starts, processes some files, and then throws a
java.lang.OutOfMemoryError: Java heap space.

Can someone tell me what the difference is between "indexing while storing" and "(re)indexing
while starting up the repository"?

Thank you very much for any hint,
Best regards,
Ulrich

I have appended the stack trace here:

2011-03-16 11:04:36,916 WARN : [LazyTextExtractorField] Failed to extract text from a binary property
java.lang.OutOfMemoryError: Java heap space
            at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:99)
            at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:518)
            at java.lang.StringBuilder.append(StringBuilder.java:190)
            at org.apache.jackrabbit.core.query.lucene.LazyTextExtractorField$ParsingTask.characters(LazyTextExtractorField.java:191)
            at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
            at org.apache.tika.sax.SecureContentHandler.characters(SecureContentHandler.java:153)
            at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
            at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
            at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
            at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
            at org.apache.tika.sax.SafeContentHandler.access$001(SafeContentHandler.java:39)
            at org.apache.tika.sax.SafeContentHandler$1.write(SafeContentHandler.java:61)
            at org.apache.tika.sax.SafeContentHandler.filter(SafeContentHandler.java:113)
            at org.apache.tika.sax.SafeContentHandler.characters(SafeContentHandler.java:151)
            at org.apache.tika.sax.XHTMLContentHandler.characters(XHTMLContentHandler.java:261)
            at org.apache.tika.parser.txt.TXTParser.parse(TXTParser.java:132)
            at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:197)
            at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:197)
            at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:197)
            at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:137)
            at org.apache.jackrabbit.core.query.lucene.JackrabbitParser.parse(JackrabbitParser.java:192)
            at org.apache.jackrabbit.core.query.lucene.LazyTextExtractorField$ParsingTask.run(LazyTextExtractorField.java:174)
            at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:417)
            at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:269)
            at java.util.concurrent.FutureTask.run(FutureTask.java:123)
            at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:65)
            at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:168)
            at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:650)
            at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:675)
            at java.lang.Thread.run(Thread.java:595)
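
The trace shows the failure inside StringBuilder.append, called from the characters() callback of the extraction task, so the extracted text is apparently being accumulated in memory. Assuming single-byte text, a 170 MB file yields about 170 million chars, which is roughly 340 MB as a Java char array, and each time the builder grows the old and new arrays briefly exist side by side. The sketch below is not Jackrabbit's code; it is a hypothetical SAX handler (the class name BoundedTextCollector and the maxChars parameter are made up for illustration) showing how such a callback could cap what it keeps in memory.

```java
import org.xml.sax.helpers.DefaultHandler;

/**
 * Hypothetical illustration only; this is NOT Jackrabbit's
 * LazyTextExtractorField. It collects SAX character events into a
 * StringBuilder, but never keeps more than maxChars characters.
 */
class BoundedTextCollector extends DefaultHandler {

    private final StringBuilder text = new StringBuilder();
    private final int maxChars;   // assumed limit, chosen by the caller
    private boolean truncated;

    BoundedTextCollector(int maxChars) {
        this.maxChars = maxChars;
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Space left before the cap is reached (never negative).
        int room = Math.max(maxChars - text.length(), 0);
        if (length > room) {
            truncated = true;     // remember that input was dropped
            length = room;
        }
        text.append(ch, start, length);
    }

    String getText() {
        return text.toString();
    }

    boolean isTruncated() {
        return truncated;
    }
}
```

A parser would call characters() repeatedly with chunks of the document; once the cap is reached, further chunks are discarded instead of growing the buffer. Whether truncating indexed text is acceptable depends on the application.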
