lucene-java-user mailing list archives

From ryanb <ryanbl...@everlaw.com>
Subject OutOfMemoryError indexing large documents
Date Wed, 26 Nov 2014 00:39:20 GMT
Hello,

We use vanilla Lucene 4.9.0 on 64-bit Linux. We sometimes need to index
large documents (100+ MB), but this results in extremely high memory usage,
to the point of OutOfMemoryError even with a 17 GB heap. We allow up to 20
documents to be indexed simultaneously, but the text to be analyzed and
indexed is streamed, not loaded into memory all at once.
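
For concreteness, here is a minimal sketch of the streaming pattern we mean
(the field name and method signature are simplified placeholders, not our
exact code):

    import java.io.IOException;
    import java.io.Reader;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.TextField;
    import org.apache.lucene.index.IndexWriter;

    class StreamedIndexing {
        // Reader-valued TextField: tokens are pulled from the stream during
        // analysis, so the full text is never materialized as one String.
        static void indexStreamed(IndexWriter writer, Reader text)
                throws IOException {
            Document doc = new Document();
            doc.add(new TextField("body", text));
            writer.addDocument(doc);
        }
    }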

Any suggestions for how to troubleshoot or ideas about the problem are
greatly appreciated!

Some details about our setup (let me know what other information would
help); see the sketch after this list:
- Use MMapDirectory wrapped in an NRTCachingDirectory
- RAM buffer size of 64 MB
- No compound files
- Commit every 20 seconds
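
A rough sketch of the equivalent configuration (the index path, analyzer,
NRT cache sizes, and error handling are illustrative placeholders, not our
exact values):

    import java.io.File;
    import java.io.IOException;
    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;
    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.index.IndexWriterConfig;
    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.MMapDirectory;
    import org.apache.lucene.store.NRTCachingDirectory;
    import org.apache.lucene.util.Version;

    class IndexSetupSketch {
        static IndexWriter openWriter(File indexDir) throws IOException {
            // MMapDirectory wrapped in an NRTCachingDirectory
            // (the 5 MB / 60 MB cache sizes are examples)
            Directory dir = new NRTCachingDirectory(
                new MMapDirectory(indexDir), 5.0, 60.0);
            IndexWriterConfig iwc = new IndexWriterConfig(
                Version.LUCENE_4_9, new StandardAnalyzer(Version.LUCENE_4_9));
            iwc.setRAMBufferSizeMB(64.0);   // 64 MB RAM buffer
            iwc.setUseCompoundFile(false);  // no compound files
            return new IndexWriter(dir, iwc);
        }

        // Commit every 20 seconds from a background thread.
        static void scheduleCommits(final IndexWriter writer) {
            ScheduledExecutorService scheduler =
                Executors.newSingleThreadScheduledExecutor();
            scheduler.scheduleAtFixedRate(new Runnable() {
                public void run() {
                    try {
                        writer.commit();
                    } catch (IOException e) {
                        // log and carry on (placeholder error handling)
                    }
                }
            }, 20, 20, TimeUnit.SECONDS);
        }
    }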

Thanks,
Ryan


