lucene-java-user mailing list archives

From "Michael D. Curtin" <m...@curtin.com>
Subject Re: Does more memory help Lucene?
Date Mon, 12 Jun 2006 12:49:53 GMT
Nadav Har'El wrote:

> What I couldn't figure out how to use, however, was the abundant memory (2
> GB) that this machine has.
> 
> I tried playing with IndexWriter.setMaxBufferedDocs(), and noticed that
> there is no speed gain after I set it to 1000, at which point the running
> Lucene takes up just 70 MB of memory, or 140 MB for the two threads.
> 
> Is there a way for Lucene to make use of the much larger memory I have, to
> speed up the indexing process? Does having a huge memory somehow improve
> the speed of huge merges, for example?

It may not be a Lucene limit per se, but a JVM limit instead.  What are you 
using for the JVM's heap (via -Xms and -Xmx switches)?  For example, I often 
run with java -Xmx1000m to let the heap grow to a gigabyte, if necessary.
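
For instance, a launch line like this (the class name and sizes here are
just placeholders, not taken from your setup):

    java -Xms512m -Xmx1500m MyIndexer

and a quick sanity check inside the program that the JVM actually got what
you asked for:

    System.out.println("max heap: "
        + Runtime.getRuntime().maxMemory() / (1024 * 1024) + " MB");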

Might I also suggest that you not try to index all of this data in a single 
invocation of a Java program.  That is, index a portion, say 10 GB at a time, 
into its own index, and then use IndexWriter.addIndexes() later to bring the 
pieces together.  Set the chunk size based on how much work you can stand to 
redo when the seemingly inevitable problems crop up.  It sure would stink to 
be working on the 1000th GB of input data, have a power supply go out, and 
then have to start all the way over from the beginning!  Other checkpointing 
schemes are possible too, if you have the time and inclination to be more 
clever ...
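
Roughly, the merge step could look like the sketch below.  This is only an
illustration against the IndexWriter/FSDirectory API as it stands in current
Lucene (1.9/2.0); the paths and class name are made up:

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.FSDirectory;

    public class MergeChunks {
        public static void main(String[] args) throws Exception {
            // args: paths of the chunk indexes built by earlier runs,
            // e.g. /data/index-chunk0 /data/index-chunk1 ...
            Directory[] chunks = new Directory[args.length];
            for (int i = 0; i < args.length; i++) {
                chunks[i] = FSDirectory.getDirectory(args[i], false); // open existing
            }

            // Create the final index and pull every chunk into it.
            // addIndexes() merges the segments and optimizes the result.
            IndexWriter writer = new IndexWriter(
                    FSDirectory.getDirectory("/data/index-final", true),
                    new StandardAnalyzer(), true);
            writer.addIndexes(chunks);
            writer.close();
        }
    }

Each chunk run is then its own java invocation, so a crash only costs you
the chunk that was in progress.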

Good luck!

--MDC

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

