lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Mekin Maheshwari" <meki...@gmail.com>
Subject Indexing slows down considerably after a few million documents
Date Fri, 27 Oct 2006 05:39:17 GMT
I am creating an index of about 7 Million documents.
The total size of the index is about 2.7G once indexing is done.

For the 1st 3Million documents, the indexer takes about 3 hours (can i
get better than this? )
 - 4 seconds per thousand documents

After this it slows down terribly and takes about 20 seconds for every
thousand documents.


It doesnt seem to be a data issue, as I tried starting to create the
index from the 3Millionth document & I get the initial speed (1k docs
in 3 secs)

What could be going wrong?
I have tried a few things, but cant quickly try out a lot of things as
it takes 3 hours before the slow down happens.

Below is the relevant pieces of the code::

Thanks for the help,
mekin

IndexWriter writer = new IndexWriter(dirname, analyzer,doTotalIndex );
writer.setMergeFactor(1000);

while(more records to get){
 Document doc = new Document();
//... add fields to doc
//add boost to doc
//
 writer.addDocument(doc);
}

writer.optimize();
writer.close();

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message