lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Shane <lucene-u...@my-family.us>
Subject Re: Indexing slows down considerably after a few million documents
Date Fri, 27 Oct 2006 14:15:29 GMT
Are you doing all 7 million docs with the same writer?  The call to 
optimize will take longer as your index size increases.  So if you are 
actually indexing your docs in smaller chunks, the speed will decrease 
due to the call to optimize.

Mekin Maheshwari wrote:
> I am creating an index of about 7 Million documents.
> The total size of the index is about 2.7G once indexing is done.
>
> For the 1st 3Million documents, the indexer takes about 3 hours (can i
> get better than this? )
> - 4 seconds per thousand documents
>
> After this it slows down terribly and takes about 20 seconds for every
> thousand documents.
>
>
> It doesnt seem to be a data issue, as I tried starting to create the
> index from the 3Millionth document & I get the initial speed (1k docs
> in 3 secs)
>
> What could be going wrong?
> I have tried a few things, but cant quickly try out a lot of things as
> it takes 3 hours before the slow down happens.
>
> Below is the relevant pieces of the code::
>
> Thanks for the help,
> mekin
>
> IndexWriter writer = new IndexWriter(dirname, analyzer,doTotalIndex );
> writer.setMergeFactor(1000);
>
> while(more records to get){
> Document doc = new Document();
> //... add fields to doc
> //add boost to doc
> //
> writer.addDocument(doc);
> }
>
> writer.optimize();
> writer.close();
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message