lucene-dev mailing list archives

From Mathieu Lecarme <math...@garambrogne.net>
Subject Re: Optimise Indexing time using lucene..
Date Wed, 09 Apr 2008 07:53:19 GMT
lucene4varma wrote:
> Hi all,
>
> I am new to Lucene and am using it for text search in my web application,
> and for that I need to index records in a database.
> We are using a JDBC directory to store the indexes. The problem is that when I
> start the process of indexing the records for the first time, it takes
> a huge amount of time. Following is the code for indexing.
>
> rs = st.executeQuery(); // returns 2 million records
> while (rs.next()) {
>     // create java object .............
>     // index java record into JDBC directory...
> }
>
> The above process takes a huge amount of time for 2 million records:
> approximately 3-4 business days to run.
> Can anyone please suggest an approach by which I could cut down this
> time?
>   
A JDBC directory is not a good idea. It is only useful when you need a
central repository.
Use a large maxBufferedDocs in your IndexWriter, so that more documents
are buffered in RAM before each flush to the directory.
With a large amount of data, you will hit bottlenecks: database reading,
index writing, RAM for buffered docs, maybe CPU.
If your database reading is heavy and you are in a hurry, you can shard the
index across multiple computers and, when it's finished, merge all the
indexes, with champagne.
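
To make the sharding idea concrete, here is a minimal sketch in plain Java (no Lucene calls) of how the records could be split between machines. It assumes the table has a numeric primary key covering a known range, which the original post does not say; ShardPlanner and splitRange are hypothetical names used only for illustration. Each machine would read only its own id range and build its own local index, and the per-shard indexes would be merged at the end.

```java
import java.util.ArrayList;
import java.util.List;

public class ShardPlanner {

    /**
     * Splits the inclusive id range [minId, maxId] into at most `shards`
     * contiguous, non-overlapping ranges of nearly equal size.
     * Assumption: ids are numeric and spread fairly evenly over the range.
     */
    public static List<long[]> splitRange(long minId, long maxId, int shards) {
        if (shards <= 0 || maxId < minId) {
            throw new IllegalArgumentException("need shards > 0 and maxId >= minId");
        }
        long total = maxId - minId + 1;
        long base = total / shards;   // minimum number of ids per shard
        long extra = total % shards;  // the first `extra` shards get one more id
        List<long[]> ranges = new ArrayList<>();
        long start = minId;
        for (int i = 0; i < shards && start <= maxId; i++) {
            long size = base + (i < extra ? 1 : 0);
            long end = start + size - 1;
            ranges.add(new long[] {start, end});
            start = end + 1;
        }
        return ranges;
    }

    public static void main(String[] args) {
        // 2 million records split across 4 indexing machines (hypothetical numbers).
        List<long[]> ranges = splitRange(1, 2_000_000, 4);
        for (int i = 0; i < ranges.size(); i++) {
            long[] r = ranges.get(i);
            System.out.println("shard " + i + ": ids " + r[0] + " to " + r[1]);
        }
    }
}
```

Contiguous ranges keep each machine's database query a simple range scan on the key. If the ids are sparse or skewed, splitting on id modulo the shard count may balance the load better.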

M.


