lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Glen Newton <>
Subject Re: Performance tips when creating a large index from database.
Date Thu, 22 Oct 2009 12:52:28 GMT
You might want to consider using LuSql, which is a high performance,
multithreaded, well documented tool designed specifically for moving
data from a JDBC database into Lucene (you didn't say if it was a
JDBC-accessible db...)

Disclosure: I am the author of LuSql.

-Glen Newton

2009/10/22 Paul Taylor <>:
> I'm building a lucene index from a database, creating 1 about 1 million
> documents, unsuprisingly this takes quite a long time.
> I do this by sending a query  to the db over a range of ids , (10,000)
> records
> Add these results in Lucene
> Then get next 10,0000 and so on.
> When completed indexing I then call optimize()
> I also set  indexWriter.setMaxBufferedDocs(1000) and
>  indexWriter.setMergeFactor(3000) but don't fully understand these values.
> Each document contains about 10 small fields
> I'm looking for some ways to improve performance.
> This index writing is single threaded, is there a way I can multi-thread
> writing to the indexing ?
> I only call optimize() once at the end, is the best way to do it.
> I'm going to run a profiler over the code, but are there any rules of thumbs
> on the best values to set for MaxBufferedDocs and Mergefactor()
> thanks Paul
> ---------------------------------------------------------------------
> To unsubscribe, e-mail:
> For additional commands, e-mail:



To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message