lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Peter Becker <>
Subject Re: parallel index building & searching multiple indexes
Date Mon, 11 Aug 2003 23:36:33 GMT
Kevin A. Burton wrote:

> Killeen, Tom wrote:
>> I am attempting to create approx 10 different Lucene indexes.  I'm 
>> trying to
>> create them at the same time by running multiple processes and each 
>> index is
>> written to a new directory.  Once I create more than one process - the
>> performance is very, very slow.   
> If they are IDE disks you should NOT try to build multiple indexes at 
> once since your disks will bottleneck.
> If you are using SCSI/Fibre disks you should be able to write out 
> multiple indexes at once but you should benchmark it to make sure that 
> it is in fact faster. 

It is a downloadable application, so I don't know what the disks are ;-) 
I guess a common scenario is that someone indexes local and network 
directories. Then the parallel execution probably doesn't hurt, it might 
actually help. Of course it would be smarter to index server-side, but 
that is not yet implemented.

For the little application we wrote the indexing speed doesn't seem too 
important anyway. And it allows updating the index which means that 
benchmarking is harder -- although it probably also means that accessing 
multiple locations on the same drive hurts more, since the CPU is most 
likely used less in comparison to the disks if a document is up-to-date. 
But I can live with it, so I rather keep it for now. Maybe one day we 
change from the threaded model we have now to a queued model, but I am 
not sure if our users would be happy with it, since it means they either 
have to wait for the just requested update until everything else is 
finished, or we would have to add controls for the processing order, 
which bloats the UI.


View raw message