lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Sudarsan, Sithu D." <Sithu.Sudar...@fda.hhs.gov>
Subject RE: Multi -threaded indexing of large number of PDF documents
Date Fri, 14 Nov 2008 16:57:35 GMT
 
Hi All,

Based on your valuable inputs, we tried a few experiments with number of
threads. The observation is, if the number of threads are one less than
the number of cores (we have 'main' as a separate thread. Essentially,
including 'main' number of threads equal to number of cores), the
indexing performance reaches the optimum level, with maximum CPU
utilization. We have tried with Windows XP as well as CentOS. 

As regards to number of documents to be indexed and size of the
documents, there seems to be some correlation  between the two. But we
are yet to ascertain the same.

At this point, we are writing to a single IndexWriter. Have not tried
comparing using multiple writers and merge them to benchmark the
performance.

Sincerely,
Sithu D Sudarsan

sithu.sudarsan@fda.hhs.gov
sdsudarsan@ualr.edu

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message