lucene-java-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Lucene-java Wiki] Update of "ImproveIndexingSpeed" by MikeMcCandless
Date Sat, 09 Jun 2007 15:54:57 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Lucene-java Wiki" for change notification.

The following page has been changed by MikeMcCandless:
http://wiki.apache.org/lucene-java/ImproveIndexingSpeed

------------------------------------------------------------------------------
  
   Modern hardware is highly concurrent (multi-core CPUs, multi-channel memory architectures,
native command queueing in hard drives, etc.) so using more than one thread to add documents
can give good gains overall.  Even on older machines there is often still concurrency to be
gained between IO and CPU.  Test the number of threads to find the best performance point.
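
As a rough illustration of testing the number of threads, the sketch below times the same batch of documents at several thread counts. It is not taken from the wiki page: the class name ThreadCountProbe, the method timeIndexing, and the Consumer "sink" are hypothetical stand-ins, and the sink is assumed to be where a thread-safe call such as writer.addDocument would go. It uses only the Java standard library (a recent JDK) so that it runs without Lucene on the classpath.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

class ThreadCountProbe {

    /**
     * Feeds every document to the sink using a fixed pool of the given size
     * and returns the elapsed wall-clock time in nanoseconds.
     * The sink must be safe to call from several threads at once.
     */
    static long timeIndexing(List<String> docs, int threads, Consumer<String> sink) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        long start = System.nanoTime();
        for (String doc : docs) {
            pool.execute(() -> sink.accept(doc));
        }
        pool.shutdown(); // accept no new tasks; queued tasks still run
        try {
            if (!pool.awaitTermination(1, TimeUnit.HOURS)) {
                throw new IllegalStateException("indexing did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        List<String> docs = new ArrayList<>();
        for (int i = 0; i < 10_000; i++) {
            docs.add("document body " + i);
        }
        // Hypothetical placeholder: a real run would add the document to the index here.
        AtomicInteger added = new AtomicInteger();
        Consumer<String> sink = doc -> added.incrementAndGet();

        for (int threads : new int[] {1, 2, 4, 8}) {
            long nanos = timeIndexing(docs, threads, sink);
            System.out.printf("%d threads: %.1f ms%n", threads, nanos / 1e6);
        }
    }
}
```

With a trivial sink like this one the timings mostly reflect thread-pool overhead; the comparison only becomes meaningful once the sink does the real analysis and IO work.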
  
-   * Index on separate indices then merge.
+  * '''Index on separate indices then merge.'''
  
+  If you have a very large amount of content to index then you can break your content into
N "silos", index each silo on a separate machine, then use the writer.addIndexesNoOptimize
to merge them all into one final index.
-     If you have a very large amount of content to index then you can
-     break your content into N "silos", index each silo on a separate
-     machine, then use the writer.addIndexesNoOptimize to merge them
-     all into one final index.
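
One way to split content into N silos is to assign each document to a silo by hashing a stable identifier, so that every document lands in exactly one silo. The sketch below is only an assumption about how this could be done: the class SiloPartitioner, the method partition, and the use of string document IDs are hypothetical and do not come from the wiki page. The final merge step with writer.addIndexesNoOptimize is not shown, because it needs Lucene and the per-machine indexes.

```java
import java.util.ArrayList;
import java.util.List;

class SiloPartitioner {

    /** Splits docIds into n silos; a given ID always lands in the same silo. */
    static List<List<String>> partition(List<String> docIds, int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        List<List<String>> silos = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            silos.add(new ArrayList<>());
        }
        for (String id : docIds) {
            // floorMod keeps the bucket non-negative even for negative hash codes.
            silos.get(Math.floorMod(id.hashCode(), n)).add(id);
        }
        return silos;
    }

    public static void main(String[] args) {
        List<String> ids = List.of("doc-1", "doc-2", "doc-3", "doc-4", "doc-5");
        List<List<String>> silos = partition(ids, 3);
        for (int i = 0; i < silos.size(); i++) {
            System.out.println("silo " + i + ": " + silos.get(i));
        }
        // Each silo would then be indexed on its own machine, and the
        // resulting indexes merged into one final index.
    }
}
```

Hashing does not guarantee equally sized silos; if the content is already naturally partitioned (by date or source, for example), splitting along those lines may be simpler.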
  
-   * Use a faster machine, especially a fast IO system.
+  * '''Use a faster machine, especially a faster IO system.'''
  
-   * Run a Java profiler.
+  * '''Run a Java profiler.'''
  
+  If all else fails, profile your application to figure out where the time is going.  I've
had success with a very simple profiler called <a href="http://www.khelekore.org/jmp/">JMP</a>.
 There are many others.  Often you will be pleasantly surprised to find some silly, unexpected
method is taking far too much time.
-     If all else fails, profile your application to figure out where
-     the time is going.  I've had success with a very simple profiler
-     called <a href="http://www.khelekore.org/jmp/">JMP</a>.  There are
-     many others.  Often you will be pleasantly surprised to find some
-     silly, unexpected method is taking far too much time.
  
