lucene-dev mailing list archives

From robert engels <>
Subject code improvement / easier optimization
Date Fri, 02 Nov 2007 18:40:08 GMT
The Lucene 2.2 code for managing buffers is somewhat "ugly": the  
buffer size parameter has to be passed around.

I changed this in my branch to use the BufferSizes class below.

I changed the BufferedIndexInput/Output classes like this:

class BufferedIndexInput {
	private int bufferSize = BufferSizes.getReadBufferSize();
	...
}

class BufferedIndexOutput {
	private int bufferSize = BufferSizes.getWriteBufferSize();
	...
}

then in IndexWriter I surround the code that creates the  
SegmentReaders with:

BufferSizes.useMergeBuffers();
try {
      ... create segment readers ...
} finally {
      BufferSizes.useNormalBuffers();
}

I think this is much cleaner. It also allows for other optimizations,  
for example:

- the query engine detects a phrase query, so it increases the buffers  
  prior to reading the terms
- a query result has a lot of matches, so it increases the buffer size  
  when reading the documents

Seems a lot easier to manage. It also allows playing with various  
buffer sizes very easily.
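
The per-query ideas above could be sketched as a scoped override that restores the previous size when the work is done. This is only an illustration under stated assumptions: ScopedBufferSize, withReadBufferSize and the 8192 value are hypothetical names and numbers, not part of Lucene or of the BufferSizes class in this message, and the code uses Java 8 features that Lucene 2.2 itself could not rely on.

```java
import java.util.function.Supplier;

// Hypothetical generalization of the thread-local idea; not a Lucene class.
class ScopedBufferSize {
    // Default read buffer size, taken from the 1024 used in the message.
    static final int DEFAULT_READ_BUFFER_SIZE = 1024;

    // Each thread starts with the default and can override it for itself only.
    private static final ThreadLocal<Integer> readBufferSize =
            ThreadLocal.withInitial(() -> DEFAULT_READ_BUFFER_SIZE);

    static int getReadBufferSize() {
        return readBufferSize.get();
    }

    /** Runs body with the given read buffer size, then restores the previous value. */
    static <T> T withReadBufferSize(int size, Supplier<T> body) {
        int previous = readBufferSize.get();
        readBufferSize.set(size);
        try {
            return body.get();
        } finally {
            // Restoring the previous value (not the default) keeps nested scopes correct.
            readBufferSize.set(previous);
        }
    }

    public static void main(String[] args) {
        // Assumed scenario: a phrase query was detected, so reads inside use 8192 bytes.
        int duringQuery = withReadBufferSize(8192, ScopedBufferSize::getReadBufferSize);
        System.out.println(duringQuery);         // prints 8192
        System.out.println(getReadBufferSize()); // prints 1024
    }
}
```

Restoring the previous value instead of a fixed default means a caller that already enlarged its buffers is not silently reset by an inner scope.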

I have been able to get the optimize time down from 3.5 minutes to  
1.5 minutes on the exact same index (using all of the recent  
enhancements - much of the improvement is related to the larger  
buffer sizes used in Lucene 2.2).


public class BufferSizes {
     private static ThreadLocal useMergeBuffers = new ThreadLocal(){};

     public static int getReadBufferSize() {
	return (Boolean.TRUE.equals(useMergeBuffers.get())) ? 16384*2 : 1024;
     }

     public static int getWriteBufferSize() {
	return 16384*2;
     }

     /**
      * cause the current thread to use buffers sized for segment merging.
      * always use try/finally to reset the value
      */
     public static void useMergeBuffers() {
	useMergeBuffers.set(Boolean.TRUE);
     }

     /**
      * cause the current thread to use buffers sized for normal index operations
      */
     public static void useNormalBuffers() {
	useMergeBuffers.set(Boolean.FALSE);
     }
}
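
Because the flag lives in a ThreadLocal, switching one thread to merge buffers should not change what other threads see. The following is a minimal, self-contained sketch of that behaviour. ThreadLocalBufferDemo, its nested Sizes class and sizeSeenByOtherThread are illustrative names that restate the idea in Java 8 syntax; they are not Lucene code, and the method bodies are an assumption about how the flag is set and cleared.

```java
// Illustrative demo only; Sizes restates the BufferSizes idea and is not a Lucene class.
class ThreadLocalBufferDemo {

    static final class Sizes {
        // Per-thread flag: unset (null) or FALSE means normal buffers.
        private static final ThreadLocal<Boolean> mergeMode = new ThreadLocal<>();

        static int getReadBufferSize() {
            return Boolean.TRUE.equals(mergeMode.get()) ? 16384 * 2 : 1024;
        }

        static void useMergeBuffers() {
            mergeMode.set(Boolean.TRUE);
        }

        static void useNormalBuffers() {
            mergeMode.set(Boolean.FALSE);
        }
    }

    /** Returns the read buffer size observed by a freshly started thread. */
    static int sizeSeenByOtherThread() {
        final int[] seen = new int[1];
        Thread t = new Thread(() -> seen[0] = Sizes.getReadBufferSize());
        t.start();
        try {
            t.join(); // join makes the write to seen[0] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        Sizes.useMergeBuffers();
        try {
            System.out.println(Sizes.getReadBufferSize()); // prints 32768
            System.out.println(sizeSeenByOtherThread());   // prints 1024
        } finally {
            Sizes.useNormalBuffers();
        }
        System.out.println(Sizes.getReadBufferSize());     // prints 1024
    }
}
```

One consequence of this design is that the setting does not carry over to helper threads: any work handed to another thread during a merge would read with the normal buffer size unless that thread sets the flag itself.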
