lucene-java-user mailing list archives

From Harald Kirsch <Harald.Kir...@raytion.com>
Subject Directory flushing / commit / openIfChanged
Date Mon, 06 Aug 2012 11:22:53 GMT
Hi,

in my application I have to write tons of small documents to the index, 
but with a twist. Many of the documents are actually aggregations of 
pieces of information that appear in a data stream, usually close 
together, but nevertheless merged with information for other documents.

When information a1 for my document A arrives, I create my A-object, 
store it with index.addDocument() and forget about it. Later, when a2 
arrives, I fetch A from the index, delete it from the index, update it, 
and store its updated version. To fetch it from the index, I use a 
reader retrieved with IndexReader.openIfChanged(). So for one piece of 
information I have roughly the following sequence:

   get searcher via IndexReader.openIfChanged()
   find previously stored document, if any
   if document already available {
     update document object
     index.deleteDocuments(new Term(IDFIELD, id))
   } else {
     create document object
   }
   index.addDocument()
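
To make the per-piece cycle above concrete, here is a minimal sketch in plain Java. It is only an illustration under stated assumptions: an in-memory map stands in for the index, and `ToyIndex`, `Aggregator`, `onPiece`, and the two counters are hypothetical names, not Lucene classes or methods. It shows the shape of the find / delete / add sequence, not how Lucene performs it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for the index: maps a document id to its pieces. NOT Lucene API. */
class ToyIndex {
    private final Map<String, List<String>> docs = new HashMap<>();
    int adds = 0;     // number of add() calls, to make the write cost visible
    int deletes = 0;  // number of delete() calls that actually removed a document

    /** Returns a copy of the stored document, or null if there is none. */
    List<String> find(String id) {
        List<String> hit = docs.get(id);
        return hit == null ? null : new ArrayList<>(hit);
    }

    /** Plays the role of deleting by id term in the pseudocode. */
    void delete(String id) {
        if (docs.remove(id) != null) {
            deletes++;
        }
    }

    /** Plays the role of index.addDocument() in the pseudocode. */
    void add(String id, List<String> pieces) {
        docs.put(id, new ArrayList<>(pieces));
        adds++;
    }
}

/** Applies the sequence from the pseudocode once per incoming piece of information. */
class Aggregator {
    private final ToyIndex index;

    Aggregator(ToyIndex index) {
        this.index = index;
    }

    void onPiece(String id, String piece) {
        List<String> doc = index.find(id);   // find previously stored document, if any
        if (doc != null) {
            doc.add(piece);                  // update document object
            index.delete(id);                // remove the stale version
        } else {
            doc = new ArrayList<>();         // create document object
            doc.add(piece);
        }
        index.add(id, doc);                  // store the new or updated version
    }
}
```

Feeding a1, b1, a2 through `onPiece` leaves A holding both of its pieces after three adds and one delete, which is the write pattern described above: every piece costs one lookup and one add, and every piece after the first for a document costs an extra delete.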


The overall speed is not too bad, but I wonder if more is possible. I 
changed RAMBufferSizeMB from the default 16 to 200 but saw no 
improvement in speed.

I would think that keeping documents in RAM for some time, so that many 
updates happen in RAM rather than being written to disk, would improve 
the overall running time.

Any hints on how to configure and use Lucene to improve the speed without 
layering my own caching on top of it?

Harald.
