lucene-general mailing list archives

From Ted Dunning <ted.dunn...@gmail.com>
Subject Re: Performance Optimizations and Expected Benchmark Results
Date Mon, 30 Aug 2010 16:38:19 GMT
Jenny is correct.  Opening and closing the index is expensive.  The reason is
that most updates are memory-only, but closing an index forces writes to disk,
which involves expensive serialization.
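
Roughly, the difference looks like this (a minimal sketch against the
Lucene 3.0-era API; IndexWriter, addDocument() and commit() are the real
calls, but the index path and field names are only illustrative):

import java.io.File;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

public class KeepWriterOpen {
    public static void main(String[] args) throws Exception {
        Directory dir = FSDirectory.open(new File("/tmp/demo-index"));

        // Open the writer once and keep it open for the life of the process.
        // Constructor details vary by Lucene version; this is the 3.0 form.
        IndexWriter writer = new IndexWriter(dir,
                new StandardAnalyzer(Version.LUCENE_30),
                IndexWriter.MaxFieldLength.UNLIMITED);

        for (int i = 0; i < 1000; i++) {
            Document doc = new Document();
            doc.add(new Field("body", "document number " + i,
                    Field.Store.YES, Field.Index.ANALYZED));
            writer.addDocument(doc);  // cheap: mostly an in-memory operation
        }

        // Pay the flush-to-disk cost once for the whole batch,
        // instead of once per document via close()/reopen.
        writer.commit();

        writer.close();               // close only when shutting down
    }
}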

On Mon, Aug 30, 2010 at 8:42 AM, Jenny Brown <skywind@gmail.com> wrote:

> On Mon, Aug 30, 2010 at 2:53 AM, Ron Ratovsky <ronr@correlsense.com>
> wrote:
> > Hi Ted and Jenny,
> > Thanks for both your responses.
> > In regards to Jenny's question - the answer is yes. There's no problem
> > processing the objects in batches. I'd be interested to know why that
> would
> > affect performance.
>
> I'm not 100% confident on this, but in my experience, repeatedly
> opening and closing the index is the slow operation -- adding
> documents to it is not.  I get better performance by having a routine
> that runs every 5 minutes and adds a batch of documents at once,
> rather than trying to add individual items as they come in via an
> irregularly timed stream.  Even if it only ran once a minute, batching
> would still give me better results than adding individual items.
>
> I don't pretend to know why.  :)  It made sense when I developed the
> code but that was a few years ago.  I now only remember what worked,
> not the full explanation of why.
>
>
> Jenny
>
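
For what it's worth, Jenny's batching routine can be sketched roughly
like this (illustrative only: the class name, the queue, and the 5-minute
interval are assumptions about her setup, not her actual code):

import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.IndexWriter;

public class BatchingIndexer {
    private final ConcurrentLinkedQueue<Document> pending =
            new ConcurrentLinkedQueue<Document>();
    private final IndexWriter writer;  // opened once elsewhere and kept open
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    public BatchingIndexer(IndexWriter writer) {
        this.writer = writer;
        // Drain whatever has accumulated every 5 minutes.
        scheduler.scheduleWithFixedDelay(new Runnable() {
            public void run() {
                try {
                    flushPending();
                } catch (Exception e) {
                    e.printStackTrace();
                }
            }
        }, 5, 5, TimeUnit.MINUTES);
    }

    // Producers call this as items arrive on the irregular stream.
    public void submit(Document doc) {
        pending.add(doc);
    }

    // One pass over the queue, then a single commit for the whole batch.
    private void flushPending() throws Exception {
        Document doc;
        boolean added = false;
        while ((doc = pending.poll()) != null) {
            writer.addDocument(doc);
            added = true;
        }
        if (added) {
            writer.commit();
        }
    }
}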
