lucene-java-user mailing list archives

From Michael McCandless <luc...@mikemccandless.com>
Subject Re: commit frequency guideline?
Date Wed, 30 Nov 2016 14:17:49 GMT
What do you mean by "Lucene complain about too-many uncommitted docs"?
 Lucene does not really care how frequently you commit...

How frequently you commit is really your choice: weigh the risk you
see of power loss or an OS crash against the cost of replaying the
documents indexed since the last commit when power comes back. That
cost is not just CPU/IO work for the computer; users also will not see
the recently indexed documents for a while.
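
To make that trade-off concrete, here is a minimal sketch of a combined
"after X docs or after T milliseconds" policy. CommitPolicy is a
hypothetical helper, not a Lucene class, and the thresholds are whatever
amount of replay work you are willing to accept after a crash.

```java
/**
 * Hypothetical helper (not part of Lucene): decides when a commit is due,
 * based on how many documents are uncommitted and how long ago the last
 * commit happened. The caller supplies the current time explicitly.
 */
final class CommitPolicy {
    private final int maxUncommittedDocs;   // upper bound on docs to replay
    private final long maxIntervalMillis;   // upper bound on time between commits
    private int uncommittedDocs = 0;
    private long lastCommitMillis;

    CommitPolicy(int maxUncommittedDocs, long maxIntervalMillis, long nowMillis) {
        this.maxUncommittedDocs = maxUncommittedDocs;
        this.maxIntervalMillis = maxIntervalMillis;
        this.lastCommitMillis = nowMillis;
    }

    /** Records one indexed document; returns true if a commit is now due. */
    boolean recordDocument(long nowMillis) {
        uncommittedDocs++;
        return shouldCommit(nowMillis);
    }

    /** True when there is something to commit and either threshold is reached. */
    boolean shouldCommit(long nowMillis) {
        if (uncommittedDocs == 0) {
            return false; // nothing would be lost, so skip the commit
        }
        return uncommittedDocs >= maxUncommittedDocs
                || nowMillis - lastCommitMillis >= maxIntervalMillis;
    }

    /** Call right after the real commit has succeeded. */
    void markCommitted(long nowMillis) {
        uncommittedDocs = 0;
        lastCommitMillis = nowMillis;
    }
}
```

With a real IndexWriter, the caller would invoke writer.commit() whenever
recordDocument(...) or a periodic shouldCommit(...) check returns true,
and then call markCommitted(...).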

Pushing durability back into the queue/channel can be a nice option
too, e.g. Kafka, so that your application doesn't need to keep track
of which docs were not yet committed.
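
A rough sketch of that idea, using a plain in-memory list as a stand-in
for the durable queue: the index records the log offset it has consumed
as part of each commit, and after a crash it replays from that offset.
ToyIndex and LogReplayer are hypothetical names for illustration only;
no Kafka or Lucene API is used here. (I believe
IndexWriter.setLiveCommitData can carry such a marker with a Lucene
commit, but check the Javadoc for your version.)

```java
import java.util.ArrayList;
import java.util.List;

/** Toy stand-in for a search index; hypothetical, not a Lucene class. */
final class ToyIndex {
    private final List<String> committed = new ArrayList<>();
    private final List<String> pending = new ArrayList<>();
    private long committedOffset = 0; // next log offset to read after a restart
    private long pendingOffset = 0;   // next log offset, including pending docs

    /** Buffers a document read from the log at the given offset. */
    void add(String doc, long logOffset) {
        pending.add(doc);
        pendingOffset = logOffset + 1;
    }

    /** Makes pending docs durable together with the log position. */
    void commit() {
        committed.addAll(pending);
        pending.clear();
        committedOffset = pendingOffset;
    }

    /** Simulates a crash: everything since the last commit is lost. */
    void crash() {
        pending.clear();
        pendingOffset = committedOffset;
    }

    long committedOffset() { return committedOffset; }

    List<String> committedDocs() { return new ArrayList<>(committed); }
}

final class LogReplayer {
    /** After a restart, re-indexes every log entry from the committed offset on. */
    static void replayAfterRestart(List<String> log, ToyIndex index) {
        for (long off = index.committedOffset(); off < log.size(); off++) {
            index.add(log.get((int) off), off);
        }
    }
}
```

Because the offset is stored with the commit, the application never has
to remember separately which documents were still uncommitted; the log
position is the only bookkeeping.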

Mike McCandless

http://blog.mikemccandless.com


On Wed, Nov 30, 2016 at 8:50 AM, Rob Audenaerde
<rob.audenaerde@gmail.com> wrote:
> Hi all,
>
> Currently we call commit() many times on our index (about 5M docs, with
> some 10,000-100,000 modifications during the day). The commit times
> typically get more expensive when the index grows, up to several seconds,
> so we want to reduce the number of calls.
>
> (Historically, we had Lucene complain about too-many uncommitted docs
> sometimes, so we went with the commit often approach.)
>
> What is a good strategy for calling commit? Fixed frequency? After X docs?
> Combination?
>
> I'm curious what is considered 'industry-standard'. Can you share some of
> your experience?
>
> Thanks!
>
> -Rob


