lucene-java-user mailing list archives

From Erick Erickson
Subject Re: Presence of uncommitted changes
Date Fri, 17 Jan 2014 13:50:14 GMT
You might want to look at the soft/hard commit options for ensuring
data integrity vs. latency.
Here's a blog on this topic at the Solr level, but all the Solr stuff
is eventually realized at the Lucene level, so....

Although this is written with SolrCloud in mind, I don't _think_
there's any problem with doing this on a regular Lucene index....
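
To make the soft/hard split concrete, here is a minimal sketch of one
possible policy: refresh (cheap, makes changes searchable) after every
change, but commit (expensive, makes changes durable) at most once per
interval. It deliberately does not call Lucene. IndexOps, CommitPolicy
and all method names are hypothetical stand-ins; in a real application
refresh() would presumably reopen a near-real-time reader from the
writer and commit() would call IndexWriter.commit().

```java
/** Hypothetical stand-in for the two index operations discussed in this thread. */
interface IndexOps {
    /** Cheap: make recent changes searchable (in Lucene, an NRT reader reopen). */
    void refresh();

    /** Expensive: make changes durable (in Lucene, IndexWriter.commit()). */
    void commit();
}

/** Refreshes after every change, but commits at most once per interval. */
final class CommitPolicy {
    private final IndexOps ops;
    private final long minCommitIntervalMillis;
    private long lastCommitMillis;
    private boolean dirty; // true while there are changes not yet committed

    CommitPolicy(IndexOps ops, long minCommitIntervalMillis, long nowMillis) {
        this.ops = ops;
        this.minCommitIntervalMillis = minCommitIntervalMillis;
        this.lastCommitMillis = nowMillis;
    }

    /** Call after each add, update, or delete. */
    void onChange(long nowMillis) {
        dirty = true;
        ops.refresh();          // visibility: always, it is cheap
        maybeCommit(nowMillis); // durability: only if the interval has elapsed
    }

    /** Call from onChange and, ideally, from a periodic timer as well. */
    void maybeCommit(long nowMillis) {
        if (dirty && nowMillis - lastCommitMillis >= minCommitIntervalMillis) {
            ops.commit();
            dirty = false;
            lastCommitMillis = nowMillis;
        }
    }

    /** Answers the question in the subject line without asking the index. */
    boolean hasUncommittedChanges() {
        return dirty;
    }
}
```

The clock is passed in as a parameter only to keep the sketch easy to
test; System.currentTimeMillis() would do in practice. With an interval
of, say, 15 seconds, a crash loses at most the last 15 seconds of
changes, while searches still see every change immediately.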


On Fri, Jan 17, 2014 at 8:34 AM, Michael McCandless wrote:
> On Fri, Jan 17, 2014 at 7:42 AM, Mindaugas Žakšauskas wrote:
>> On Fri, Jan 17, 2014 at 12:13 PM, Michael McCandless wrote:
>>> Backing up, what is your app doing, that it so strongly relies on
>>> knowing whether commit() would do anything?  Usually, commit is
>>> something you call rarely, for "safety" purposes to ensure if the
>>> world comes crashing down, you'll have a known state in the index on
>>> restart.
>> We use quite a conservative commit policy: we commit almost every
>> time a document is added to the index (or updated/deleted), hence
>> the need to know if commit() is necessary.
>> This might sound sub-optimal, but I think it is justifiable because in
>> our application the incoming data stream is not really intense: we
>> normally get just a handful of documents added in a minute. The
>> ability to see those newly added (updated, deleted) documents
>> instantly is far more important.
> Seeing newly added documents instantly (in search) is what
> near-real-time readers are for.
> Opening an NRT reader from a writer is far faster and less costly than
> doing a commit + reopen.
>> Committing often also gives extra security: if the system crashes,
>> we can be fairly sure we haven't lost anything, which matters because
>> rebuilding the index can take days. We could, of course, reindex just
>> the missing documents, but finding out exactly what is missing is not
>> trivial.
> OK.  Committing should only be used for this purpose (ensuring the
> index is in a known state if the world comes crashing down).
> Still, committing after every document is rather insane: performance
> will be awful.  But since your app seems to be very low traffic, maybe
> it's OK in your case ...
> Mike McCandless