lucene-java-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: Presence of uncommitted changes
Date Fri, 17 Jan 2014 13:50:14 GMT
You might want to look at the soft/hard commit options for balancing
data integrity vs. latency.
Here's a blog post on this topic at the Solr level, but all the Solr
stuff is ultimately implemented at the Lucene level, so....

http://searchhub.org/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/

Although this is written with SolrCloud in mind, I don't _think_
there's any problem with
doing this on a regular Lucene index....
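
To make the soft/hard split concrete, here is a minimal sketch in plain
Java with no Lucene dependency. It is a toy in-memory model, not Lucene or
Solr code: the class CommitSketch and all of its methods are hypothetical
names invented for illustration. Roughly, softCommit() stands in for
opening a near-real-time reader from the writer (cheap, controls
visibility) and hardCommit() stands in for a real commit (expensive,
controls what survives a crash).

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of soft vs. hard commits. Hypothetical; not the Lucene API. */
class CommitSketch {
    private final List<String> docs = new ArrayList<>();
    private int visibleCount = 0; // docs that searches can see
    private int durableCount = 0; // docs that would survive a crash

    /** Buffer a document; it is neither visible nor durable yet. */
    void add(String doc) {
        docs.add(doc);
    }

    /** Soft commit: cheap, makes everything added so far searchable. */
    void softCommit() {
        visibleCount = docs.size();
    }

    /** True if something was added since the last hard commit. */
    boolean hasUncommittedChanges() {
        return durableCount < docs.size();
    }

    /** Hard commit: expensive, so skip it when nothing has changed. */
    boolean hardCommit() {
        if (!hasUncommittedChanges()) {
            return false;
        }
        durableCount = docs.size();
        return true;
    }

    /** Return a copy of the documents that are currently visible. */
    List<String> search() {
        return new ArrayList<>(docs.subList(0, visibleCount));
    }

    /** Simulate a crash: everything not hard-committed is lost. */
    void crash() {
        docs.subList(durableCount, docs.size()).clear();
        visibleCount = Math.min(visibleCount, durableCount);
    }

    public static void main(String[] args) {
        CommitSketch index = new CommitSketch();
        index.add("doc1");
        index.softCommit();   // visible right away
        index.hardCommit();   // durable from here on
        index.add("doc2");
        index.softCommit();   // visible, but not yet durable
        System.out.println("before crash: " + index.search());
        index.crash();
        System.out.println("after crash:  " + index.search());
    }
}
```

The point of the model is that visibility and durability are tracked
separately, so an application can soft commit after every update and hard
commit on a much slower schedule, accepting that a crash loses whatever
arrived since the last hard commit.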

Best,
Erick

On Fri, Jan 17, 2014 at 8:34 AM, Michael McCandless
<lucene@mikemccandless.com> wrote:
> On Fri, Jan 17, 2014 at 7:42 AM, Mindaugas Žakšauskas <mindas@gmail.com> wrote:
>> On Fri, Jan 17, 2014 at 12:13 PM, Michael McCandless
>>> Backing up, what is your app doing, that it so strongly relies on
>>> knowing whether commit() would do anything?  Usually, commit is
>>> something you call rarely, for "safety" purposes to ensure if the
>>> world comes crashing down, you'll have a known state in the index on
>>> restart.
>>
>> We use a quite conservative commit policy - we commit almost every
>> time a new document is added to the index (or updated/deleted) - hence
>> the need to know if commit() is necessary.
>>
>> This might sound sub-optimal, but I think it is justifiable because in
>> our application the incoming data stream is not really intense: we
>> normally get just a handful of documents added in a minute. The
>> ability to see those newly added (updated, deleted) documents
>> instantly is far more important.
>
> Seeing newly added documents instantly (in search) is what
> near-real-time readers are for.
>
> Opening an NRT reader from a writer is far faster and less costly than
> doing a commit + reopen.
>
>> Committing often also gives extra security: if the system crashes,
>> we are pretty sure we haven't lost anything, which matters because
>> rebuilding the index can take days. We could, of course, reindex just
>> the missing documents, but finding out exactly what is missing is not
>> trivial.
>
> OK.  Committing should only be used for this purpose (ensuring the
> index is in a known state if the world comes crashing down).
>
> Still, committing after every document is rather insane: performance
> will be awful.  But since your app seems to be very low traffic, maybe
> it's OK in your case ...
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>


