lucene-java-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: Is housekeeping of Lucene indexes block index update but allow search ?
Date Mon, 04 Aug 2014 16:59:21 GMT
Right.
1> A merge will occasionally require 2x the index's disk space (3x with the
compound file format). Merging is indeed done in the background; it is NOT a
blocking operation.
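
To make the disk-space point concrete, here is a rough sketch of the arithmetic. It assumes the 2x and 3x figures are worst-case multipliers on the current index size; the class name, method names, and the 1 TiB index size are hypothetical, chosen only for illustration.

```java
public class MergeHeadroom {
    // Assumed worst-case multipliers, taken from the rule of thumb above.
    static final long FACTOR_PLAIN = 2L;     // regular (non-compound) segments
    static final long FACTOR_COMPOUND = 3L;  // compound file format

    /** Estimated peak total disk usage while a large merge is running. */
    static long peakBytes(long indexBytes, boolean compoundFile) {
        long factor = compoundFile ? FACTOR_COMPOUND : FACTOR_PLAIN;
        // multiplyExact throws ArithmeticException instead of silently overflowing.
        return Math.multiplyExact(indexBytes, factor);
    }

    /** Extra free space to keep available on top of the index itself. */
    static long headroomBytes(long indexBytes, boolean compoundFile) {
        return peakBytes(indexBytes, compoundFile) - indexBytes;
    }

    public static void main(String[] args) {
        long gib = 1L << 30;
        long indexBytes = 1L << 40; // hypothetical 1 TiB index
        System.out.println("headroom, plain:    "
                + headroomBytes(indexBytes, false) / gib + " GiB");
        System.out.println("headroom, compound: "
                + headroomBytes(indexBytes, true) / gib + " GiB");
    }
}
```

The practical takeaway is to provision the volume for the peak, not for the steady-state index size, since a merge that runs out of disk will fail.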

2> n/a. It shouldn't block at all.
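
For reference, a minimal configuration sketch against the Lucene 4.8 API (it needs lucene-core and lucene-analyzers-common on the classpath). TieredMergePolicy and ConcurrentMergeScheduler are already the defaults in 4.x and are set explicitly here only to make them visible. The class name, the index path argument, and the "id" field are assumptions made for illustration.

```java
import java.io.File;
import java.io.IOException;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.index.ConcurrentMergeScheduler;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.index.TieredMergePolicy;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

public class BackgroundMergeSketch {
    public static void main(String[] args) throws IOException {
        // Hypothetical index location, passed as the first argument.
        Directory dir = FSDirectory.open(new File(args[0]));

        IndexWriterConfig iwc = new IndexWriterConfig(
                Version.LUCENE_48, new StandardAnalyzer(Version.LUCENE_48));
        iwc.setMergePolicy(new TieredMergePolicy());           // selects which segments to merge
        iwc.setMergeScheduler(new ConcurrentMergeScheduler()); // runs merges on background threads

        IndexWriter writer = new IndexWriter(dir, iwc);
        try {
            Document doc = new Document();
            doc.add(new StringField("id", "42", Field.Store.YES));
            // Marks any earlier document with this id as deleted, then adds the new one.
            writer.updateDocument(new Term("id", "42"), doc);

            // Near-real-time reader: searching works while the writer stays open.
            DirectoryReader reader = DirectoryReader.open(writer, true);
            try {
                IndexSearcher searcher = new IndexSearcher(reader);
                int hits = searcher.search(
                        new TermQuery(new Term("id", "42")), 1).totalHits;
                System.out.println("hits: " + hits);
            } finally {
                reader.close();
            }
        } finally {
            writer.close(); // in 4.x, close() waits for running merges to finish
        }
    }
}
```

With this setup there is no need to schedule forceMerge(int) during quiet hours just to avoid blocking; the scheduler merges segments in the background as indexing proceeds.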

Here's a cool video by Mike McCandless on the merging process, plus some
explanations:

http://blog.mikemccandless.com/2011/02/visualizing-lucenes-segment-merges.html

Best,
Erick




On Mon, Aug 4, 2014 at 8:45 AM, Kumaran R <kums.134@gmail.com> wrote:

> Hi Gaurav
>
> 1. When you open an index for writing, a write lock is held until you
> close it, which prevents further writers. Searches are not locked out.
> During a merge the index needs about 3X (not sure, maybe 2X?) the
> storage space; I believe that is the reason searches are not blocked.
> (Other experts can clarify this further.)
>
> 2. Merging is taken care of by Lucene's default values (merge factor 2).
> If you need more control over the merge policy, please read up on
> merging by size or by number of segments; there are several merge policies.
>
>
> Hope this will help you a little bit.
>
> --
> Kumaran R
> Sent from Phone
>
> > On 04-Aug-2014, at 8:04 pm, Gaurav gupta <gupta.gaurav0125@gmail.com> wrote:
> >
> > Hi,
> >
> > We are planning to use Lucene 4.8.1 over Oracle (1 to 2 TB of data) and
> > are seeking information on how Lucene conducts housekeeping or
> > maintenance of indexes over a period of time. *Is it a blocking operation
> > for writes and searches, or does it not block anything while merging is
> > going on?*
> >
> > I found: *"Since Lucene adds the updated document to the index and
> > marks all previous versions as deleted. So to get rid of deleted
> > documents Lucene needs to do some housekeeping over a period of time.
> > Under the hood is that from time to time segments are merged into
> > (usually) bigger segments using configurable MergePolicy
> > <http://lucene.apache.org/java/3_4_0/api/core/org/apache/lucene/index/MergePolicy>
> > (TieredMergePolicy)."*
> >
> > 1- Is it a blocking operation for both writes and searches, or does it
> > not block anything while merging is going on?
> >
> > 2- What is the best practice to avoid any blocking on production servers?
> > Not sure how Solr or Elasticsearch handles it.
> > Should we control the merging by calling *forceMerge(int) at low-traffic
> > times* to avoid any unpredictable blocking operation? Is that recommended,
> > or does Lucene merge intelligently without blocking anything (updates and
> > searches), or are there ways to reduce the blocking time to a very small
> > duration (1-2 minutes) using some API, daemon thread, etc.?
> >
> > Looking for your professional guidance on it.
> >
> > Regards
> > Gaurav
>
