Subject: Re: Solr cloud clusterstate.json update query ?
From: Erick Erickson
To: solr-user@lucene.apache.org
Date: Wed, 6 May 2015 07:54:26 -0700

Gopal:

Did you see my previous answer?

Best,
Erick

On Tue, May 5, 2015 at 9:42 PM, Gopal Jee wrote:
> About <2>: the entries under live_nodes in ZooKeeper are ephemeral nodes
> (see ZooKeeper ephemeral nodes). So, once the connection from the Solr
> zkClient to ZooKeeper is lost, these nodes disappear automatically. AFAIK,
> clusterstate.json is updated by the Overseer based on messages published
> to a queue in ZooKeeper by the Solr zkClients. In case a Solr node dies
> ungracefully, I am not sure how this event gets reflected in
> clusterstate.json.
> *Can someone shed some light* on ungraceful Solr shutdown and the
> consequent status update in clusterstate? I guess there would be some way,
> because all nodes in a cluster decide cluster state based on the watched
> clusterstate.json node. They will not be watching live_nodes to update
> their state.
>
> Gopal
>
> On Wed, May 6, 2015 at 6:33 AM, Erick Erickson wrote:
>> About <1>: this shouldn't be happening, so I wouldn't concentrate
>> there first. The most common reason is that you have a short ZooKeeper
>> timeout and the replicas go into a stop-the-world garbage collection
>> that exceeds the timeout. So the first thing to do is to see if that's
>> happening. Here are a couple of good places to start:
>>
>> http://lucidworks.com/blog/garbage-collection-bootcamp-1-0/
>> http://wiki.apache.org/solr/ShawnHeisey#GC_Tuning_for_Solr
>>
>> <2> A partial answer is that ZK does a keep-alive type thing, and if the
>> Solr nodes it knows about don't reply, it marks the nodes as down.
>>
>> Best,
>> Erick
>>
>> On Tue, May 5, 2015 at 5:42 AM, Sai Sreenivas K wrote:
>>> Could you clarify the following questions:
>>> 1. Is there a way to avoid all the nodes simultaneously going into
>>> recovery when bulk indexing happens? Is there an API to disable
>>> replication on one node for a while?
>>>
>>> 2. We recently changed the host name on nodes in solr.xml, but the old
>>> host entries still exist in clusterstate.json, marked as active, even
>>> though live_nodes has the correct information. Who updates
>>> clusterstate.json if a node goes down in an ungraceful fashion without
>>> notifying its down state?
>>>
>>> Thanks,
>>> Sai Sreenivas K
>
> --
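
To make the ephemeral-node behaviour Gopal describes concrete, here is a
minimal sketch against the plain ZooKeeper Java client. The connect string
and the /demo_live_nodes path are made up for illustration (this is not
Solr's real /live_nodes layout or client code); it only shows why such
entries vanish on their own when a session ends, and how a watcher sees it:

import java.util.List;
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

public class EphemeralLiveNodeDemo {
    public static void main(String[] args) throws Exception {
        // Two independent sessions: one "node" announcing itself, one observer
        // (conceptually, the Overseer or any other cluster member).
        ZooKeeper node = new ZooKeeper("localhost:2181", 15000, e -> {});
        ZooKeeper observer = new ZooKeeper("localhost:2181", 15000, e -> {});

        // Hypothetical parent path for the demo.
        if (node.exists("/demo_live_nodes", false) == null) {
            node.create("/demo_live_nodes", new byte[0],
                    ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
        }

        // EPHEMERAL: ZooKeeper deletes this znode automatically when the owning
        // session ends, which is why live_nodes entries disappear without
        // anyone explicitly removing them.
        node.create("/demo_live_nodes/node1", new byte[0],
                ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);

        // The observer watches the parent and is notified when children change.
        List<String> before = observer.getChildren("/demo_live_nodes",
                (WatchedEvent e) -> System.out.println("watch fired: " + e));
        System.out.println("children before: " + before);

        // Closing (or losing) the owning session removes the ephemeral node and
        // fires the watch -- no explicit delete anywhere.
        node.close();
        Thread.sleep(2000);
        System.out.println("children after: "
                + observer.getChildren("/demo_live_nodes", false));
        observer.close();
    }
}

In the ungraceful-shutdown case the same thing happens, only after the
session timeout expires rather than immediately on close.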
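And for the timeout/GC interaction Erick points at: the thing to check is
whether full-GC pauses ever exceed the ZooKeeper session timeout, because a
pause that long stops the client's heartbeats, the server expires the
session, and the ephemeral live_nodes entry is dropped. Below is a small,
hypothetical sketch (again assuming a ZooKeeper at localhost:2181) of how a
client observes those session state changes and the timeout actually in
force:

import java.util.concurrent.CountDownLatch;
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;

public class SessionTimeoutCheck {
    public static void main(String[] args) throws Exception {
        CountDownLatch connected = new CountDownLatch(1);

        // Log every session state change. In the GC scenario above you would
        // see Disconnected and then Expired once a pause outlives the timeout;
        // Expired is the point at which this session's ephemeral nodes go away.
        Watcher stateLogger = (WatchedEvent e) -> {
            System.out.println("session state: " + e.getState());
            if (e.getState() == Watcher.Event.KeeperState.SyncConnected) {
                connected.countDown();
            }
        };

        // 15000 ms requested; the server may negotiate a different value.
        ZooKeeper zk = new ZooKeeper("localhost:2181", 15000, stateLogger);
        connected.await();

        // This is the number a worst-case GC pause has to stay under.
        System.out.println("negotiated session timeout: "
                + zk.getSessionTimeout() + " ms");

        zk.close();
    }
}

On the Solr side the requested value typically comes from zkClientTimeout in
solr.xml (settable via the zkClientTimeout system property), and the GC side
is easiest to confirm by turning on GC logging as described in the two links
quoted above.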