lucene-solr-user mailing list archives

From Mark Miller <markrmil...@gmail.com>
Subject Re: SolrCloud Admin UI shows node is Down, but state.json says it's active/up
Date Wed, 09 Sep 2015 15:42:53 GMT
Perhaps there is something preventing clean shutdown. Shutdown makes a
best-effort attempt to publish DOWN for all the local cores.

Otherwise, yes, it's a little bit annoying, but a replica's full state is a
combination of its state entry in state.json and whether the live node for
that replica exists or not.
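
To illustrate that rule, here is a minimal Python sketch. It assumes you have already fetched a collection's state.json and the list of children under /live_nodes by some other means (for example with a ZooKeeper client). The function name effective_states and the sample data are hypothetical, and the JSON layout (shards, replicas, node_name, state) is assumed to match what Solr 5.x writes; this is not how Solr itself implements the check.

```python
def effective_states(state_json, live_nodes):
    """Return {replica_name: effective_state} for every replica.

    A replica only counts as its published state if its node_name is
    present in live_nodes; otherwise it is treated as "down", no matter
    what the (possibly stale) state entry says.
    """
    live = set(live_nodes)
    result = {}
    for collection in state_json.values():
        for shard in collection.get("shards", {}).values():
            for name, replica in shard.get("replicas", {}).items():
                if replica.get("node_name") in live:
                    result[name] = replica.get("state", "down")
                else:
                    result[name] = "down"
    return result


# Hypothetical sample: host2 was stopped but its entry still says "active".
sample_state = {
    "collection1": {
        "shards": {
            "shard1": {
                "replicas": {
                    "core_node1": {"node_name": "host1:8983_solr", "state": "active"},
                    "core_node2": {"node_name": "host2:8983_solr", "state": "active"},
                }
            }
        }
    }
}
sample_live_nodes = ["host1:8983_solr"]

for replica_name, state in sorted(effective_states(sample_state, sample_live_nodes).items()):
    print(replica_name, state)
```

With the sample above, core_node2 is reported as down even though its state entry is stale, which is the same combination the admin UI graph and CLUSTERSTATUS are described as applying later in this thread.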

- Mark

On Wed, Sep 9, 2015 at 1:50 AM Arcadius Ahouansou <arcadius@menelic.com>
wrote:

> Thank you Tomás for pointing to the JavaDoc
>
> http://www.solr-start.com/javadoc/solr-lucene/org/apache/solr/common/cloud/Replica.State.html#ACTIVE
>
> The Javadoc is quite clear. So this stale state.json is not an issue after
> all.
>
> However, it's very confusing that when a node goes down, state.json may be
> updated for 1 collection while it remains stale in the other collection.
> Also in our case, the node did not crash as per the JavaDoc... it was a
> normal server stop/shut-down.
> We may need to review our shut-down process and see whether things change.
>
> Thank you very much Erick and Tomás for your valuable help... very
> appreciated.
>
> Arcadius.
>
>
> On 8 September 2015 at 18:28, Erick Erickson <erickerickson@gmail.com>
> wrote:
>
> > bq: You were probably referring to state.json
> >
> > yep, I'm never sure whether people are on the old or new ZK versions.
> >
> > OK, with Tomás' comment, I think it's explained... although confusing.
> >
> > WDYT?
> >
> >
> > On Tue, Sep 8, 2015 at 10:03 AM, Arcadius Ahouansou
> > <arcadius@menelic.com> wrote:
> > > Hello Erick.
> > >
> > > Yes,
> > >
> > > 1> liveNodes has N nodes listed (correctly): Correct, liveNodes is
> > > always right.
> > >
> > > 2> clusterstate.json has N+M nodes listed as "active":
> > > clusterstate.json is always empty as it's no longer being "used" in
> > > 5.3. You were probably referring to state.json, which is in individual
> > > collections. Yes, that one reflects the wrong value, i.e. N+M
> > >
> > > 3> using the collection API to get CLUSTERSTATUS always returns the
> > > correct value N
> > >
> > > 4> The front-end code in cloud.js displays the right colour when
> > > nodes go down because it checks for the live node
> > >
> > > The problem is only with state.json under certain circumstances.
> > >
> > > Thanks.
> > >
> > > On 8 September 2015 at 17:51, Erick Erickson <erickerickson@gmail.com>
> > > wrote:
> > >
> > >> Arcadius:
> > >>
> > >> Hmmm. It may take a while for the cluster state to change, but I'm
> > >> assuming that this state persists for minutes/hours/days.
> > >>
> > > >> So to recap: If you dump the entire ZK node from the root, you have
> > >> 1> liveNodes has N nodes listed (correctly)
> > >> 2> clusterstate.json has N+M nodes listed as "active"
> > >>
> > >> Doesn't sound right to me, but I'll have to let people who are deep
> > >> into that code speculate from here.
> > >>
> > >> Best,
> > >> Erick
> > >>
> > >> On Tue, Sep 8, 2015 at 1:13 AM, Arcadius Ahouansou <
> > arcadius@menelic.com>
> > >> wrote:
> > >> > On Sep 8, 2015 6:25 AM, "Erick Erickson" <erickerickson@gmail.com>
> > >> wrote:
> > >> >>
> > >> >> Perhaps the browser cache? What happens if you, say, use
> > > >> >> ZooKeeper client tools to bring down the cluster state in
> > >> >> question? Or perhaps just refresh the admin UI when showing
> > >> >> the cluster status....
> > >> >>
> > >> >
> > >> > Hello Erick.
> > >> >
> > >> > Thank you very much for answering.
> > > >> > I did use the ZooInspector tool to check the state.json in all 5 zk
> > > >> > nodes and they are all out of date and identical to what I get
> > > >> > through the tree view in the Solr admin UI.
> > >> >
> > > >> > Looking at the source code of cloud.js, which correctly displays
> > > >> > nodes as "gone" in the graph view, it calls the endpoint
> > > >> > /zookeeper?wt=json and relies on the live nodes to mark a node as
> > > >> > down instead of state.json.
> > >> >
> > >> > Thanks.
> > >> >
> > >> >> Shot in the dark,
> > >> >> Erick
> > >> >>
> > >> >> On Mon, Sep 7, 2015 at 6:09 PM, Arcadius Ahouansou <
> > >> arcadius@menelic.com>
> > >> > wrote:
> > >> >> > We are running the latest Solr 5.3.0
> > >> >> >
> > >> >> > Thanks.
> > >>
> > >
> > >
> > >
> > > --
> > > Arcadius Ahouansou
> > > Menelic Ltd | Information is Power
> > > M: 07908761999
> > > W: www.menelic.com
> > > ---
> >
>
>
>
> --
> Arcadius Ahouansou
> Menelic Ltd | Information is Power
> M: 07908761999
> W: www.menelic.com
> ---
>
-- 
- Mark
about.me/markrmiller
