lucene-solr-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: REBALANCELEADERS is not reliable
Date Thu, 20 Dec 2018 15:57:49 GMT
You can go here: https://issues.apache.org/jira, create a signon and
freely create JIRAs. Please attach the patch as well. I hadn't really
thought very carefully about REBALANCELEADERS and the new replica
types, but that does change the use-case.

Best,
Erick

On Thu, Dec 20, 2018 at 3:31 AM Vadim Ivanov
<vadim.ivanov@spb.ntk-intourist.ru> wrote:
>
> Yes! It works!
> I have tested RebalanceLeaders today with the patch provided by Endika Posadas. (http://lucene.472066.n3.nabble.com/Rebalance-Leaders-Leader-node-deleted-when-rebalancing-leaders-td4417040.html)
> And at last it works as expected on my collection with 5 nodes and about 400 shards.
> Original patch was slightly incompatible with 7.6.0
> I hope this patch will help to try this feature with 7.6
> https://drive.google.com/file/d/19z_MPjxItGyghTjXr6zTCVsiSJg1tN20
>
> RebalanceLeaders was not a very useful feature before 7.0 (as all replicas were NRT),
> but the new replica types made it very helpful for keeping big clusters in order...
>
> I wonder why there is no JIRA about this case (or maybe I missed it)?
> Anyone who cares, please help to create a JIRA and improve this feature in the next release.
> --
> Vadim
>
> > -----Original Message-----
> > From: Vadim Ivanov [mailto:vadim.ivanov@spb.ntk-intourist.ru]
> > Sent: Friday, December 07, 2018 6:13 PM
> > To: solr-user@lucene.apache.org
> > Subject: RE: REBALANCELEADERS is not reliable
> >
> > I'm waiting for 7.6 or 7.5.1 and plan to apply the patch from Endika Posadas to it.
> > Then I'll test again and hope it helps.
> > --
> > Vadim
> >
> >
> > > -----Original Message-----
> > > From: Bernd Fehling [mailto:bernd.fehling@uni-bielefeld.de]
> > > Sent: Friday, December 07, 2018 12:01 PM
> > > To: solr-user@lucene.apache.org
> > > Subject: Re: REBALANCELEADERS is not reliable
> > >
> > > Thanks for looking this up.
> > > It could be a hint where to jump into the code.
> > > I wonder why they rejected a jira ticket about this problem?
> > >
> > > Regards, Bernd
> > >
> > > Am 06.12.18 um 16:31 schrieb Vadim Ivanov:
> > > > > In the solr-dev forum I came across this post:
> > > > > http://lucene.472066.n3.nabble.com/Rebalance-Leaders-Leader-node-deleted-when-rebalancing-leaders-td4417040.html
> > > > > Maybe it will shed some light?
> > > >
> > > >
> > > >> -----Original Message-----
> > > >> From: Atita Arora [mailto:atitaarora@gmail.com]
> > > >> Sent: Thursday, November 29, 2018 11:03 PM
> > > >> To: solr-user@lucene.apache.org
> > > >> Subject: Re: REBALANCELEADERS is not reliable
> > > >>
> > > > >> Indeed, I tried that on 7.4 & 7.5 too, and it did not work for me either,
> > > > >> even with the preferredLeader property as recommended in the
> > > > >> documentation.
> > > > >> I handled it with a little hack, but it certainly didn't work as expected.
> > > > >> I can provide more details if there's a ticket.
> > > >>
> > > >> On Thu, Nov 29, 2018 at 8:42 PM Aman Tandon
> > > >> <amantandon.10@gmail.com> wrote:
> > > >>
> > > >>> ++ correction
> > > >>>
> > > > >>> On Fri, Nov 30, 2018, 01:10 Aman Tandon <amantandon.10@gmail.com> wrote:
> > > >>>
> > > > >>>> For me today, I deleted the leader replica of one shard in a two-shard
> > > > >>>> collection. Then the other replicas of that shard weren't getting elected
> > > > >>>> as leader.
> > > > >>>>
> > > > >>>> After waiting for a long time, I tried setting preferredLeader with
> > > > >>>> ADDREPLICAPROP on one of the replicas, then tried FORCELEADER, but no luck.
> > > > >>>> Then I also tried rebalancing, but it did not help. Finally I had to
> > > > >>>> recreate the whole collection.
> > > > >>>>
> > > > >>>> Not sure what the issue was, but both FORCELEADER and rebalancing didn't
> > > > >>>> work when there was no leader, even though the preferredLeader property
> > > > >>>> was set.
> > > >>>>
> > > > >>>> On Wed, Nov 28, 2018, 12:54 Bernd Fehling <bernd.fehling@uni-bielefeld.de> wrote:
> > > >>>>
> > > >>>>> Hi Vadim,
> > > >>>>>
> > > >>>>> thanks for confirming.
> > > > >>>>> So it seems to be a general problem with Solr 6.x, 7.x and might
> > > > >>>>> still be there in the most recent versions.
> > > > >>>>>
> > > > >>>>> But where to start debugging this problem? Is something not
> > > > >>>>> correctly stored in ZooKeeper, or is the Overseer the problem?
> > > > >>>>>
> > > > >>>>> I was also reading something about a "leader queue" where possible
> > > > >>>>> leaders have to be requeued or something similar.
> > > > >>>>>
> > > > >>>>> Maybe I should try to get a situation where a "locked" core
> > > > >>>>> is on the overseer and then connect the debugger to it and step
> > > > >>>>> through it.
> > > > >>>>> Peeking and poking around, like old Commodore 64 days :-)
> > > >>>>>
> > > >>>>> Regards, Bernd
> > > >>>>>
> > > >>>>>
> > > >>>>> Am 27.11.18 um 15:47 schrieb Vadim Ivanov:
> > > >>>>>> Hi, Bernd
> > > >>>>>> I have tried REBALANCELEADERS with Solr 6.3 and 7.5
> > > > >>>>>> I had very similar results and the impression that it's not reliable :(
> > > >>>>>> --
> > > >>>>>> Br, Vadim
> > > >>>>>>
> > > >>>>>>> -----Original Message-----
> > > >>>>>>> From: Bernd Fehling [mailto:bernd.fehling@uni-bielefeld.de]
> > > >>>>>>> Sent: Tuesday, November 27, 2018 5:13 PM
> > > >>>>>>> To: solr-user@lucene.apache.org
> > > >>>>>>> Subject: REBALANCELEADERS is not reliable
> > > >>>>>>>
> > > >>>>>>> Hi list,
> > > >>>>>>>
> > > > >>>>>>> unfortunately REBALANCELEADERS is not reliable and the leader
> > > > >>>>>>> election has unpredictable results with SolrCloud 6.6.5 and
> > > > >>>>>>> Zookeeper 3.4.10.
> > > > >>>>>>> Seen with 5 shards / 3 replicas.
> > > >>>>>>>
> > > > >>>>>>> - CLUSTERSTATUS reports all replicas (core_nodes) as state=active.
> > > > >>>>>>> - setting the property preferredLeader on other replicas with ADDREPLICAPROP
> > > > >>>>>>> - calling REBALANCELEADERS
> > > > >>>>>>> - some leaders have changed, some not.
> > > >>>>>>>
> > > > >>>>>>> I then tried:
> > > > >>>>>>> - removing all preferredLeader properties from replicas which succeeded.
> > > > >>>>>>> - trying REBALANCELEADERS again for the rest. No success.
> > > > >>>>>>> - shutting down nodes to force the leader to a specific replica left running.
> > > > >>>>>>>    No success.
> > > > >>>>>>> - calling REBALANCELEADERS responds that the replica is inactive!!!
> > > > >>>>>>> - calling CLUSTERSTATUS reports that the replica is active!!!
> > > >>>>>>>
> > > > >>>>>>> Also, the replica which doesn't want to become leader is not in the list
> > > > >>>>>>> of collections->[collection_name]->leader_elect->shard1..x->election
> > > > >>>>>>>
> > > > >>>>>>> Where is CLUSTERSTATUS getting its state info from?
> > > > >>>>>>>
> > > > >>>>>>> Has anyone else had problems with REBALANCELEADERS?
> > > > >>>>>>>
> > > > >>>>>>> I noticed that the Reference Guide writes "preferredLeader" (with
> > > > >>>>>>> capital "L") but the Java code has "preferredleader".
> > > >>>>>>>
> > > >>>>>>> Regards, Bernd
> > > >>>>>>
> > > >>>>>
> > > >>>>
> > > >>>
> > > >
>
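
The sequence described at the start of the thread (set preferredLeader with ADDREPLICAPROP, call REBALANCELEADERS, then inspect CLUSTERSTATUS) might be sketched as Collections API requests roughly as follows. This is only a sketch under assumptions: the base URL, collection name, and shard/replica names are hypothetical, and the code only builds the request URLs without sending them, so it says nothing about whether the rebalance actually succeeds on a given Solr version.

```python
from urllib.parse import urlencode

# Assumption: base URL of any node in the SolrCloud cluster (hypothetical).
SOLR_BASE = "http://localhost:8983/solr"


def collections_api_url(action, params):
    """Build a Collections API URL for the given action and parameters."""
    query = {"action": action, "wt": "json"}
    query.update(params)
    return f"{SOLR_BASE}/admin/collections?{urlencode(query)}"


def rebalance_urls(collection, preferred):
    """Return the request URLs, in order, for the sequence in the thread.

    `preferred` maps a shard name to the replica (core_node name) that
    should become its leader. Both names are illustrative placeholders.
    """
    urls = []
    # 1. Mark one replica per shard as the preferred leader.
    for shard, replica in sorted(preferred.items()):
        urls.append(collections_api_url("ADDREPLICAPROP", {
            "collection": collection,
            "shard": shard,
            "replica": replica,
            "property": "preferredLeader",
            "property.value": "true",
        }))
    # 2. Ask Solr to move leadership to the preferred replicas.
    urls.append(collections_api_url("REBALANCELEADERS",
                                    {"collection": collection}))
    # 3. Fetch the cluster state to check which replica is leader now.
    urls.append(collections_api_url("CLUSTERSTATUS",
                                    {"collection": collection}))
    return urls


if __name__ == "__main__":
    for url in rebalance_urls("mycollection", {"shard1": "core_node2"}):
        print(url)
```

Comparing the leader reported by CLUSTERSTATUS for each shard against the replica marked preferredLeader would show which shards, if any, were not rebalanced.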
