zookeeper-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Andor Molnar <an...@cloudera.com.INVALID>
Subject Re: Discover LEADER from JMX
Date Thu, 21 Jun 2018 15:13:46 GMT
Okay, let's put it this way: what kind of test would you like to add here?

*Unit test*
This is a requirement for the patch to be committed. You don't need to fire
up a full blown ensemble to test the new functionality. Pick the class that
you'd like to test, mock out the dependencies with moquito and write some
quick unit tests. Your tests should finish in a few hundreds millisecs (1-2
secs maximum).

*Integration test*
This is not required, but more than welcome. Can you do it with reasonable
amount of effort? Is it going to finish in an acceptable time frame (5-10
secs)? If yes, go ahead, if not, skip that.

Hope that helps.

Regards,
Andor






On Thu, Jun 21, 2018 at 2:39 PM, Enrico Olivelli <eolivelli@gmail.com>
wrote:

> Andor,
>
> I had already tried that way but without success.
> I have updated the patch with a proposal.
> Thanks
>
> Enrico
>
> Il gio 21 giu 2018, 13:20 Andor Molnar <andor@cloudera.com.invalid> ha
> scritto:
>
> > Hi Enrico!
> >
> > Take a look at testElectionFraud(), it might be useful to you.
> >
> > Regards,
> > Andor
> >
> >
> >
> > On Thu, Jun 21, 2018 at 12:00 PM, Enrico Olivelli <eolivelli@gmail.com>
> > wrote:
> >
> > > Norbert,
> > > thank you for taking a look
> > >
> > > Il giorno gio 21 giu 2018 alle ore 11:58 Norbert Kalmar
> > > <nkalmar@cloudera.com.invalid> ha scritto:
> > >
> > > > Good question. Wouldn't just killing the leader do the job?
> > > >
> > >
> > > It seems to me that is it not "immediate" to "kill" a QuorumPeer inside
> > > such test.
> > > If you "shutdown" it you cannot 'start' it anymore.
> > >
> > > This is why I am asking for help.
> > >
> > > -- Enrico
> > >
> > >
> > >
> > > >
> > > > Regards,
> > > > Norbert
> > > >
> > > > On Thu, Jun 21, 2018 at 3:51 AM Enrico Olivelli <eolivelli@gmail.com
> >
> > > > wrote:
> > > >
> > > > > This is my patch
> > > > > https://github.com/apache/zookeeper/pull/546
> > > > >
> > > > > One question:
> > > > > I would like to force the cluster to change leader in
> > > > > HierarchicalQuorumTest, so that I can test that JMX will reflect
> the
> > > new
> > > > > status of the group.
> > > > > Any idea about how to bounce the leader ?
> > > > >
> > > > > Cheers
> > > > > Enrico
> > > > >
> > > > > Il giorno mer 20 giu 2018 alle ore 13:45 Enrico Olivelli <
> > > > > eolivelli@gmail.com> ha scritto:
> > > > >
> > > > > > This is my JIRA
> > > > > > I am going to work on a patch
> > > > > >
> > > > > > https://issues.apache.org/jira/browse/ZOOKEEPER-3066
> > > > > >
> > > > > > Enrico
> > > > > >
> > > > > > Il gio 10 mag 2018, 19:47 Andor Molnar <andor@cloudera.com>
ha
> > > > scritto:
> > > > > >
> > > > > >> "in order to guess which is the leader I have to ask to
all of
> the
> > > > three
> > > > > >> nodes in the cluster"
> > > > > >>
> > > > > >> That's correct.
> > > > > >>
> > > > > >> Regards,
> > > > > >> Andor
> > > > > >>
> > > > > >>
> > > > > >>
> > > > > >> On Thu, May 10, 2018 at 4:07 AM, Enrico Olivelli <
> > > eolivelli@gmail.com
> > > > >
> > > > > >> wrote:
> > > > > >>
> > > > > >> > Il giorno mer 9 mag 2018 alle ore 20:24 Patrick Hunt
<
> > > > > phunt@apache.org>
> > > > > >> ha
> > > > > >> > scritto:
> > > > > >> >
> > > > > >> > > iiuc what you are interested in the information
is already
> > > > > available.
> > > > > >> The
> > > > > >> > > beans have a "state" attribute which indicates
following vs
> > > > leading.
> > > > > >> > >
> > > > > >> > > Try attaching a jconsole to the running servers,
use the
> > > "mbeans"
> > > > > tab
> > > > > >> and
> > > > > >> > > open org.apache.ZooKeeperService -> replicatedserver
->
> > replica
> > > ->
> > > > > >> > > attributes, you'll see the "state" attribute there.
> > > > > >> > >
> > > > > >> >
> > > > > >> >
> > > > > >> > Patrick,
> > > > > >> > I can't find this information.
> > > > > >> > If I log into a "follower" I get this info only for
the
> 'current
> > > > > >> replica'
> > > > > >> >
> > > > > >> > Example, I have three peers,  the first one is a "Follower",
> on
> > > JMX
> > > > I
> > > > > >> have
> > > > > >> > o.a.ZookKeeperService - ReplicatedServer_id1 - replica_1
-
> > > > > ClientAddress
> > > > > >> > o.a.ZookKeeperService - ReplicatedServer_id1 - replica_1
-
> > > > > >> ElectionAddress
> > > > > >> > o.a.ZookKeeperService - ReplicatedServer_id1 - replica_1
-
> > > > LearnerType
> > > > > >> > o.a.ZookKeeperService - ReplicatedServer_id1 - replica_1
-
> Name
> > > > > >> > o.a.ZookKeeperService - ReplicatedServer_id1 - replica_1
-
> > > > > QuorumAddress
> > > > > >> > o.a.ZookKeeperService - ReplicatedServer_id1 - replica_1
-
> > State =
> > > > > >> > following
> > > > > >> > o.a.ZookKeeperService - ReplicatedServer_id1 - replica_1
-
> > > > > ConfigVersion
> > > > > >> > .....
> > > > > >> >
> > > > > >> >
> > > > > >> >
> > > > > >> > for other peers I see only
> > > > > >> > o.a.ZookKeeperService - ReplicatedServer_id1 - replica_2
-
> > > > > ClientAddress
> > > > > >> > o.a.ZookKeeperService - ReplicatedServer_id1 - replica_2
-
> > > > > >> ElectionAddress
> > > > > >> > o.a.ZookKeeperService - ReplicatedServer_id1 - replica_2
-
> > > > LearnerType
> > > > > >> > o.a.ZookKeeperService - ReplicatedServer_id1 - replica_2
-
> Name
> > > > > >> > o.a.ZookKeeperService - ReplicatedServer_id1 - replica_2
-
> > > > > QuorumAddress
> > > > > >> >
> > > > > >> > o.a.ZookKeeperService - ReplicatedServer_id1 - replica_3
-
> > > > > ClientAddress
> > > > > >> > o.a.ZookKeeperService - ReplicatedServer_id1 - replica_3
-
> > > > > >> ElectionAddress
> > > > > >> > o.a.ZookKeeperService - ReplicatedServer_id1 - replica_3
-
> > > > LearnerType
> > > > > >> > o.a.ZookKeeperService - ReplicatedServer_id1 - replica_3
-
> Name
> > > > > >> > o.a.ZookKeeperService - ReplicatedServer_id1 - replica_3
-
> > > > > QuorumAddress
> > > > > >> >
> > > > > >> >
> > > > > >> > so quering only this server I cannot guess which is
the
> current
> > > > > leader,
> > > > > >> the
> > > > > >> > only information I can extract is:
> > > > > >> > - I am a follower
> > > > > >> > - We are a cluster of three
> > > > > >> > - Every of the three is a PARTECIPANT (no observers)
> > > > > >> >
> > > > > >> > in order to guess which is the leader I have to ask
to all of
> > the
> > > > > three
> > > > > >> > nodes in the cluster
> > > > > >> >
> > > > > >> > Am I missing something ? I am running 3.5.3-BETA
> > > > > >> >
> > > > > >> > Enrico
> > > > > >> >
> > > > > >> >
> > > > > >> >
> > > > > >> >
> > > > > >> >
> > > > > >> >
> > > > > >> >
> > > > > >> >
> > > > > >> >
> > > > > >> >
> > > > > >> >
> > > > > >> >
> > > > > >> >
> > > > > >> > >
> > > > > >> > > Patrick
> > > > > >> > >
> > > > > >> > > On Wed, May 9, 2018 at 8:02 AM, Enrico Olivelli
<
> > > > > eolivelli@gmail.com>
> > > > > >> > > wrote:
> > > > > >> > >
> > > > > >> > > > Thank you Edward
> > > > > >> > > >
> > > > > >> > > > I will pack all together and send out a patch
as soon as I
> > > have
> > > > > >> time.
> > > > > >> > > > I am running 3.5 in production and given
than an RC for
> > 3.5.4
> > > is
> > > > > >> going
> > > > > >> > to
> > > > > >> > > > be cut soon I will have to wait for 3.5.5
and I assume it
> > > won't
> > > > be
> > > > > >> > > > immediate.
> > > > > >> > > >
> > > > > >> > > > Cheers
> > > > > >> > > > Enrico
> > > > > >> > > >
> > > > > >> > > > Il giorno mer 9 mag 2018 alle ore 14:37 Edward
Ribeiro <
> > > > > >> > > > edward.ribeiro@gmail.com> ha scritto:
> > > > > >> > > >
> > > > > >> > > > > Sent before finishing the previous email.
Only to
> > > complement,
> > > > > the
> > > > > >> > > > > findLeader() could have been as below,
but this change
> is
> > > > only a
> > > > > >> > nitty
> > > > > >> > > > > detail and totally irrelevant to the
questions you are
> > > asking.
> > > > > :)
> > > > > >> > > > >
> > > > > >> > > > > /**
> > > > > >> > > > >  * Returns the address of the node we
think is the
> leader.
> > > > > >> > > > >  */
> > > > > >> > > > > protected QuorumServer findLeader()
{
> > > > > >> > > > >
> > > > > >> > > > >     // Find the leader by id
> > > > > >> > > > >     long currentLeader = self.getCurrentVote().getId();
> > > > > >> > > > >
> > > > > >> > > > >     QuorumServer leaderServer =
> > > > > self.getView().get(currentLeader);
> > > > > >> > > > >
> > > > > >> > > > >     if (leaderServer == null) {
> > > > > >> > > > >         LOG.warn("Couldn't find the
leader with id =
> {}",
> > > > > >> > > currentLeader);
> > > > > >> > > > >     }
> > > > > >> > > > >     return leaderServer;
> > > > > >> > > > > }
> > > > > >> > > > >
> > > > > >> > > > > Edward
> > > > > >> > > > >
> > > > > >> > > > > On Wed, May 9, 2018 at 9:29 AM, Edward
Ribeiro <
> > > > > >> > > edward.ribeiro@gmail.com
> > > > > >> > > > >
> > > > > >> > > > > wrote:
> > > > > >> > > > >
> > > > > >> > > > > > Hi Enrico,
> > > > > >> > > > > >
> > > > > >> > > > > > Well, I am not an expert on QuorumPeer
either (not an
> > > expert
> > > > > on
> > > > > >> > > > anything,
> > > > > >> > > > > > really), but maybe it's the variable
and method below?
> > > > > >> > > > > >
> > > > > >> > > > > > ----------------- QuorumPeer ------------------
> > > > > >> > > > > >
> > > > > >> > > > > > /**
> > > > > >> > > > > >  * This is who I think the leader
currently is.
> > > > > >> > > > > >  */
> > > > > >> > > > > > volatile private Vote currentVote;
> > > > > >> > > > > >
> > > > > >> > > > > > public synchronized Vote getCurrentVote(){
> > > > > >> > > > > >     return currentVote;
> > > > > >> > > > > > }
> > > > > >> > > > > >
> > > > > >> > > > > > ---------------------------------------
> > > > > >> > > > > >
> > > > > >> > > > > >
> > > > > >> > > > > > Then it's a matter of calling
> > > > > >> quorumPeer.getCurrentVote().getId()
> > > > > >> > and
> > > > > >> > > > > > quorumPeer.getServerState()?
> > > > > >> > > > > >
> > > > > >> > > > > > Btw, the Learner class has this
handy method below
> (self
> > > is
> > > > a
> > > > > >> > > > > QuorumPeer):
> > > > > >> > > > > >
> > > > > >> > > > > > ---------------- Learner --------------------
> > > > > >> > > > > >
> > > > > >> > > > > > /**
> > > > > >> > > > > >  * Returns the address of the node
we think is the
> > leader.
> > > > > >> > > > > >  */
> > > > > >> > > > > > protected QuorumServer findLeader()
{
> > > > > >> > > > > >     QuorumServer leaderServer =
null;
> > > > > >> > > > > >     // Find the leader by id
> > > > > >> > > > > >     Vote current = self.getCurrentVote();
> > > > > >> > > > > >     for (QuorumServer s : self.getView().values())
{
> > > > > >> > > > > >         if (s.id == current.getId())
{
> > > > > >> > > > > >             leaderServer = s;
> > > > > >> > > > > >             break;
> > > > > >> > > > > >         }
> > > > > >> > > > > >     }
> > > > > >> > > > > >     if (leaderServer == null) {
> > > > > >> > > > > >         LOG.warn("Couldn't find
the leader with id = "
> > > > > >> > > > > >                 + current.getId());
> > > > > >> > > > > >     }
> > > > > >> > > > > >     return leaderServer;
> > > > > >> > > > > > }
> > > > > >> > > > > >
> > > > > >> > > > > > ---------------------------------------
> > > > > >> > > > > >
> > > > > >> > > > > > By the way, as a side note, the
map traversal could be
> > > > changed
> > > > > >> by:
> > > > > >> > > > > >
> > > > > >> > > > > > ----------------------------
> > > > > >> > > > > >
> > > > > >> > > > > > if (self.getView().contains(current.getId())
{
> > > > > >> > > > > >
> > > > > >> > > > > > }
> > > > > >> > > > > >
> > > > > >> > > > > > ---------------------------
> > > > > >> > > > > >
> > > > > >> > > > > >
> > > > > >> > > > > >
> > > > > >> > > > > > You can see above this method the
quorumPeer.getView()
> > > > > returns a
> > > > > >> > > > Map<sid,
> > > > > >> > > > > > QuorumServer> as below:
> > > > > >> > > > > >
> > > > > >> > > > > > -------------QuorumPeer ---------
> > > > > >> > > > > >
> > > > > >> > > > > > /**
> > > > > >> > > > > >  * A 'view' is a node's current
opinion of the
> > membership
> > > of
> > > > > the
> > > > > >> > > entire
> > > > > >> > > > > >  * ensemble.
> > > > > >> > > > > >  */
> > > > > >> > > > > > public Map<Long,QuorumPeer.QuorumServer>
getView() {
> > > > > >> > > > > >     return Collections.unmodifiableMap(
> > > getQuorumVerifier().
> > > > > >> > > > > > getAllMembers());
> > > > > >> > > > > > }
> > > > > >> > > > > >
> > > > > >> > > > > > -----------------------------
> > > > > >> > > > > >
> > > > > >> > > > > >
> > > > > >> > > > > > And then it retrieves the QuorumServer
that has many
> > more
> > > > > >> > information
> > > > > >> > > > > > about the node besides the sid
(InetSocketAddress,
> > > hostname,
> > > > > >> etc).
> > > > > >> > :)
> > > > > >> > > > > >
> > > > > >> > > > > >
> > > > > >> > > > > > Cheers,
> > > > > >> > > > > > Edward
> > > > > >> > > > > >
> > > > > >> > > > > > On Wed, May 9, 2018 at 8:50 AM,
Enrico Olivelli <
> > > > > >> > eolivelli@gmail.com
> > > > > >> > > >
> > > > > >> > > > > > wrote:
> > > > > >> > > > > >
> > > > > >> > > > > >> So I am trying to create a
patch in order to expose
> on
> > > JMX
> > > > > the
> > > > > >> id
> > > > > >> > of
> > > > > >> > > > the
> > > > > >> > > > > >> current "leader" (on the JVM
of a follower)
> > > > > >> > > > > >>
> > > > > >> > > > > >> I am trying to find in ZK which
is the variable which
> > > holds
> > > > > >> the ID
> > > > > >> > > of
> > > > > >> > > > > the
> > > > > >> > > > > >> current leader.
> > > > > >> > > > > >> I am new to the internal of
QuorumPeer
> > > > > >> > > > > >>
> > > > > >> > > > > >> Can someone give me some hint
?
> > > > > >> > > > > >>
> > > > > >> > > > > >> Enrico
> > > > > >> > > > > >>
> > > > > >> > > > > >> Il giorno mar 8 mag 2018 alle
ore 10:08 Ansel
> > Zandegran <
> > > > > >> > > > > >> Ansel.Zandegran@infor.com>
ha scritto:
> > > > > >> > > > > >>
> > > > > >> > > > > >> > Hi,
> > > > > >> > > > > >> > That is possible with
4 letter commands. We are
> using
> > > it
> > > > > >> now. In
> > > > > >> > > > 3.5.x
> > > > > >> > > > > >> it
> > > > > >> > > > > >> > is going to be removed
in favour of admin server
> > > > (embedded
> > > > > >> web
> > > > > >> > > > > server).
> > > > > >> > > > > >> > We are running in an environment
where it’s not
> > > possible
> > > > to
> > > > > >> run
> > > > > >> > > JMX
> > > > > >> > > > or
> > > > > >> > > > > >> > embedded web servers.
> > > > > >> > > > > >> >
> > > > > >> > > > > >> > So I am wondering if there
is another way? It would
> > be
> > > > nice
> > > > > >> to
> > > > > >> > > have
> > > > > >> > > > > this
> > > > > >> > > > > >> > info as a znode.
> > > > > >> > > > > >> >
> > > > > >> > > > > >> > Best regards,
> > > > > >> > > > > >> > Ansel
> > > > > >> > > > > >> >
> > > > > >> > > > > >> > > On 8 May 2018, at
09:55, Flavio Junqueira <
> > > > > fpj@apache.org>
> > > > > >> > > wrote:
> > > > > >> > > > > >> > >
> > > > > >> > > > > >> > > Hi Enrico,
> > > > > >> > > > > >> > >
> > > > > >> > > > > >> > > You can determine
the state of a server it via
> > > 4-letter
> > > > > >> > > commands.
> > > > > >> > > > > >> Would
> > > > > >> > > > > >> > that work for you?
> > > > > >> > > > > >> > >
> > > > > >> > > > > >> > > -Flavio
> > > > > >> > > > > >> > >
> > > > > >> > > > > >> > >> On 8 May 2018,
at 09:09, Enrico Olivelli <
> > > > > >> > eolivelli@gmail.com>
> > > > > >> > > > > >> wrote:
> > > > > >> > > > > >> > >>
> > > > > >> > > > > >> > >> Hi,
> > > > > >> > > > > >> > >> is there any
way to see in JMX which is the
> leader
> > > of
> > > > a
> > > > > >> > > ZooKeeper
> > > > > >> > > > > >> > cluster?
> > > > > >> > > > > >> > >>
> > > > > >> > > > > >> > >> My problem is:
given access to any of the nodes
> of
> > > the
> > > > > >> > cluster
> > > > > >> > > I
> > > > > >> > > > > >> want to
> > > > > >> > > > > >> > >> know from JMX
which is the current leader.
> > > > > >> > > > > >> > >> It seems to me
that this information is not
> > > available,
> > > > > you
> > > > > >> > can
> > > > > >> > > > know
> > > > > >> > > > > >> > only if
> > > > > >> > > > > >> > >> the local node
is Leader or Follower.
> > > > > >> > > > > >> > >>
> > > > > >> > > > > >> > >> Cheers
> > > > > >> > > > > >> > >> Enrico
> > > > > >> > > > > >> > >
> > > > > >> > > > > >> >
> > > > > >> > > > > >> >
> > > > > >> > > > > >>
> > > > > >> > > > > >
> > > > > >> > > > > >
> > > > > >> > > > >
> > > > > >> > > >
> > > > > >> > >
> > > > > >> >
> > > > > >>
> > > > > > --
> > > > > >
> > > > > >
> > > > > > -- Enrico Olivelli
> > > > > >
> > > > >
> > > >
> > >
> >
> --
>
>
> -- Enrico Olivelli
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message