zookeeper-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Enrico Olivelli <eolive...@gmail.com>
Subject Re: Discover LEADER from JMX
Date Thu, 21 Jun 2018 15:38:32 GMT
Il gio 21 giu 2018, 17:13 Andor Molnar <andor@cloudera.com.invalid> ha
scritto:

> Okay, let's put it this way: what kind of test would you like to add here?
>
> *Unit test*
> This is a requirement for the patch to be committed. You don't need to fire
> up a full blown ensemble to test the new functionality. Pick the class that
> you'd like to test, mock out the dependencies with moquito and write some
> quick unit tests. Your tests should finish in a few hundreds millisecs (1-2
> secs maximum).
>

This is achieved in my patch. There was already a test about JMX and
LocalPeerMXBean. I have enhanced it, by extending the test to cover my new
JMX attribute and, to cover even RemotePeerMXBean, which was not tested at
all.

One review comment asked me to perform assertions after a leadership change.

I did not touch ZK internals with this patch, I am only publishing internal
state to JMX endpoint.

Maybe what I am trying to do is not needed for the patch to be 'acceptable'.

I think it would be useful to add such test case, but not strictly needed.
Unfortunately current test utilities seem to able to do what requested.

I am able to add mockito based tests if requested.





> *Integration test*
> This is not required, but more than welcome. Can you do it with reasonable
> amount of effort? Is it going to finish in an acceptable time frame (5-10
> secs)? If yes, go ahead, if not, skip that.
>

Not needed. Maybe existing test case is already similar to an integration
test ( it brings up a cluster of 5 nodes).

Anyway my changes is not slowing down the full suite add I am only adding
an assertion to an existing old test.

I think the restart of the cluster it is not strictly needed and it will
may add a new flaky test.

Treating ZK as one of my projects, I think the existing JMX test corpus
should be refactored but given the little impact of this patch is it not
worth to spend much time.



Enrico





> Hope that helps.
>
> Regards,
> Andor
>
>
>
>
>
>
> On Thu, Jun 21, 2018 at 2:39 PM, Enrico Olivelli <eolivelli@gmail.com>
> wrote:
>
> > Andor,
> >
> > I had already tried that way but without success.
> > I have updated the patch with a proposal.
> > Thanks
> >
> > Enrico
> >
> > Il gio 21 giu 2018, 13:20 Andor Molnar <andor@cloudera.com.invalid> ha
> > scritto:
> >
> > > Hi Enrico!
> > >
> > > Take a look at testElectionFraud(), it might be useful to you.
> > >
> > > Regards,
> > > Andor
> > >
> > >
> > >
> > > On Thu, Jun 21, 2018 at 12:00 PM, Enrico Olivelli <eolivelli@gmail.com
> >
> > > wrote:
> > >
> > > > Norbert,
> > > > thank you for taking a look
> > > >
> > > > Il giorno gio 21 giu 2018 alle ore 11:58 Norbert Kalmar
> > > > <nkalmar@cloudera.com.invalid> ha scritto:
> > > >
> > > > > Good question. Wouldn't just killing the leader do the job?
> > > > >
> > > >
> > > > It seems to me that is it not "immediate" to "kill" a QuorumPeer
> inside
> > > > such test.
> > > > If you "shutdown" it you cannot 'start' it anymore.
> > > >
> > > > This is why I am asking for help.
> > > >
> > > > -- Enrico
> > > >
> > > >
> > > >
> > > > >
> > > > > Regards,
> > > > > Norbert
> > > > >
> > > > > On Thu, Jun 21, 2018 at 3:51 AM Enrico Olivelli <
> eolivelli@gmail.com
> > >
> > > > > wrote:
> > > > >
> > > > > > This is my patch
> > > > > > https://github.com/apache/zookeeper/pull/546
> > > > > >
> > > > > > One question:
> > > > > > I would like to force the cluster to change leader in
> > > > > > HierarchicalQuorumTest, so that I can test that JMX will reflect
> > the
> > > > new
> > > > > > status of the group.
> > > > > > Any idea about how to bounce the leader ?
> > > > > >
> > > > > > Cheers
> > > > > > Enrico
> > > > > >
> > > > > > Il giorno mer 20 giu 2018 alle ore 13:45 Enrico Olivelli <
> > > > > > eolivelli@gmail.com> ha scritto:
> > > > > >
> > > > > > > This is my JIRA
> > > > > > > I am going to work on a patch
> > > > > > >
> > > > > > > https://issues.apache.org/jira/browse/ZOOKEEPER-3066
> > > > > > >
> > > > > > > Enrico
> > > > > > >
> > > > > > > Il gio 10 mag 2018, 19:47 Andor Molnar <andor@cloudera.com>
ha
> > > > > scritto:
> > > > > > >
> > > > > > >> "in order to guess which is the leader I have to ask
to all of
> > the
> > > > > three
> > > > > > >> nodes in the cluster"
> > > > > > >>
> > > > > > >> That's correct.
> > > > > > >>
> > > > > > >> Regards,
> > > > > > >> Andor
> > > > > > >>
> > > > > > >>
> > > > > > >>
> > > > > > >> On Thu, May 10, 2018 at 4:07 AM, Enrico Olivelli <
> > > > eolivelli@gmail.com
> > > > > >
> > > > > > >> wrote:
> > > > > > >>
> > > > > > >> > Il giorno mer 9 mag 2018 alle ore 20:24 Patrick
Hunt <
> > > > > > phunt@apache.org>
> > > > > > >> ha
> > > > > > >> > scritto:
> > > > > > >> >
> > > > > > >> > > iiuc what you are interested in the information
is already
> > > > > > available.
> > > > > > >> The
> > > > > > >> > > beans have a "state" attribute which indicates
following
> vs
> > > > > leading.
> > > > > > >> > >
> > > > > > >> > > Try attaching a jconsole to the running servers,
use the
> > > > "mbeans"
> > > > > > tab
> > > > > > >> and
> > > > > > >> > > open org.apache.ZooKeeperService -> replicatedserver
->
> > > replica
> > > > ->
> > > > > > >> > > attributes, you'll see the "state" attribute
there.
> > > > > > >> > >
> > > > > > >> >
> > > > > > >> >
> > > > > > >> > Patrick,
> > > > > > >> > I can't find this information.
> > > > > > >> > If I log into a "follower" I get this info only
for the
> > 'current
> > > > > > >> replica'
> > > > > > >> >
> > > > > > >> > Example, I have three peers,  the first one is
a "Follower",
> > on
> > > > JMX
> > > > > I
> > > > > > >> have
> > > > > > >> > o.a.ZookKeeperService - ReplicatedServer_id1 -
replica_1 -
> > > > > > ClientAddress
> > > > > > >> > o.a.ZookKeeperService - ReplicatedServer_id1 -
replica_1 -
> > > > > > >> ElectionAddress
> > > > > > >> > o.a.ZookKeeperService - ReplicatedServer_id1 -
replica_1 -
> > > > > LearnerType
> > > > > > >> > o.a.ZookKeeperService - ReplicatedServer_id1 -
replica_1 -
> > Name
> > > > > > >> > o.a.ZookKeeperService - ReplicatedServer_id1 -
replica_1 -
> > > > > > QuorumAddress
> > > > > > >> > o.a.ZookKeeperService - ReplicatedServer_id1 -
replica_1 -
> > > State =
> > > > > > >> > following
> > > > > > >> > o.a.ZookKeeperService - ReplicatedServer_id1 -
replica_1 -
> > > > > > ConfigVersion
> > > > > > >> > .....
> > > > > > >> >
> > > > > > >> >
> > > > > > >> >
> > > > > > >> > for other peers I see only
> > > > > > >> > o.a.ZookKeeperService - ReplicatedServer_id1 -
replica_2 -
> > > > > > ClientAddress
> > > > > > >> > o.a.ZookKeeperService - ReplicatedServer_id1 -
replica_2 -
> > > > > > >> ElectionAddress
> > > > > > >> > o.a.ZookKeeperService - ReplicatedServer_id1 -
replica_2 -
> > > > > LearnerType
> > > > > > >> > o.a.ZookKeeperService - ReplicatedServer_id1 -
replica_2 -
> > Name
> > > > > > >> > o.a.ZookKeeperService - ReplicatedServer_id1 -
replica_2 -
> > > > > > QuorumAddress
> > > > > > >> >
> > > > > > >> > o.a.ZookKeeperService - ReplicatedServer_id1 -
replica_3 -
> > > > > > ClientAddress
> > > > > > >> > o.a.ZookKeeperService - ReplicatedServer_id1 -
replica_3 -
> > > > > > >> ElectionAddress
> > > > > > >> > o.a.ZookKeeperService - ReplicatedServer_id1 -
replica_3 -
> > > > > LearnerType
> > > > > > >> > o.a.ZookKeeperService - ReplicatedServer_id1 -
replica_3 -
> > Name
> > > > > > >> > o.a.ZookKeeperService - ReplicatedServer_id1 -
replica_3 -
> > > > > > QuorumAddress
> > > > > > >> >
> > > > > > >> >
> > > > > > >> > so quering only this server I cannot guess which
is the
> > current
> > > > > > leader,
> > > > > > >> the
> > > > > > >> > only information I can extract is:
> > > > > > >> > - I am a follower
> > > > > > >> > - We are a cluster of three
> > > > > > >> > - Every of the three is a PARTECIPANT (no observers)
> > > > > > >> >
> > > > > > >> > in order to guess which is the leader I have to
ask to all
> of
> > > the
> > > > > > three
> > > > > > >> > nodes in the cluster
> > > > > > >> >
> > > > > > >> > Am I missing something ? I am running 3.5.3-BETA
> > > > > > >> >
> > > > > > >> > Enrico
> > > > > > >> >
> > > > > > >> >
> > > > > > >> >
> > > > > > >> >
> > > > > > >> >
> > > > > > >> >
> > > > > > >> >
> > > > > > >> >
> > > > > > >> >
> > > > > > >> >
> > > > > > >> >
> > > > > > >> >
> > > > > > >> >
> > > > > > >> > >
> > > > > > >> > > Patrick
> > > > > > >> > >
> > > > > > >> > > On Wed, May 9, 2018 at 8:02 AM, Enrico Olivelli
<
> > > > > > eolivelli@gmail.com>
> > > > > > >> > > wrote:
> > > > > > >> > >
> > > > > > >> > > > Thank you Edward
> > > > > > >> > > >
> > > > > > >> > > > I will pack all together and send out
a patch as soon
> as I
> > > > have
> > > > > > >> time.
> > > > > > >> > > > I am running 3.5 in production and given
than an RC for
> > > 3.5.4
> > > > is
> > > > > > >> going
> > > > > > >> > to
> > > > > > >> > > > be cut soon I will have to wait for
3.5.5 and I assume
> it
> > > > won't
> > > > > be
> > > > > > >> > > > immediate.
> > > > > > >> > > >
> > > > > > >> > > > Cheers
> > > > > > >> > > > Enrico
> > > > > > >> > > >
> > > > > > >> > > > Il giorno mer 9 mag 2018 alle ore 14:37
Edward Ribeiro <
> > > > > > >> > > > edward.ribeiro@gmail.com> ha scritto:
> > > > > > >> > > >
> > > > > > >> > > > > Sent before finishing the previous
email. Only to
> > > > complement,
> > > > > > the
> > > > > > >> > > > > findLeader() could have been as
below, but this change
> > is
> > > > > only a
> > > > > > >> > nitty
> > > > > > >> > > > > detail and totally irrelevant to
the questions you are
> > > > asking.
> > > > > > :)
> > > > > > >> > > > >
> > > > > > >> > > > > /**
> > > > > > >> > > > >  * Returns the address of the node
we think is the
> > leader.
> > > > > > >> > > > >  */
> > > > > > >> > > > > protected QuorumServer findLeader()
{
> > > > > > >> > > > >
> > > > > > >> > > > >     // Find the leader by id
> > > > > > >> > > > >     long currentLeader =
> self.getCurrentVote().getId();
> > > > > > >> > > > >
> > > > > > >> > > > >     QuorumServer leaderServer =
> > > > > > self.getView().get(currentLeader);
> > > > > > >> > > > >
> > > > > > >> > > > >     if (leaderServer == null) {
> > > > > > >> > > > >         LOG.warn("Couldn't find
the leader with id =
> > {}",
> > > > > > >> > > currentLeader);
> > > > > > >> > > > >     }
> > > > > > >> > > > >     return leaderServer;
> > > > > > >> > > > > }
> > > > > > >> > > > >
> > > > > > >> > > > > Edward
> > > > > > >> > > > >
> > > > > > >> > > > > On Wed, May 9, 2018 at 9:29 AM,
Edward Ribeiro <
> > > > > > >> > > edward.ribeiro@gmail.com
> > > > > > >> > > > >
> > > > > > >> > > > > wrote:
> > > > > > >> > > > >
> > > > > > >> > > > > > Hi Enrico,
> > > > > > >> > > > > >
> > > > > > >> > > > > > Well, I am not an expert on
QuorumPeer either (not
> an
> > > > expert
> > > > > > on
> > > > > > >> > > > anything,
> > > > > > >> > > > > > really), but maybe it's the
variable and method
> below?
> > > > > > >> > > > > >
> > > > > > >> > > > > > ----------------- QuorumPeer
------------------
> > > > > > >> > > > > >
> > > > > > >> > > > > > /**
> > > > > > >> > > > > >  * This is who I think the
leader currently is.
> > > > > > >> > > > > >  */
> > > > > > >> > > > > > volatile private Vote currentVote;
> > > > > > >> > > > > >
> > > > > > >> > > > > > public synchronized Vote getCurrentVote(){
> > > > > > >> > > > > >     return currentVote;
> > > > > > >> > > > > > }
> > > > > > >> > > > > >
> > > > > > >> > > > > > ---------------------------------------
> > > > > > >> > > > > >
> > > > > > >> > > > > >
> > > > > > >> > > > > > Then it's a matter of calling
> > > > > > >> quorumPeer.getCurrentVote().getId()
> > > > > > >> > and
> > > > > > >> > > > > > quorumPeer.getServerState()?
> > > > > > >> > > > > >
> > > > > > >> > > > > > Btw, the Learner class has
this handy method below
> > (self
> > > > is
> > > > > a
> > > > > > >> > > > > QuorumPeer):
> > > > > > >> > > > > >
> > > > > > >> > > > > > ---------------- Learner --------------------
> > > > > > >> > > > > >
> > > > > > >> > > > > > /**
> > > > > > >> > > > > >  * Returns the address of
the node we think is the
> > > leader.
> > > > > > >> > > > > >  */
> > > > > > >> > > > > > protected QuorumServer findLeader()
{
> > > > > > >> > > > > >     QuorumServer leaderServer
= null;
> > > > > > >> > > > > >     // Find the leader by
id
> > > > > > >> > > > > >     Vote current = self.getCurrentVote();
> > > > > > >> > > > > >     for (QuorumServer s :
self.getView().values()) {
> > > > > > >> > > > > >         if (s.id == current.getId())
{
> > > > > > >> > > > > >             leaderServer =
s;
> > > > > > >> > > > > >             break;
> > > > > > >> > > > > >         }
> > > > > > >> > > > > >     }
> > > > > > >> > > > > >     if (leaderServer == null)
{
> > > > > > >> > > > > >         LOG.warn("Couldn't
find the leader with id
> = "
> > > > > > >> > > > > >                 + current.getId());
> > > > > > >> > > > > >     }
> > > > > > >> > > > > >     return leaderServer;
> > > > > > >> > > > > > }
> > > > > > >> > > > > >
> > > > > > >> > > > > > ---------------------------------------
> > > > > > >> > > > > >
> > > > > > >> > > > > > By the way, as a side note,
the map traversal could
> be
> > > > > changed
> > > > > > >> by:
> > > > > > >> > > > > >
> > > > > > >> > > > > > ----------------------------
> > > > > > >> > > > > >
> > > > > > >> > > > > > if (self.getView().contains(current.getId())
{
> > > > > > >> > > > > >
> > > > > > >> > > > > > }
> > > > > > >> > > > > >
> > > > > > >> > > > > > ---------------------------
> > > > > > >> > > > > >
> > > > > > >> > > > > >
> > > > > > >> > > > > >
> > > > > > >> > > > > > You can see above this method
the
> quorumPeer.getView()
> > > > > > returns a
> > > > > > >> > > > Map<sid,
> > > > > > >> > > > > > QuorumServer> as below:
> > > > > > >> > > > > >
> > > > > > >> > > > > > -------------QuorumPeer ---------
> > > > > > >> > > > > >
> > > > > > >> > > > > > /**
> > > > > > >> > > > > >  * A 'view' is a node's current
opinion of the
> > > membership
> > > > of
> > > > > > the
> > > > > > >> > > entire
> > > > > > >> > > > > >  * ensemble.
> > > > > > >> > > > > >  */
> > > > > > >> > > > > > public Map<Long,QuorumPeer.QuorumServer>
getView() {
> > > > > > >> > > > > >     return Collections.unmodifiableMap(
> > > > getQuorumVerifier().
> > > > > > >> > > > > > getAllMembers());
> > > > > > >> > > > > > }
> > > > > > >> > > > > >
> > > > > > >> > > > > > -----------------------------
> > > > > > >> > > > > >
> > > > > > >> > > > > >
> > > > > > >> > > > > > And then it retrieves the
QuorumServer that has many
> > > more
> > > > > > >> > information
> > > > > > >> > > > > > about the node besides the
sid (InetSocketAddress,
> > > > hostname,
> > > > > > >> etc).
> > > > > > >> > :)
> > > > > > >> > > > > >
> > > > > > >> > > > > >
> > > > > > >> > > > > > Cheers,
> > > > > > >> > > > > > Edward
> > > > > > >> > > > > >
> > > > > > >> > > > > > On Wed, May 9, 2018 at 8:50
AM, Enrico Olivelli <
> > > > > > >> > eolivelli@gmail.com
> > > > > > >> > > >
> > > > > > >> > > > > > wrote:
> > > > > > >> > > > > >
> > > > > > >> > > > > >> So I am trying to create
a patch in order to expose
> > on
> > > > JMX
> > > > > > the
> > > > > > >> id
> > > > > > >> > of
> > > > > > >> > > > the
> > > > > > >> > > > > >> current "leader" (on the
JVM of a follower)
> > > > > > >> > > > > >>
> > > > > > >> > > > > >> I am trying to find in
ZK which is the variable
> which
> > > > holds
> > > > > > >> the ID
> > > > > > >> > > of
> > > > > > >> > > > > the
> > > > > > >> > > > > >> current leader.
> > > > > > >> > > > > >> I am new to the internal
of QuorumPeer
> > > > > > >> > > > > >>
> > > > > > >> > > > > >> Can someone give me some
hint ?
> > > > > > >> > > > > >>
> > > > > > >> > > > > >> Enrico
> > > > > > >> > > > > >>
> > > > > > >> > > > > >> Il giorno mar 8 mag 2018
alle ore 10:08 Ansel
> > > Zandegran <
> > > > > > >> > > > > >> Ansel.Zandegran@infor.com>
ha scritto:
> > > > > > >> > > > > >>
> > > > > > >> > > > > >> > Hi,
> > > > > > >> > > > > >> > That is possible
with 4 letter commands. We are
> > using
> > > > it
> > > > > > >> now. In
> > > > > > >> > > > 3.5.x
> > > > > > >> > > > > >> it
> > > > > > >> > > > > >> > is going to be removed
in favour of admin server
> > > > > (embedded
> > > > > > >> web
> > > > > > >> > > > > server).
> > > > > > >> > > > > >> > We are running in
an environment where it’s not
> > > > possible
> > > > > to
> > > > > > >> run
> > > > > > >> > > JMX
> > > > > > >> > > > or
> > > > > > >> > > > > >> > embedded web servers.
> > > > > > >> > > > > >> >
> > > > > > >> > > > > >> > So I am wondering
if there is another way? It
> would
> > > be
> > > > > nice
> > > > > > >> to
> > > > > > >> > > have
> > > > > > >> > > > > this
> > > > > > >> > > > > >> > info as a znode.
> > > > > > >> > > > > >> >
> > > > > > >> > > > > >> > Best regards,
> > > > > > >> > > > > >> > Ansel
> > > > > > >> > > > > >> >
> > > > > > >> > > > > >> > > On 8 May 2018,
at 09:55, Flavio Junqueira <
> > > > > > fpj@apache.org>
> > > > > > >> > > wrote:
> > > > > > >> > > > > >> > >
> > > > > > >> > > > > >> > > Hi Enrico,
> > > > > > >> > > > > >> > >
> > > > > > >> > > > > >> > > You can determine
the state of a server it via
> > > > 4-letter
> > > > > > >> > > commands.
> > > > > > >> > > > > >> Would
> > > > > > >> > > > > >> > that work for you?
> > > > > > >> > > > > >> > >
> > > > > > >> > > > > >> > > -Flavio
> > > > > > >> > > > > >> > >
> > > > > > >> > > > > >> > >> On 8 May
2018, at 09:09, Enrico Olivelli <
> > > > > > >> > eolivelli@gmail.com>
> > > > > > >> > > > > >> wrote:
> > > > > > >> > > > > >> > >>
> > > > > > >> > > > > >> > >> Hi,
> > > > > > >> > > > > >> > >> is there
any way to see in JMX which is the
> > leader
> > > > of
> > > > > a
> > > > > > >> > > ZooKeeper
> > > > > > >> > > > > >> > cluster?
> > > > > > >> > > > > >> > >>
> > > > > > >> > > > > >> > >> My problem
is: given access to any of the
> nodes
> > of
> > > > the
> > > > > > >> > cluster
> > > > > > >> > > I
> > > > > > >> > > > > >> want to
> > > > > > >> > > > > >> > >> know from
JMX which is the current leader.
> > > > > > >> > > > > >> > >> It seems
to me that this information is not
> > > > available,
> > > > > > you
> > > > > > >> > can
> > > > > > >> > > > know
> > > > > > >> > > > > >> > only if
> > > > > > >> > > > > >> > >> the local
node is Leader or Follower.
> > > > > > >> > > > > >> > >>
> > > > > > >> > > > > >> > >> Cheers
> > > > > > >> > > > > >> > >> Enrico
> > > > > > >> > > > > >> > >
> > > > > > >> > > > > >> >
> > > > > > >> > > > > >> >
> > > > > > >> > > > > >>
> > > > > > >> > > > > >
> > > > > > >> > > > > >
> > > > > > >> > > > >
> > > > > > >> > > >
> > > > > > >> > >
> > > > > > >> >
> > > > > > >>
> > > > > > > --
> > > > > > >
> > > > > > >
> > > > > > > -- Enrico Olivelli
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> > --
> >
> >
> > -- Enrico Olivelli
> >
>
-- 


-- Enrico Olivelli

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message