Return-Path: X-Original-To: archive-asf-public-internal@cust-asf2.ponee.io Delivered-To: archive-asf-public-internal@cust-asf2.ponee.io Received: from cust-asf.ponee.io (cust-asf.ponee.io [163.172.22.183]) by cust-asf2.ponee.io (Postfix) with ESMTP id 81827200CB6 for ; Thu, 29 Jun 2017 23:24:54 +0200 (CEST) Received: by cust-asf.ponee.io (Postfix) id 7FE2C160BED; Thu, 29 Jun 2017 21:24:54 +0000 (UTC) Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by cust-asf.ponee.io (Postfix) with SMTP id 78EF4160BC6 for ; Thu, 29 Jun 2017 23:24:53 +0200 (CEST) Received: (qmail 59034 invoked by uid 500); 29 Jun 2017 21:24:52 -0000 Mailing-List: contact commits-help@kudu.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@kudu.apache.org Delivered-To: mailing list commits@kudu.apache.org Received: (qmail 59025 invoked by uid 99); 29 Jun 2017 21:24:52 -0000 Received: from git1-us-west.apache.org (HELO git1-us-west.apache.org) (140.211.11.23) by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 29 Jun 2017 21:24:52 +0000 Received: by git1-us-west.apache.org (ASF Mail Server at git1-us-west.apache.org, from userid 33) id 90EC7E2F58; Thu, 29 Jun 2017 21:24:52 +0000 (UTC) Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit From: mpercy@apache.org To: commits@kudu.apache.org Message-Id: X-Mailer: ASF-Git Admin Mailer Subject: kudu git commit: raft_consensus: failure detector doesn't need external synchronization Date: Thu, 29 Jun 2017 21:24:52 +0000 (UTC) archived-at: Thu, 29 Jun 2017 21:24:54 -0000 Repository: kudu Updated Branches: refs/heads/master bf8e1c7b8 -> feaa4e36c raft_consensus: failure detector doesn't need external synchronization While working with TimerFailureDetector I saw that it's thread-safe through and through. Thus, RaftConsensus doesn't need to externally synchronize it. So I renamed some functions and removed a few lock acquisition sites. AFAICT, the only reason not to do this is for one of the overloads of SnoozeFailureDetector() which calls LOG_WITH_PREFIX_UNLOCKED(). I don't think that's useful enough to warrant the locking, though. Change-Id: I637030c3ae2470aa0a8e1067ac6a9b69ea1793b4 Reviewed-on: http://gerrit.cloudera.org:8080/7329 Reviewed-by: Todd Lipcon Tested-by: Kudu Jenkins Reviewed-by: Mike Percy Project: http://git-wip-us.apache.org/repos/asf/kudu/repo Commit: http://git-wip-us.apache.org/repos/asf/kudu/commit/feaa4e36 Tree: http://git-wip-us.apache.org/repos/asf/kudu/tree/feaa4e36 Diff: http://git-wip-us.apache.org/repos/asf/kudu/diff/feaa4e36 Branch: refs/heads/master Commit: feaa4e36c1b4d72e27a015c4ef519dc1f71c7984 Parents: bf8e1c7 Author: Adar Dembo Authored: Wed Jun 28 14:59:40 2017 -0700 Committer: Mike Percy Committed: Thu Jun 29 21:24:38 2017 +0000 ---------------------------------------------------------------------- src/kudu/consensus/raft_consensus.cc | 52 ++++++++++++++----------------- src/kudu/consensus/raft_consensus.h | 12 +++---- 2 files changed, 29 insertions(+), 35 deletions(-) ---------------------------------------------------------------------- http://git-wip-us.apache.org/repos/asf/kudu/blob/feaa4e36/src/kudu/consensus/raft_consensus.cc ---------------------------------------------------------------------- diff --git a/src/kudu/consensus/raft_consensus.cc b/src/kudu/consensus/raft_consensus.cc index 49afc3c..c636ccf 100644 --- a/src/kudu/consensus/raft_consensus.cc +++ b/src/kudu/consensus/raft_consensus.cc @@ -328,7 +328,7 @@ Status RaftConsensus::Start(const ConsensusBootstrapInfo& info) { LockGuard l(lock_); RETURN_NOT_OK(CheckRunningUnlocked()); - RETURN_NOT_OK(EnsureFailureDetectorEnabledUnlocked()); + RETURN_NOT_OK(EnsureFailureDetectorEnabled()); // If this is the first term expire the FD immediately so that we have a fast first // election, otherwise we just let the timer expire normally. @@ -344,7 +344,7 @@ Status RaftConsensus::Start(const ConsensusBootstrapInfo& info) { LOG_WITH_PREFIX_UNLOCKED(INFO) << "Consensus starting up: Expiring failure detector timer " "to make a prompt election more likely"; } - RETURN_NOT_OK(ExpireFailureDetectorUnlocked()); + RETURN_NOT_OK(ExpireFailureDetector()); } // Now assume "follower" duties. @@ -432,7 +432,7 @@ Status RaftConsensus::StartElection(ElectionMode mode, ElectionReason reason) { return Status::OK(); } if (PREDICT_FALSE(active_role == RaftPeerPB::NON_PARTICIPANT)) { - RETURN_NOT_OK(SnoozeFailureDetectorUnlocked()); // Reduce election noise while in this state. + RETURN_NOT_OK(SnoozeFailureDetector()); // Reduce election noise while in this state. return Status::IllegalState("Not starting election: Node is currently " "a non-participant in the raft config", SecureShortDebugString(cmeta_->ActiveConfig())); @@ -444,10 +444,10 @@ Status RaftConsensus::StartElection(ElectionMode mode, ElectionReason reason) { // Snooze to avoid the election timer firing again as much as possible. // We do not disable the election timer while running an election, so that // if the election times out, we will try again. - RETURN_NOT_OK(EnsureFailureDetectorEnabledUnlocked()); + RETURN_NOT_OK(EnsureFailureDetectorEnabled()); MonoDelta timeout = LeaderElectionExpBackoffDeltaUnlocked(); - RETURN_NOT_OK(SnoozeFailureDetectorUnlocked(timeout, ALLOW_LOGGING)); + RETURN_NOT_OK(SnoozeFailureDetector(timeout, ALLOW_LOGGING)); // Increment the term and vote for ourselves, unless it's a pre-election. if (mode != PRE_ELECTION) { @@ -557,7 +557,7 @@ Status RaftConsensus::BecomeLeaderUnlocked() { LOG_WITH_PREFIX_UNLOCKED(INFO) << "Becoming Leader. State: " << ToStringUnlocked(); // Disable FD while we are leader. - RETURN_NOT_OK(EnsureFailureDetectorDisabledUnlocked()); + RETURN_NOT_OK(EnsureFailureDetectorDisabled()); // Don't vote for anyone if we're a leader. withhold_votes_until_ = MonoTime::Max(); @@ -592,7 +592,7 @@ Status RaftConsensus::BecomeReplicaUnlocked() { ClearLeaderUnlocked(); // FD should be running while we are a follower. - RETURN_NOT_OK(EnsureFailureDetectorEnabledUnlocked()); + RETURN_NOT_OK(EnsureFailureDetectorEnabled()); // Now that we're a replica, we can allow voting for other nodes. withhold_votes_until_ = MonoTime::Min(); @@ -1210,7 +1210,7 @@ Status RaftConsensus::UpdateReplica(const ConsensusRequestPB* request, // Snooze the failure detector as soon as we decide to accept the message. // We are guaranteed to be acting as a FOLLOWER at this point by the above // sanity check. - RETURN_NOT_OK(SnoozeFailureDetectorUnlocked()); + RETURN_NOT_OK(SnoozeFailureDetector()); // We update the lag metrics here in addition to after appending to the queue so the // metrics get updated even when the operation is rejected. @@ -1385,9 +1385,7 @@ Status RaftConsensus::UpdateReplica(const ConsensusRequestPB* request, // If just waiting for our log append to finish lets snooze the timer. // We don't want to fire leader election because we're waiting on our own log. if (s.IsTimedOut()) { - ThreadRestrictions::AssertWaitAllowed(); - LockGuard l(lock_); - RETURN_NOT_OK(SnoozeFailureDetectorUnlocked()); + RETURN_NOT_OK(SnoozeFailureDetector()); } } while (s.IsTimedOut()); RETURN_NOT_OK(s); @@ -1962,7 +1960,7 @@ Status RaftConsensus::RequestVoteRespondVoteGranted(const VoteRequestPB* request // persist our vote to disk. We use an exponential backoff to avoid too much // split-vote contention when nodes display high latencies. MonoDelta additional_backoff = LeaderElectionExpBackoffDeltaUnlocked(); - RETURN_NOT_OK(SnoozeFailureDetectorUnlocked(additional_backoff, ALLOW_LOGGING)); + RETURN_NOT_OK(SnoozeFailureDetector(additional_backoff, ALLOW_LOGGING)); if (!request->is_pre_election()) { // Persist our vote to disk. @@ -1973,7 +1971,7 @@ Status RaftConsensus::RequestVoteRespondVoteGranted(const VoteRequestPB* request // Give peer time to become leader. Snooze one more time after persisting our // vote. When disk latency is high, this should help reduce churn. - RETURN_NOT_OK(SnoozeFailureDetectorUnlocked(additional_backoff, DO_NOT_LOG)); + RETURN_NOT_OK(SnoozeFailureDetector(additional_backoff, DO_NOT_LOG)); LOG(INFO) << Substitute("$0: Granting yes vote for candidate $1 in term $2.", GetRequestVoteLogPrefixUnlocked(*request), @@ -2122,8 +2120,8 @@ void RaftConsensus::DoElectionCallback(ElectionReason reason, const ElectionResu // finish another one is triggered already. // We ignore the status as we don't want to fail if we the timer is // disabled. - ignore_result(SnoozeFailureDetectorUnlocked(LeaderElectionExpBackoffDeltaUnlocked(), - ALLOW_LOGGING)); + ignore_result(SnoozeFailureDetector(LeaderElectionExpBackoffDeltaUnlocked(), + ALLOW_LOGGING)); if (result.decision == VOTE_DENIED) { failed_elections_since_stable_leader_++; @@ -2357,8 +2355,7 @@ void RaftConsensus::CompleteConfigChangeRoundUnlocked(ConsensusRound* round, con } -Status RaftConsensus::EnsureFailureDetectorEnabledUnlocked() { - DCHECK(lock_.is_locked()); +Status RaftConsensus::EnsureFailureDetectorEnabled() { if (PREDICT_FALSE(!FLAGS_enable_leader_failure_detection)) { return Status::OK(); } @@ -2371,8 +2368,7 @@ Status RaftConsensus::EnsureFailureDetectorEnabledUnlocked() { Bind(&RaftConsensus::ReportFailureDetected, Unretained(this))); } -Status RaftConsensus::EnsureFailureDetectorDisabledUnlocked() { - DCHECK(lock_.is_locked()); +Status RaftConsensus::EnsureFailureDetectorDisabled() { if (PREDICT_FALSE(!FLAGS_enable_leader_failure_detection)) { return Status::OK(); } @@ -2383,8 +2379,7 @@ Status RaftConsensus::EnsureFailureDetectorDisabledUnlocked() { return failure_detector_->UnTrack(kTimerId); } -Status RaftConsensus::ExpireFailureDetectorUnlocked() { - DCHECK(lock_.is_locked()); +Status RaftConsensus::ExpireFailureDetector() { if (PREDICT_FALSE(!FLAGS_enable_leader_failure_detection)) { return Status::OK(); } @@ -2392,14 +2387,12 @@ Status RaftConsensus::ExpireFailureDetectorUnlocked() { return failure_detector_->MessageFrom(kTimerId, MonoTime::Min()); } -Status RaftConsensus::SnoozeFailureDetectorUnlocked() { - DCHECK(lock_.is_locked()); - return SnoozeFailureDetectorUnlocked(MonoDelta::FromMicroseconds(0), DO_NOT_LOG); +Status RaftConsensus::SnoozeFailureDetector() { + return SnoozeFailureDetector(MonoDelta::FromMicroseconds(0), DO_NOT_LOG); } -Status RaftConsensus::SnoozeFailureDetectorUnlocked(const MonoDelta& additional_delta, - AllowLogging allow_logging) { - DCHECK(lock_.is_locked()); +Status RaftConsensus::SnoozeFailureDetector(const MonoDelta& additional_delta, + AllowLogging allow_logging) { if (PREDICT_FALSE(!FLAGS_enable_leader_failure_detection)) { return Status::OK(); } @@ -2407,8 +2400,9 @@ Status RaftConsensus::SnoozeFailureDetectorUnlocked(const MonoDelta& additional_ MonoTime time = MonoTime::Now() + additional_delta; if (allow_logging == ALLOW_LOGGING) { - LOG_WITH_PREFIX_UNLOCKED(INFO) << "Snoozing failure detection for election timeout " - << "plus an additional " + additional_delta.ToString(); + LOG(INFO) << LogPrefixThreadSafe() + << "Snoozing failure detection for election timeout plus an additional " + << additional_delta.ToString(); } return failure_detector_->MessageFrom(kTimerId, time); http://git-wip-us.apache.org/repos/asf/kudu/blob/feaa4e36/src/kudu/consensus/raft_consensus.h ---------------------------------------------------------------------- diff --git a/src/kudu/consensus/raft_consensus.h b/src/kudu/consensus/raft_consensus.h index 2d3e725..9fe1042 100644 --- a/src/kudu/consensus/raft_consensus.h +++ b/src/kudu/consensus/raft_consensus.h @@ -524,30 +524,30 @@ class RaftConsensus : public RefCountedThreadSafe, // Start tracking the leader for failures. This typically occurs at startup // and when the local peer steps down as leader. // If the failure detector is already registered, has no effect. - Status EnsureFailureDetectorEnabledUnlocked(); + Status EnsureFailureDetectorEnabled(); // Untrack the current leader from failure detector. // This typically happens when the local peer becomes leader. // If the failure detector is already unregistered, has no effect. - Status EnsureFailureDetectorDisabledUnlocked(); + Status EnsureFailureDetectorDisabled(); // Set the failure detector to an "expired" state, so that the next time // the failure monitor runs it triggers an election. // This is primarily intended to be used at startup time. - Status ExpireFailureDetectorUnlocked(); + Status ExpireFailureDetector(); // "Reset" the failure detector to indicate leader activity. // The failure detector must currently be enabled. // When this is called a failure is guaranteed not to be detected // before 'FLAGS_leader_failure_max_missed_heartbeat_periods' * // 'FLAGS_raft_heartbeat_interval_ms' has elapsed. - Status SnoozeFailureDetectorUnlocked() WARN_UNUSED_RESULT; + Status SnoozeFailureDetector() WARN_UNUSED_RESULT; // Like the above but adds 'additional_delta' to the default timeout // period. If allow_logging is set to ALLOW_LOGGING, then this method // will print a log message when called. - Status SnoozeFailureDetectorUnlocked(const MonoDelta& additional_delta, - AllowLogging allow_logging) WARN_UNUSED_RESULT; + Status SnoozeFailureDetector(const MonoDelta& additional_delta, + AllowLogging allow_logging) WARN_UNUSED_RESULT; // Return the minimum election timeout. Due to backoff and random // jitter, election timeouts may be longer than this.