kudu-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From ale...@apache.org
Subject [2/3] kudu git commit: [tools] Don't count tombstoned replicas in `kudu remote_replica check`
Date Tue, 20 Nov 2018 17:05:50 GMT
[tools] Don't count tombstoned replicas in `kudu remote_replica check`

Counting them as not running made the tool pretty useless on any
realistic cluster, where it's expected to have some tombstones. For
example, it wasn't possible to use this tool to tell when all the tablet
replicas on a tablet server had finished bootstrapping.

I tested this against a server that previously failed the check because
it had about 6000/10000 replicas tombstoned in STOPPED and INITIALIZED
states. It now passes the check.

Change-Id: Ibbdde768cce5f1d27064b7276944a585b41a2f03
Reviewed-on: http://gerrit.cloudera.org:8080/11962
Reviewed-by: Andrew Wong <awong@cloudera.com>
Tested-by: Kudu Jenkins
Reviewed-by: Attila Bukor <abukor@apache.org>


Project: http://git-wip-us.apache.org/repos/asf/kudu/repo
Commit: http://git-wip-us.apache.org/repos/asf/kudu/commit/77c729d5
Tree: http://git-wip-us.apache.org/repos/asf/kudu/tree/77c729d5
Diff: http://git-wip-us.apache.org/repos/asf/kudu/diff/77c729d5

Branch: refs/heads/master
Commit: 77c729d58360d56de7b2bdab9304b867faee02a8
Parents: b8e6498
Author: Will Berkeley <wdberkeley@gmail.org>
Authored: Tue Nov 20 00:37:00 2018 -0800
Committer: Will Berkeley <wdberkeley@gmail.com>
Committed: Tue Nov 20 10:12:53 2018 +0000

----------------------------------------------------------------------
 src/kudu/tablet/metadata.proto               |  2 +-
 src/kudu/tools/tool_action_remote_replica.cc | 13 +++++++++----
 2 files changed, 10 insertions(+), 5 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/kudu/blob/77c729d5/src/kudu/tablet/metadata.proto
----------------------------------------------------------------------
diff --git a/src/kudu/tablet/metadata.proto b/src/kudu/tablet/metadata.proto
index 0c3110d..8905249 100644
--- a/src/kudu/tablet/metadata.proto
+++ b/src/kudu/tablet/metadata.proto
@@ -165,7 +165,7 @@ enum TabletStatePB {
   // state Peers are available for client RPCs.
   RUNNING = 1;
 
-  // The tablet failed to for some reason. TabletReplica::error() will return
+  // The tablet failed for some reason. TabletReplica::error() will return
   // the reason for the failure.
   FAILED = 2;
 

http://git-wip-us.apache.org/repos/asf/kudu/blob/77c729d5/src/kudu/tools/tool_action_remote_replica.cc
----------------------------------------------------------------------
diff --git a/src/kudu/tools/tool_action_remote_replica.cc b/src/kudu/tools/tool_action_remote_replica.cc
index a62d695..9ffb195 100644
--- a/src/kudu/tools/tool_action_remote_replica.cc
+++ b/src/kudu/tools/tool_action_remote_replica.cc
@@ -200,7 +200,9 @@ Status CheckReplicas(const RunnerContext& context) {
   bool all_running = true;
   for (const auto& r : replicas) {
     const TabletStatusPB& rs = r.tablet_status();
-    if (rs.state() != tablet::RUNNING) {
+    // It's ok if the tablet replica isn't running if it's tombstoned.
+    if (rs.state() != tablet::RUNNING &&
+        rs.tablet_data_state() != tablet::TABLET_DATA_TOMBSTONED) {
       cerr << "Tablet id: " << rs.tablet_id() << " is "
            << tablet::TabletStatePB_Name(rs.state()) << endl;
       all_running = false;
@@ -208,10 +210,10 @@ Status CheckReplicas(const RunnerContext& context) {
   }
 
   if (all_running) {
-    cout << "All tablets are running" << endl;
+    cout << "All tablet replicas are running" << endl;
     return Status::OK();
   }
-  return Status::IllegalState("Not all tablets are running");
+  return Status::IllegalState("Not all tablet replicas are running");
 }
 
 Status DeleteReplica(const RunnerContext& context) {
@@ -394,7 +396,10 @@ Status UnsafeChangeConfig(const RunnerContext& context) {
 unique_ptr<Mode> BuildRemoteReplicaMode() {
   unique_ptr<Action> check_replicas =
       ActionBuilder("check", &CheckReplicas)
-      .Description("Check if all tablet replicas on a Kudu Tablet Server are running")
+      .Description("Check if all tablet replicas on a Kudu tablet server are "
+                   "running. Tombstoned replica do not count as not running, "
+                   "because they are just records of the previous existence of "
+                   "a replica.")
       .AddRequiredParameter({ kTServerAddressArg, kTServerAddressDesc })
       .Build();
 


Mime
View raw message