kudu-commits mailing list archives

From granthe...@apache.org
Subject [2/5] kudu git commit: [tools] Add locate row tool
Date Tue, 16 Oct 2018 19:56:26 GMT
[tools] Add locate row tool

While debugging, I often find it frustrating that there is no easy way
to tell which tablet a particular row belongs to. This basic tool
provides a way to find out: it accepts a primary key as a JSON array and
prints the tablet id of the corresponding tablet, or an error if there
is no such tablet.

For example, with a table created like

CREATE TABLE test (
  key0 STRING NOT NULL,
  key1 INT32 NOT NULL,
  PRIMARY KEY(key0, key1)
)

an invocation of the tool looks like

$ kudu table locate_row localhost:7053 test "[\"foo\", 2]"

The choice of a JSON array is a compromise between CSV, which is easiest
and fastest to type in the common case, and a verbose JSON object like
{ "key0" : "foo", "key1" : 2 }, which is most explicit but long and
difficult to type on the command line. A JSON array has the benefit of
having well-defined escaping rules and formatting while being almost as
easy to type out as CSV.
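
To make the escaping argument concrete, here is a minimal sketch of how a
script might render a key as a JSON array before passing it to the tool.
KeyCell, EscapeJsonString, and ToJsonArray are hypothetical names invented
for this illustration; they are not part of Kudu or of this commit.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>
#include <variant>
#include <vector>

// Hypothetical type for this sketch: one primary key cell, either a string
// or an integer.
using KeyCell = std::variant<std::string, int64_t>;

// Quote a string and escape the characters JSON requires escaping:
// double quote, backslash, and control characters below 0x20.
std::string EscapeJsonString(const std::string& in) {
  std::string out = "\"";
  for (char c : in) {
    const unsigned char uc = static_cast<unsigned char>(c);
    if (c == '"') {
      out += "\\\"";
    } else if (c == '\\') {
      out += "\\\\";
    } else if (uc < 0x20) {
      char buf[7];  // "\uXXXX" plus the terminating NUL
      std::snprintf(buf, sizeof(buf), "\\u%04x", static_cast<unsigned>(uc));
      out += buf;
    } else {
      out += c;
    }
  }
  out += '"';
  return out;
}

// Render the key cells, in primary key column order, as a JSON array.
std::string ToJsonArray(const std::vector<KeyCell>& key) {
  std::string out = "[";
  for (size_t i = 0; i < key.size(); ++i) {
    if (i > 0) out += ",";
    if (const auto* s = std::get_if<std::string>(&key[i])) {
      out += EscapeJsonString(*s);
    } else {
      out += std::to_string(std::get<int64_t>(key[i]));
    }
  }
  out += "]";
  return out;
}
```

For the example table, the cells "foo" and 2 render as ["foo",2], which is
the same array as in the invocation above without the optional space.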

A note about tests: it's difficult to verify the answer of the tool
independently. However, the implementation just builds scan tokens for
a scan with equality predicates on every key column, so the
tablet-finding logic should be well-exercised by lots of other client
tests. Consequently, the tests focus on error cases and sanity checks.
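
As a rough mental model of what the sanity checks assert, the toy sketch
below maps a fully specified key to at most one tablet and reports nothing
for a key in a non-covered range. It is an illustration only: Tablet,
ToyHashBucket, MakeToyTablets, and LocateRowToy are made-up names, the hash
is a stand-in, and none of this is how Kudu actually partitions rows.

```cpp
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Toy model: a tablet covers one hash bucket and one half-open range
// [lower, upper) of the range column.
struct Tablet {
  std::string id;
  int hash_bucket;
  int32_t lower;  // inclusive
  int32_t upper;  // exclusive
};

// Stand-in hash (sum of bytes modulo bucket count). Kudu's real hash
// function is different; this only needs to be deterministic.
int ToyHashBucket(const std::string& s, int num_buckets) {
  unsigned sum = 0;
  for (unsigned char c : s) sum += c;
  return static_cast<int>(sum % static_cast<unsigned>(num_buckets));
}

// Two hash buckets crossed with ranges [0, 1) and [2, 3), mirroring the
// shape of the table in the second test. Value 1 is not covered.
std::vector<Tablet> MakeToyTablets() {
  return {
      {"b0-r0", 0, 0, 1},
      {"b0-r1", 0, 2, 3},
      {"b1-r0", 1, 0, 1},
      {"b1-r1", 1, 2, 3},
  };
}

// Returns the id of the single tablet containing the key, or nullopt if
// the key falls in a non-covered range.
std::optional<std::string> LocateRowToy(const std::vector<Tablet>& tablets,
                                        int num_buckets,
                                        const std::string& key_hash,
                                        int32_t key_range) {
  const int bucket = ToyHashBucket(key_hash, num_buckets);
  for (const auto& t : tablets) {
    if (t.hash_bucket == bucket && key_range >= t.lower &&
        key_range < t.upper) {
      return t.id;
    }
  }
  return std::nullopt;
}
```

In this model, keys ("foo", 0) and ("foo", 2) land in different tablets
and ("foo", 1) lands in none, which is the behavior the sanity checks look
for in the real tool.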

Change-Id: Idcdcf10bfe6b9df686e86b7134e8634fc0efaac3
Reviewed-on: http://gerrit.cloudera.org:8080/11666
Reviewed-by: Alexey Serbin <aserbin@cloudera.com>
Tested-by: Kudu Jenkins


Project: http://git-wip-us.apache.org/repos/asf/kudu/repo
Commit: http://git-wip-us.apache.org/repos/asf/kudu/commit/fa5a0db5
Tree: http://git-wip-us.apache.org/repos/asf/kudu/tree/fa5a0db5
Diff: http://git-wip-us.apache.org/repos/asf/kudu/diff/fa5a0db5

Branch: refs/heads/master
Commit: fa5a0db5a21c0f9d6554857ce6641b95b670f25e
Parents: e3b1e05
Author: Will Berkeley <wdberkeley@gmail.com>
Authored: Fri Oct 12 01:20:12 2018 -0700
Committer: Will Berkeley <wdberkeley@gmail.com>
Committed: Tue Oct 16 19:39:06 2018 +0000

----------------------------------------------------------------------
 src/kudu/tools/kudu-admin-test.cc   | 264 +++++++++++++++++++++++++++++++
 src/kudu/tools/tool_action_table.cc | 128 ++++++++++++++-
 2 files changed, 391 insertions(+), 1 deletion(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/kudu/blob/fa5a0db5/src/kudu/tools/kudu-admin-test.cc
----------------------------------------------------------------------
diff --git a/src/kudu/tools/kudu-admin-test.cc b/src/kudu/tools/kudu-admin-test.cc
index 046e8db..edb0c4d 100644
--- a/src/kudu/tools/kudu-admin-test.cc
+++ b/src/kudu/tools/kudu-admin-test.cc
@@ -21,6 +21,7 @@
 #include <cstdio>
 #include <deque>
 #include <iterator>
+#include <limits>
 #include <memory>
 #include <ostream>
 #include <string>
@@ -48,6 +49,7 @@
 #include "kudu/gutil/gscoped_ptr.h"
 #include "kudu/gutil/map-util.h"
 #include "kudu/gutil/strings/split.h"
+#include "kudu/gutil/strings/strip.h"
 #include "kudu/gutil/strings/substitute.h"
 #include "kudu/integration-tests/cluster_itest_util.h"
 #include "kudu/integration-tests/cluster_verifier.h"
@@ -1478,5 +1480,267 @@ TEST_F(AdminCliTest, TestDescribeTable) {
       ")\n"
       "REPLICAS 1");
 }
+
+TEST_F(AdminCliTest, TestLocateRow) {
+  FLAGS_num_tablet_servers = 1;
+  FLAGS_num_replicas = 1;
+
+  NO_FATALS(BuildAndStart());
+
+  // Test an OK case. Not much going on here since the table has only one
+  // tablet, which covers the whole universe.
+  string stdout, stderr;
+  Status s = RunKuduTool({
+    "table",
+    "locate_row",
+    cluster_->master()->bound_rpc_addr().ToString(),
+    kTableId,
+    "[-1]"
+  }, &stdout, &stderr);
+  ASSERT_TRUE(s.ok()) << ToolRunInfo(s, stdout, stderr);
+
+  // Grab list of tablet_ids from any tserver and check the output.
+  vector<TServerDetails*> tservers;
+  vector<string> tablet_ids;
+  AppendValuesFromMap(tablet_servers_, &tservers);
+  ListRunningTabletIds(tservers.front(),
+                       MonoDelta::FromSeconds(30),
+                       &tablet_ids);
+  ASSERT_EQ(1, tablet_ids.size());
+  ASSERT_STR_CONTAINS(stdout, tablet_ids[0]);
+
+  // Test a couple of error cases.
+  // String instead of int.
+  stdout.clear();
+  stderr.clear();
+  s = RunKuduTool({
+    "table",
+    "locate_row",
+    cluster_->master()->bound_rpc_addr().ToString(),
+    kTableId,
+    "[\"foo\"]"
+  }, &stdout, &stderr);
+  ASSERT_TRUE(s.IsRuntimeError());
+  ASSERT_STR_CONTAINS(stderr, "unable to parse");
+
+  // Float instead of int.
+  stdout.clear();
+  stderr.clear();
+  s = RunKuduTool({
+    "table",
+    "locate_row",
+    cluster_->master()->bound_rpc_addr().ToString(),
+    kTableId,
+    "[1.2]"
+  }, &stdout, &stderr);
+  ASSERT_TRUE(s.IsRuntimeError());
+  ASSERT_STR_CONTAINS(stderr, "unable to parse");
+
+  // Overflow (recall the key is INT32).
+  stdout.clear();
+  stderr.clear();
+  s = RunKuduTool({
+    "table",
+    "locate_row",
+    cluster_->master()->bound_rpc_addr().ToString(),
+    kTableId,
+    Substitute("[$0]", std::to_string(std::numeric_limits<int64_t>::max()))
+  }, &stdout, &stderr);
+  ASSERT_TRUE(s.IsRuntimeError());
+  ASSERT_STR_CONTAINS(stderr, "out of range");
+}
+
+TEST_F(AdminCliTest, TestLocateRowMore) {
+  FLAGS_num_tablet_servers = 1;
+  FLAGS_num_replicas = 1;
+
+  NO_FATALS(BuildAndStart());
+
+  // Make a complex schema with multiple columns in the primary key, hash and
+  // range partitioning, and non-covered ranges.
+  const string kAnotherTableId = "TestAnotherTable";
+  KuduSchema schema;
+
+  // Build the schema.
+  KuduSchemaBuilder builder;
+  builder.AddColumn("key_hash")->Type(KuduColumnSchema::STRING)->NotNull();
+  builder.AddColumn("key_range")->Type(KuduColumnSchema::INT32)->NotNull();
+  builder.SetPrimaryKey({ "key_hash", "key_range" });
+  ASSERT_OK(builder.Build(&schema));
+
+  // Set up partitioning and create the table.
+  unique_ptr<KuduPartialRow> lower_bound0(schema.NewRow());
+  ASSERT_OK(lower_bound0->SetInt32("key_range", 0));
+  unique_ptr<KuduPartialRow> upper_bound0(schema.NewRow());
+  ASSERT_OK(upper_bound0->SetInt32("key_range", 1));
+  unique_ptr<KuduPartialRow> lower_bound1(schema.NewRow());
+  ASSERT_OK(lower_bound1->SetInt32("key_range", 2));
+  unique_ptr<KuduPartialRow> upper_bound1(schema.NewRow());
+  ASSERT_OK(upper_bound1->SetInt32("key_range", 3));
+  unique_ptr<KuduTableCreator> table_creator(client_->NewTableCreator());
+  ASSERT_OK(table_creator->table_name(kAnotherTableId)
+           .schema(&schema)
+           .add_hash_partitions({ "key_hash" }, 2)
+           .set_range_partition_columns({ "key_range" })
+           .add_range_partition(lower_bound0.release(), upper_bound0.release())
+           .add_range_partition(lower_bound1.release(), upper_bound1.release())
+           .num_replicas(FLAGS_num_replicas)
+           .Create());
+
+  vector<TServerDetails*> tservers;
+  vector<string> tablet_ids;
+  AppendValuesFromMap(tablet_servers_, &tservers);
+  ListRunningTabletIds(tservers.front(),
+                       MonoDelta::FromSeconds(30),
+                       &tablet_ids);
+  std::unordered_set<string> tablet_id_set(tablet_ids.begin(), tablet_ids.end());
+
+  // Since there isn't a great alternative way to validate the answer the tool
+  // gives, and the scan token code underlying the implementation is extensively
+  // tested, we won't overexert ourselves checking correctness, and instead just
+  // do sanity checks and tool usability checks.
+  string stdout, stderr;
+  Status s = RunKuduTool({
+    "table",
+    "locate_row",
+    cluster_->master()->bound_rpc_addr().ToString(),
+    kAnotherTableId,
+    "[\"foo\",0]"
+  }, &stdout, &stderr);
+  ASSERT_TRUE(s.ok()) << ToolRunInfo(s, stdout, stderr);
+  StripWhiteSpace(&stdout);
+  const auto tablet_id_for_0 = stdout;
+  ASSERT_TRUE(ContainsKey(tablet_id_set, tablet_id_for_0))
+      << "expected to find tablet id " << tablet_id_for_0;
+
+  // A row in a different range partition should be in a different tablet.
+  stdout.clear();
+  stderr.clear();
+  s = RunKuduTool({
+    "table",
+    "locate_row",
+    cluster_->master()->bound_rpc_addr().ToString(),
+    kAnotherTableId,
+    "[\"foo\",2]"
+  }, &stdout, &stderr);
+  ASSERT_TRUE(s.ok()) << ToolRunInfo(s, stdout, stderr);
+  StripWhiteSpace(&stdout);
+  const auto tablet_id_for_2 = stdout;
+  ASSERT_TRUE(ContainsKey(tablet_id_set, tablet_id_for_2))
+      << "expected to find tablet id " << tablet_id_for_2;
+  ASSERT_NE(tablet_id_for_0, tablet_id_for_2);
+
+  // Test locating a row lying in a non-covered range.
+  stdout.clear();
+  stderr.clear();
+  s = RunKuduTool({
+    "table",
+    "locate_row",
+    cluster_->master()->bound_rpc_addr().ToString(),
+    kAnotherTableId,
+    "[\"foo\",1]"
+  }, &stdout, &stderr);
+  ASSERT_TRUE(s.IsRuntimeError()) << ToolRunInfo(s, stdout, stderr);
+  ASSERT_STR_CONTAINS(stderr, "row does not belong to any currently existing tablet");
+
+  // Test providing a missing or incomplete primary key.
+  stdout.clear();
+  stderr.clear();
+  s = RunKuduTool({
+    "table",
+    "locate_row",
+    cluster_->master()->bound_rpc_addr().ToString(),
+    kAnotherTableId,
+    "[]"
+  }, &stdout, &stderr);
+  ASSERT_TRUE(s.IsRuntimeError()) << ToolRunInfo(s, stdout, stderr);
+  ASSERT_STR_CONTAINS(
+      stderr,
+      "wrong number of key columns specified: expected 2 but received 0");
+
+  stdout.clear();
+  stderr.clear();
+  s = RunKuduTool({
+    "table",
+    "locate_row",
+    cluster_->master()->bound_rpc_addr().ToString(),
+    kAnotherTableId,
+    "[\"foo\"]"
+  }, &stdout, &stderr);
+  ASSERT_TRUE(s.IsRuntimeError()) << ToolRunInfo(s, stdout, stderr);
+  ASSERT_STR_CONTAINS(
+      stderr,
+      "wrong number of key columns specified: expected 2 but received 1");
+
+  stdout.clear();
+  stderr.clear();
+  s = RunKuduTool({
+    "table",
+    "locate_row",
+    cluster_->master()->bound_rpc_addr().ToString(),
+    kAnotherTableId,
+    "[\"foo\",]"
+  }, &stdout, &stderr);
+  ASSERT_TRUE(s.IsRuntimeError()) << ToolRunInfo(s, stdout, stderr);
+  ASSERT_STR_CONTAINS(
+      stderr,
+      "JSON text is corrupt");
+
+  // Test providing too many key column values.
+  stdout.clear();
+  stderr.clear();
+  s = RunKuduTool({
+    "table",
+    "locate_row",
+    cluster_->master()->bound_rpc_addr().ToString(),
+    kAnotherTableId,
+    "[\"foo\",2,\"bar\"]"
+  }, &stdout, &stderr);
+  ASSERT_TRUE(s.IsRuntimeError()) << ToolRunInfo(s, stdout, stderr);
+  ASSERT_STR_CONTAINS(
+      stderr,
+      "wrong number of key columns specified: expected 2 but received 3");
+
+  // Test providing an invalid value for a key column when there are multiple
+  // key columns.
+  stdout.clear();
+  stderr.clear();
+  s = RunKuduTool({
+    "table",
+    "locate_row",
+    cluster_->master()->bound_rpc_addr().ToString(),
+    kAnotherTableId,
+    "[\"foo\",\"bar\"]"
+  }, &stdout, &stderr);
+  ASSERT_TRUE(s.IsRuntimeError()) << ToolRunInfo(s, stdout, stderr);
+  ASSERT_STR_CONTAINS(stderr, "unable to parse");
+
+  // Test providing bad json.
+  stdout.clear();
+  stderr.clear();
+  s = RunKuduTool({
+    "table",
+    "locate_row",
+    cluster_->master()->bound_rpc_addr().ToString(),
+    kAnotherTableId,
+    "["
+  }, &stdout, &stderr);
+  ASSERT_TRUE(s.IsRuntimeError()) << ToolRunInfo(s, stdout, stderr);
+  ASSERT_STR_CONTAINS(stderr, "JSON text is corrupt");
+
+  // Test providing valid JSON that's not an array.
+  stdout.clear();
+  stderr.clear();
+  s = RunKuduTool({
+    "table",
+    "locate_row",
+    cluster_->master()->bound_rpc_addr().ToString(),
+    kAnotherTableId,
+    "{ \"key_hash\" : \"foo\", \"key_range\" : 2 }"
+  }, &stdout, &stderr);
+  ASSERT_TRUE(s.IsRuntimeError()) << ToolRunInfo(s, stdout, stderr);
+  ASSERT_STR_CONTAINS(stderr,
+                      "Wrong type during field extraction: expected object array");
+}
 } // namespace tools
 } // namespace kudu

http://git-wip-us.apache.org/repos/asf/kudu/blob/fa5a0db5/src/kudu/tools/tool_action_table.cc
----------------------------------------------------------------------
diff --git a/src/kudu/tools/tool_action_table.cc b/src/kudu/tools/tool_action_table.cc
index 33d6eb7..30ce6a6 100644
--- a/src/kudu/tools/tool_action_table.cc
+++ b/src/kudu/tools/tool_action_table.cc
@@ -25,12 +25,15 @@
 
 #include <gflags/gflags.h>
 #include <gflags/gflags_declare.h>
+#include <rapidjson/document.h>
 
 #include "kudu/client/client-test-util.h"
 #include "kudu/client/client.h"
 #include "kudu/client/replica_controller-internal.h"
+#include "kudu/client/scan_predicate.h"
 #include "kudu/client/schema.h"
 #include "kudu/client/shared_ptr.h"
+#include "kudu/client/value.h"
 #include "kudu/common/partition.h"
 #include "kudu/common/schema.h"
 #include "kudu/gutil/map-util.h"
@@ -39,6 +42,7 @@
 #include "kudu/gutil/strings/substitute.h"
 #include "kudu/tools/tool_action.h"
 #include "kudu/tools/tool_action_common.h"
+#include "kudu/util/jsonreader.h"
 #include "kudu/util/status.h"
 
 DECLARE_string(tables);
@@ -53,6 +57,8 @@ namespace tools {
 
 using client::KuduClient;
 using client::KuduClientBuilder;
+using client::KuduColumnSchema;
+using client::KuduPredicate;
 using client::KuduScanToken;
 using client::KuduScanTokenBuilder;
 using client::KuduTable;
@@ -64,6 +70,7 @@ using std::string;
 using std::unique_ptr;
 using std::vector;
 using strings::Split;
+using strings::Substitute;
 
 // This class only exists so that ListTables() can easily be friended by
 // KuduReplica, KuduReplica::Data, and KuduClientBuilder.
@@ -98,7 +105,7 @@ class TableLister {
         for (const auto* replica : token->tablet().replicas()) {
           const bool is_voter = ReplicaController::is_voter(*replica);
           const bool is_leader = replica->is_leader();
-          cout << strings::Substitute("    $0 $1 $2:$3",
+          cout << Substitute("    $0 $1 $2:$3",
               is_leader ? "L" : (is_voter ? "V" : "N"), replica->ts().uuid(),
               replica->ts().hostname(), replica->ts().port()) << endl;
         }
@@ -116,6 +123,7 @@ const char* const kTableNameArg = "table_name";
 const char* const kNewTableNameArg = "new_table_name";
 const char* const kColumnNameArg = "column_name";
 const char* const kNewColumnNameArg = "new_column_name";
+const char* const kKeyArg = "primary_key";
 
 Status CreateKuduClient(const RunnerContext& context,
                         client::sp::shared_ptr<KuduClient>* client) {
@@ -176,6 +184,107 @@ Status DescribeTable(const RunnerContext& context) {
   return Status::OK();
 }
 
+Status LocateRow(const RunnerContext& context) {
+  client::sp::shared_ptr<KuduClient> client;
+  RETURN_NOT_OK(CreateKuduClient(context, &client));
+
+  const string& table_name = FindOrDie(context.required_args, kTableNameArg);
+  client::sp::shared_ptr<KuduTable> table;
+  RETURN_NOT_OK(client->OpenTable(table_name, &table));
+
+  // Create an equality predicate for each primary key column.
+  const string& row_str = FindOrDie(context.required_args, kKeyArg);
+  JsonReader reader(row_str);
+  RETURN_NOT_OK(reader.Init());
+  vector<const rapidjson::Value*> values;
+  RETURN_NOT_OK(reader.ExtractObjectArray(reader.root(), nullptr, &values));
+
+  const auto& schema = table->schema();
+  vector<int> key_indexes;
+  schema.GetPrimaryKeyColumnIndexes(&key_indexes);
+  if (values.size() != key_indexes.size()) {
+    return Status::InvalidArgument(
+        Substitute("wrong number of key columns specified: expected $0 but received $1",
+                   key_indexes.size(),
+                   values.size()));
+  }
+
+  vector<unique_ptr<KuduPredicate>> predicates;
+  for (int i = 0; i < values.size(); i++) {
+    const auto key_index = key_indexes[i];
+    const auto& column = schema.Column(key_index);
+    const auto& col_name = column.name();
+    const auto type = column.type();
+    switch (type) {
+      case KuduColumnSchema::INT8:
+      case KuduColumnSchema::INT16:
+      case KuduColumnSchema::INT32:
+      case KuduColumnSchema::INT64:
+      case KuduColumnSchema::UNIXTIME_MICROS: {
+        int64_t value;
+        RETURN_NOT_OK_PREPEND(
+            reader.ExtractInt64(values[i], nullptr, &value),
+            Substitute("unable to parse value for column '$0' of type $1",
+                       col_name,
+                       KuduColumnSchema::DataTypeToString(type)));
+        predicates.emplace_back(
+            table->NewComparisonPredicate(col_name,
+                                          client::KuduPredicate::EQUAL,
+                                          client::KuduValue::FromInt(value)));
+        break;
+      }
+      case KuduColumnSchema::BINARY:
+      case KuduColumnSchema::STRING: {
+        string value;
+        RETURN_NOT_OK_PREPEND(
+            reader.ExtractString(values[i], nullptr, &value),
+            Substitute("unable to parse value for column '$0' of type $1",
+                       col_name,
+                       KuduColumnSchema::DataTypeToString(type)));
+        predicates.emplace_back(
+            table->NewComparisonPredicate(col_name,
+                                          client::KuduPredicate::EQUAL,
+                                          client::KuduValue::CopyString(value)));
+        break;
+      }
+      case KuduColumnSchema::DECIMAL:
+        return Status::NotSupported(
+            Substitute("unsupported type $0 for key column '$1': "
+                       "$0 key columns are not supported by this tool",
+                       KuduColumnSchema::DataTypeToString(type),
+                       col_name));
+      default:
+        return Status::NotSupported(
+            Substitute("unsupported type $0 for key column '$1': "
+                       "is this tool out of date?",
+                       KuduColumnSchema::DataTypeToString(type),
+                       col_name));
+    }
+  }
+
+  // Find the tablet by constructing scan tokens for a scan with equality
+  // predicates on all key columns. At most one tablet will match, so there
+  // will be at most one token, and we can report the id of its tablet.
+  vector<KuduScanToken*> tokens;
+  ElementDeleter deleter(&tokens);
+  KuduScanTokenBuilder builder(table.get());
+  for (auto& predicate : predicates) {
+    RETURN_NOT_OK(builder.AddConjunctPredicate(predicate.release()));
+  }
+  RETURN_NOT_OK(builder.Build(&tokens));
+  if (tokens.empty()) {
+    // Must be in a non-covered range partition.
+    return Status::NotFound("row does not belong to any currently existing tablet");
+  }
+  if (tokens.size() > 1) {
+    // This should be impossible.
+    return Status::IllegalState(
+        "all primary key columns specified but more than one matching tablet?");
+  }
+  cout << tokens[0]->tablet().id() << endl;
+  return Status::OK();
+}
+
 Status RenameTable(const RunnerContext& context) {
   const string& table_name = FindOrDie(context.required_args, kTableNameArg);
   const string& new_table_name = FindOrDie(context.required_args, kNewTableNameArg);
@@ -232,6 +341,22 @@ unique_ptr<Mode> BuildTableMode() {
       .AddOptionalParameter("list_tablets")
       .Build();
 
+  unique_ptr<Action> locate_row =
+      ActionBuilder("locate_row", &LocateRow)
+      .Description("Locate which tablet a row belongs to")
+      .ExtraDescription("Provide the primary key as a JSON array of primary "
+                        "key values, e.g. '[1, \"foo\", 2, \"bar\"]'. The "
+                        "output will be the tablet id associated with the row "
+                        "key. If there is no such tablet, an error message "
+                        "will be printed and the command will return a "
+                        "non-zero status")
+      .AddRequiredParameter({ kMasterAddressesArg, kMasterAddressesArgDesc })
+      .AddRequiredParameter({ kTableNameArg, "Name of the table to look up against" })
+      .AddRequiredParameter({ kKeyArg,
+                              "String representation of the row's primary key "
+                              "as a JSON array" })
+      .Build();
+
   unique_ptr<Action> rename_column =
       ActionBuilder("rename_column", &RenameColumn)
       .Description("Rename a column")
@@ -255,6 +380,7 @@ unique_ptr<Mode> BuildTableMode() {
       .AddAction(std::move(delete_table))
       .AddAction(std::move(describe_table))
       .AddAction(std::move(list_tables))
+      .AddAction(std::move(locate_row))
       .AddAction(std::move(rename_column))
       .AddAction(std::move(rename_table))
       .Build();

