kudu-commits mailing list archives

From ale...@apache.org
Subject [kudu] 02/02: [tools] KUDU-2688 rebalancer should not fix RF=2 violations
Date Tue, 05 Feb 2019 22:41:03 GMT
This is an automated email from the ASF dual-hosted git repository.

alexey pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/kudu.git

commit 2b9b579bbde4956f8bd4f4e2c3dd08a49b9b8bc0
Author: Alexey Serbin <alexey@apache.org>
AuthorDate: Tue Feb 5 13:52:58 2019 -0800

    [tools] KUDU-2688 rebalancer should not fix RF=2 violations
    
    This patch addresses KUDU-2688.  Prior to this patch, the LA rebalancing
    logic would suggest moves to reinstate the placement policy for various
    distributions of an RF=2 tablet's replicas.  As a result, the rebalancer
    looped indefinitely trying to implement the suggested replica movements.
    However, RF=2 is a special case: any replica distribution of a tablet
    with RF=2 is a violation of the placement policy constraints, and it's
    impossible to fix that, regardless of the number of locations
    in a Kudu cluster.
    
    With this patch, the rebalancer no longer tries to correct non-fixable
    violations of the placement policy for RF=2.
    
    Change-Id: I3afc7f3a228368e8646b46914ec7c31234a6aa85
    Reviewed-on: http://gerrit.cloudera.org:8080/12375
    Reviewed-by: Will Berkeley <wdberkeley@gmail.com>
    Tested-by: Kudu Jenkins
---
 src/kudu/tools/placement_policy_util-test.cc | 45 ++++++++++++++++++++++++++++
 src/kudu/tools/placement_policy_util.cc      | 12 ++++++++
 2 files changed, 57 insertions(+)

diff --git a/src/kudu/tools/placement_policy_util-test.cc b/src/kudu/tools/placement_policy_util-test.cc
index cccd611..6eca2fe 100644
--- a/src/kudu/tools/placement_policy_util-test.cc
+++ b/src/kudu/tools/placement_policy_util-test.cc
@@ -545,6 +545,51 @@ TEST_F(ClusterLocationTest, PlacementPolicyViolationsEvenRFEdgeCases)
{
       {}
     },
     {
+      // Three locations, RF=2.
+      {
+        { "T0", 2, { "t0", "t1", "t2", } },
+      },
+      {
+        { "L0", { "A", "B", } },
+        { "L1", { "C", "D", } },
+        { "L2", { "E", "F", } },
+      },
+      {
+        { "A", { "t0", } }, { "B", { "t0", } },
+        { "C", { "t1", } }, { "D", { "t2", } },
+        { "E", { "t1", } }, { "F", { "t2", } },
+      },
+      {
+        { "t0", "L0", 2 },
+        { "t1", "L2", 1 },
+        { "t2", "L2", 1 },
+      },
+      {}
+    },
+    {
+      // Four locations, RF=2.
+      {
+        { "T0", 2, { "t0", "t1", } },
+      },
+      {
+        { "L0", { "A", } },
+        { "L1", { "B", } },
+        { "L2", { "C", } },
+        { "L3", { "D", } },
+      },
+      {
+        { "A", { "t0", } },
+        { "B", { "t0", } },
+        { "C", { "t1", } },
+        { "D", { "t1", } },
+      },
+      {
+        { "t0", "L1", 1 },
+        { "t1", "L3", 1 },
+      },
+      {}
+    },
+    {
       // Two locations, RF=2 and RF=4.
       {
         { "T0", 2, { "t0", } },
diff --git a/src/kudu/tools/placement_policy_util.cc b/src/kudu/tools/placement_policy_util.cc
index f5ab790..48456f7 100644
--- a/src/kudu/tools/placement_policy_util.cc
+++ b/src/kudu/tools/placement_policy_util.cc
@@ -75,6 +75,18 @@ Status FindBestReplicaToReplace(
   const auto& table_id = FindOrDie(tablets_info.tablet_to_table_id, tablet_id);
   const auto& table_info = FindOrDie(tablets_info.tables_info, table_id);
 
+  // The replication factor of 2 is a special case: any possible replica
+  // distribution is a violation of placement policy, and it's impossible
+  // to fix it regardless of number of locations in the cluster.
+  if (table_info.replication_factor == 2) {
+    return Status::ConfigurationError(Substitute(
+        "tablet $0 (table name '$1'): replica distribution cannot conform "
+        "with the placement policy constraints since its replication "
+        "factor is $2",
+        tablet_id, table_info.name,
+        table_info.replication_factor));
+  }
+
   // There are a few edge cases which are most likely to occur, so let's have
   // a special error message for those. In these cases there are too few
   // locations relative to the replication factor, so it's impossible to find

