kudu-commits mailing list archives

From t...@apache.org
Subject [7/8] kudu git commit: KUDU-1582. Optimize budgeted compaction policy with an approximation
Date Wed, 07 Sep 2016 00:22:06 GMT
KUDU-1582. Optimize budgeted compaction policy with an approximation

On a server with ~5.5TB of data, I noticed that the maintenance manager
was spending tens of seconds computing optimal compactions, and most of
the time was in knapsack solving.

This patch implements a "first pass" solution to the compaction policy
using an approximation algorithm that is significantly faster than the
actual knapsack solver. The approximation algorithm gives a lower bound
solution which is at least 50% as good as the optimal solution (i.e. it
is a '2-approximation'). In practice, it is often optimal or nearly
optimal.

To test, I dumped the bounds and sizes of rowsets from a 200+GB YCSB
tablet on a real cluster that had been receiving inserts for a few days,
and wrote a test which imports that layout as mock rowsets in order to
run the policy.

The result (with the default 5% max approximation error) is between a 6x
and 20x speedup (depending on the budget size). The resulting loss in
solution quality is less than 1%.

Before:
  Time spent Computing compaction with 128MB budget: real 0.965s  user 0.964s  sys 0.000s
   quality=0.118006
  Time spent Computing compaction with 256MB budget: real 1.202s  user 1.204s  sys 0.000s
   quality=0.254289
  Time spent Computing compaction with 512MB budget: real 2.233s  user 2.232s  sys 0.000s
   quality=0.524391
  Time spent Computing compaction with 1024MB budget: real 4.202s  user 4.200s  sys 0.000s
   quality=1.06674

After:
  Time spent Computing compaction with 128MB budget: real 0.148s  user 0.148s  sys 0.000s
   quality=0.117985
  Time spent Computing compaction with 256MB budget: real 0.162s  user 0.160s  sys 0.000s
   quality=0.252118
  Time spent Computing compaction with 512MB budget: real 0.174s  user 0.176s  sys 0.000s
   quality=0.524391
  Time spent Computing compaction with 1024MB budget: real 0.206s  user 0.208s  sys 0.000s
   quality=1.06674

Change-Id: I4e611f161d66ddc47e97e3b5328bc1778a4ac958
Reviewed-on: http://gerrit.cloudera.org:8080/4153
Reviewed-by: David Ribeiro Alves <dralves@apache.org>
Tested-by: Kudu Jenkins


Project: http://git-wip-us.apache.org/repos/asf/kudu/repo
Commit: http://git-wip-us.apache.org/repos/asf/kudu/commit/c4e3ff67
Tree: http://git-wip-us.apache.org/repos/asf/kudu/tree/c4e3ff67
Diff: http://git-wip-us.apache.org/repos/asf/kudu/diff/c4e3ff67

Branch: refs/heads/master
Commit: c4e3ff67b14954f00570b65861fc9bdbd61e7999
Parents: 2f12d83
Author: Todd Lipcon <todd@apache.org>
Authored: Sun Aug 28 15:03:25 2016 -0700
Committer: Todd Lipcon <todd@apache.org>
Committed: Tue Sep 6 23:17:11 2016 +0000

----------------------------------------------------------------------
 src/kudu/tablet/CMakeLists.txt            |    9 +-
 src/kudu/tablet/compaction_policy-test.cc |   66 +
 src/kudu/tablet/compaction_policy.cc      |  295 +-
 src/kudu/tablet/compaction_policy.h       |   37 +-
 src/kudu/tablet/ycsb-test-rowsets.tsv     | 6836 ++++++++++++++++++++++++
 5 files changed, 7144 insertions(+), 99 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/kudu/blob/c4e3ff67/src/kudu/tablet/CMakeLists.txt
----------------------------------------------------------------------
diff --git a/src/kudu/tablet/CMakeLists.txt b/src/kudu/tablet/CMakeLists.txt
index c6baa67..402b2c8 100644
--- a/src/kudu/tablet/CMakeLists.txt
+++ b/src/kudu/tablet/CMakeLists.txt
@@ -87,7 +87,14 @@ set(KUDU_TEST_LINK_LIBS tablet ${KUDU_MIN_TEST_LIBS})
 ADD_KUDU_TEST(tablet-test)
 ADD_KUDU_TEST(tablet_metadata-test)
 ADD_KUDU_TEST(mt-tablet-test RUN_SERIAL true)
-ADD_KUDU_TEST(compaction_policy-test)
+
+# Copy data file needed for compaction_policy-test.
+execute_process(COMMAND ln -sf ${CMAKE_CURRENT_SOURCE_DIR}/ycsb-test-rowsets.tsv
+  ${EXECUTABLE_OUTPUT_PATH})
+ADD_KUDU_TEST(compaction_policy-test
+  # Can't use dist-test because it relies on a data file.
+  LABELS no_dist_test)
+
 ADD_KUDU_TEST(diskrowset-test)
 ADD_KUDU_TEST(mt-diskrowset-test RUN_SERIAL true)
 ADD_KUDU_TEST(memrowset-test)

http://git-wip-us.apache.org/repos/asf/kudu/blob/c4e3ff67/src/kudu/tablet/compaction_policy-test.cc
----------------------------------------------------------------------
diff --git a/src/kudu/tablet/compaction_policy-test.cc b/src/kudu/tablet/compaction_policy-test.cc
index 6a72cf7..9d74c60 100644
--- a/src/kudu/tablet/compaction_policy-test.cc
+++ b/src/kudu/tablet/compaction_policy-test.cc
@@ -15,10 +15,16 @@
 // specific language governing permissions and limitations
 // under the License.
 
+#include <glog/stl_logging.h>
 #include <gtest/gtest.h>
 #include <memory>
 #include <unordered_set>
+#include <string>
 
+#include "kudu/gutil/strings/numbers.h"
+#include "kudu/gutil/strings/split.h"
+#include "kudu/gutil/strings/substitute.h"
+#include "kudu/util/stopwatch.h"
 #include "kudu/util/test_util.h"
 #include "kudu/tablet/mock-rowsets.h"
 #include "kudu/tablet/rowset.h"
@@ -27,6 +33,8 @@
 
 using std::shared_ptr;
 using std::unordered_set;
+using std::string;
+using std::vector;
 
 namespace kudu {
 namespace tablet {
@@ -52,5 +60,63 @@ TEST(TestCompactionPolicy, TestBudgetedSelection) {
   ASSERT_GE(quality, 1.0);
 }
 
+// Return the directory of the currently-running executable.
+static string GetExecutableDir() {
+  string exec;
+  CHECK_OK(Env::Default()->GetExecutablePath(&exec));
+  return DirName(exec);
+}
+
+static RowSetVector LoadFile(const string& name) {
+  RowSetVector ret;
+  string path = JoinPathSegments(GetExecutableDir(), name);
+  faststring data;
+  CHECK_OK_PREPEND(ReadFileToString(Env::Default(), path, &data),
+                   strings::Substitute("Unable to load test data file $0", path));
+  vector<string> lines = strings::Split(data.ToString(), "\n");
+  for (const auto& line : lines) {
+    if (line.empty() || line[0] == '#') continue;
+    vector<string> fields = strings::Split(line, "\t");
+    CHECK_EQ(3, fields.size()) << "Expected 3 fields on line: " << line;
+    int size_mb = ParseLeadingInt32Value(fields[0], -1);
+    CHECK_GE(size_mb, 1) << "Expected size at least 1MB on line: " << line;
+    ret.emplace_back(new MockDiskRowSet(fields[1] /* min key */,
+                                        fields[2] /* max key */,
+                                        size_mb * 1024 * 1024));
+  }
+
+  return ret;
+}
+
+// Realistic test using data scraped from a tablet containing 200+GB of YCSB data.
+// This test can be used as a benchmark for optimizing the compaction policy,
+// and also serves as a basic regression/stress test using real data.
+TEST(TestCompactionPolicy, TestYcsbCompaction) {
+  RowSetVector vec = LoadFile("ycsb-test-rowsets.tsv");
+  RowSetTree tree;
+  ASSERT_OK(tree.Reset(vec));
+  vector<double> qualities;
+  for (int budget_mb : {128, 256, 512, 1024}) {
+    BudgetedCompactionPolicy policy(budget_mb);
+
+    unordered_set<RowSet*> picked;
+    double quality = 0;
+    LOG_TIMING(INFO, strings::Substitute("Computing compaction with $0MB budget", budget_mb)) {
+      ASSERT_OK(policy.PickRowSets(tree, &picked, &quality, nullptr));
+    }
+    LOG(INFO) << "quality=" << quality;
+    int total_size = 0;
+    for (const auto* rs : picked) {
+      total_size += rs->EstimateOnDiskSize() / 1024 / 1024;
+    }
+    ASSERT_LE(total_size, budget_mb);
+    qualities.push_back(quality);
+  }
+
+  // Given increasing budgets, our solutions should also be higher quality.
+  ASSERT_TRUE(std::is_sorted(qualities.begin(), qualities.end()))
+      << qualities;
+}
+
 } // namespace tablet
 } // namespace kudu

http://git-wip-us.apache.org/repos/asf/kudu/blob/c4e3ff67/src/kudu/tablet/compaction_policy.cc
----------------------------------------------------------------------
diff --git a/src/kudu/tablet/compaction_policy.cc b/src/kudu/tablet/compaction_policy.cc
index f478ac9..3836d29 100644
--- a/src/kudu/tablet/compaction_policy.cc
+++ b/src/kudu/tablet/compaction_policy.cc
@@ -43,6 +43,12 @@ DEFINE_int32(budgeted_compaction_target_rowset_size, 32*1024*1024,
 TAG_FLAG(budgeted_compaction_target_rowset_size, experimental);
 TAG_FLAG(budgeted_compaction_target_rowset_size, advanced);
 
+DEFINE_double(compaction_approximation_ratio, 1.05f,
+              "Approximation ratio allowed for optimal compaction calculation. A "
+              "value of 1.05 indicates that the policy may use an approximate result "
+              "if it is known to be within 5% of the optimal solution.");
+TAG_FLAG(compaction_approximation_ratio, experimental);
+
 namespace kudu {
 namespace tablet {
 
@@ -102,13 +108,13 @@ struct CompareByDescendingDensity {
 };
 
 struct KnapsackTraits {
-  typedef RowSetInfo item_type;
+  typedef const RowSetInfo* item_type;
   typedef double value_type;
-  static int get_weight(const RowSetInfo &item) {
-    return item.size_mb();
+  static int get_weight(const RowSetInfo* item) {
+    return item->size_mb();
   }
-  static value_type get_value(const RowSetInfo &item) {
-    return item.width();
+  static value_type get_value(const RowSetInfo* item) {
+    return item->width();
   }
 };
 
@@ -122,8 +128,8 @@ struct DerefCompare {
   }
 };
 
-// Incremental calculator for the upper bound on a knapsack solution,
-// given a set of items. The upper bound is computed by solving the
+// Incremental calculator for lower and upper bounds on a knapsack solution,
+// given a set of items. The bounds are computed by solving the
 // simpler "fractional knapsack problem" -- i.e. the related problem
 // in which each input may be fractionally put in the knapsack, instead
 // of all-or-nothing. The fractional knapsack problem has a very efficient
@@ -131,14 +137,17 @@ struct DerefCompare {
 // until the budget is reached. The last element to be chosen may be
 // partially included in the knapsack.
 //
+// This provides an upper bound (with the last element fractionally included)
+// and a lower bound (the same solution but without the fractional last element).
+//
 // Because this greedy solution only depends on sorting, it can be computed
 // incrementally as items are considered by maintaining a min-heap, ordered
 // by the density of the input elements. We need only maintain enough elements
 // to satisfy the budget, making this logarithmic in the budget and linear
 // in the number of elements added.
-class UpperBoundCalculator {
+class BoundCalculator {
  public:
-  explicit UpperBoundCalculator(int max_weight)
+  explicit BoundCalculator(int max_weight)
     : total_weight_(0),
       total_value_(0),
       max_weight_(max_weight),
@@ -169,18 +178,49 @@ class UpperBoundCalculator {
     topdensity_ = top->density();
   }
 
-  // Compute the upper-bound to the 0-1 knapsack problem with the elements
+  // Compute the lower and upper bounds to the 0-1 knapsack problem with the elements
   // added so far.
-  double ComputeUpperBound() const {
+  pair<double, double> ComputeLowerAndUpperBound() const {
     int excess_weight = total_weight_ - max_weight_;
     if (excess_weight <= 0) {
-      return total_value_;
+      // If we've added less than the budget, our "bounds" are just including
+      // all of the items.
+      return { total_value_, total_value_ };
     }
 
     const RowSetInfo& top = *fractional_solution_.front();
+
+    // The lower bound is either of:
+    // - the highest density N items such that they all fit, OR
+    // - the N+1th item, if it fits
+    // This is a 2-approximation (i.e. no worse than 1/2 of the best solution).
+    // See https://courses.engr.illinois.edu/cs598csc/sp2009/lectures/lecture_4.pdf
+    double lower_bound = std::max(total_value_ - top.width(), top.width());
     double fraction_of_top_to_remove = static_cast<double>(excess_weight) / top.size_mb();
     DCHECK_GT(fraction_of_top_to_remove, 0);
-    return total_value_ - fraction_of_top_to_remove * top.width();
+    double upper_bound = total_value_ - fraction_of_top_to_remove * top.width();
+    return {lower_bound, upper_bound};
+  }
+
+  // Return the items which make up the current lower-bound solution.
+  void GetLowerBoundSolution(vector<const RowSetInfo*>* solution) {
+    solution->clear();
+    int excess_weight = total_weight_ - max_weight_;
+    if (excess_weight <= 0) {
+      *solution = fractional_solution_;
+    } else {
+      const RowSetInfo* top = fractional_solution_.front();
+
+      // See above: there are two choices for the lower-bound estimate,
+      // and we need to return the one matching the bound we computed.
+      if (total_value_ - top->width() > top->width()) {
+        // The current solution less the top (minimum density) element.
+        solution->assign(fractional_solution_.begin() + 1,
+                         fractional_solution_.end());
+      } else {
+        solution->push_back(top);
+      }
+    }
   }
 
   void clear() {
@@ -202,64 +242,71 @@ class UpperBoundCalculator {
 
 } // anonymous namespace
 
-Status BudgetedCompactionPolicy::PickRowSets(const RowSetTree &tree,
-                                             unordered_set<RowSet*>* picked,
-                                             double* quality,
-                                             std::vector<std::string>* log) {
-  vector<RowSetInfo> asc_min_key, asc_max_key;
-  SetupKnapsackInput(tree, &asc_min_key, &asc_max_key);
-  if (asc_max_key.empty()) {
-    if (log) {
-      LOG_STRING(INFO, log) << "No rowsets to compact";
+void BudgetedCompactionPolicy::RunApproximation(
+    const vector<RowSetInfo>& asc_min_key,
+    const vector<RowSetInfo>& asc_max_key,
+    vector<double>* best_upper_bounds,
+    SolutionAndValue* best_solution) {
+  best_upper_bounds->clear();
+  best_upper_bounds->reserve(asc_min_key.size());
+  BoundCalculator bound_calc(size_budget_mb_);
+  for (const RowSetInfo& cc_a : asc_min_key) {
+    bound_calc.clear();
+    double ab_min = cc_a.cdf_min_key();
+    double ab_max = cc_a.cdf_max_key();
+    double best_upper = 0;
+    for (const RowSetInfo& cc_b : asc_max_key) {
+      if (cc_b.cdf_min_key() < ab_min) {
+        continue;
+      }
+      ab_max = std::max(cc_b.cdf_max_key(), ab_max);
+      double union_width = ab_max - ab_min;
+      bound_calc.Add(cc_b);
+      auto bounds = bound_calc.ComputeLowerAndUpperBound();
+      double lower = bounds.first - union_width * kSupportAdjust;
+      double upper = bounds.second - union_width * kSupportAdjust;
+      best_upper = std::max(upper, best_upper);
+      if (lower > best_solution->value) {
+        vector<const RowSetInfo*> approx_solution;
+        bound_calc.GetLowerBoundSolution(&approx_solution);
+        best_solution->rowsets.clear();
+        for (const auto* rsi : approx_solution) {
+          best_solution->rowsets.insert(rsi->rowset());
+        }
+        best_solution->value = lower;
+      }
     }
-    // nothing to compact.
-    return Status::OK();
+    best_upper_bounds->push_back(best_upper);
   }
+}
 
-  UpperBoundCalculator ub_calc(size_budget_mb_);
-  KnapsackSolver<KnapsackTraits> solver;
-
-  // The best set of rowsets chosen so far
-  unordered_set<RowSet *> best_chosen;
-  // The value attained by the 'best_chosen' solution.
-  double best_optimal = 0;
+void BudgetedCompactionPolicy::RunExact(
+    const vector<RowSetInfo>& asc_min_key,
+    const vector<RowSetInfo>& asc_max_key,
+    const vector<double>& best_upper_bounds,
+    SolutionAndValue* best_solution) {
 
-  vector<int> chosen_indexes;
-  vector<RowSetInfo> inrange_candidates;
+  KnapsackSolver<KnapsackTraits> solver;
+  vector<const RowSetInfo*> inrange_candidates;
   inrange_candidates.reserve(asc_min_key.size());
-  vector<double> upper_bounds;
+  for (int i = 0; i < asc_min_key.size(); i++) {
+    const RowSetInfo& cc_a = asc_min_key[i];
+    const double upper_bound = best_upper_bounds[i];
 
-  for (const RowSetInfo& cc_a : asc_min_key) {
-    chosen_indexes.clear();
-    inrange_candidates.clear();
-    ub_calc.clear();
-    upper_bounds.clear();
+    // 'upper_bound' is an upper bound on the solution value of any compaction that includes
+    // 'cc_a' as its left-most RowSet. If that bound is worse than the current best solution,
+    // then we can skip finding the precise value of this solution.
+    //
+    // Here we also build in the approximation ratio as slop: the upper bound doesn't need
+    // to just be better than the current solution, but needs to be better by at least
+    // the approximation ratio before we bother looking for it.
+    if (upper_bound < best_solution->value * FLAGS_compaction_approximation_ratio) {
+      continue;
+    }
 
+    inrange_candidates.clear();
     double ab_min = cc_a.cdf_min_key();
     double ab_max = cc_a.cdf_max_key();
-
-    // Collect all other candidates which would not expand the support to the
-    // left of this one. Because these are sorted by ascending max key, we can
-    // easily ensure that whenever we add a 'cc_b' to our candidate list for the
-    // knapsack problem, we've already included all rowsets which fall in the
-    // range from cc_a.min to cc_b.max.
-    //
-    // For example:
-    //
-    //  |-----A----|
-    //      |-----B----|
-    //         |----C----|
-    //       |--------D-------|
-    //
-    // We process in the order: A, B, C, D.
-    //
-    // This saves us from having to iterate through the list again to find all
-    // such rowsets.
-    //
-    // Additionally, each knapsack problem builds on the previous knapsack
-    // problem by adding just a single rowset, meaning that we can reuse the
-    // existing dynamic programming state to incrementally update the solution,
-    // rather than having to rebuild from scratch.
     for (const RowSetInfo& cc_b : asc_max_key) {
       if (cc_b.cdf_min_key() < ab_min) {
         // Would expand support to the left.
@@ -267,78 +314,134 @@ Status BudgetedCompactionPolicy::PickRowSets(const RowSetTree &tree,
         // cc_b with cdf_max_key() > cc_a.cdf_min_key()
         continue;
       }
-      inrange_candidates.push_back(cc_b);
-
-      // While we're iterating, also calculate the upper bound for the solution
-      // on the set within the [ab_min, ab_max] output range.
-      ab_max = std::max(cc_b.cdf_max_key(), ab_max);
-      double union_width = ab_max - ab_min;
-
-      ub_calc.Add(cc_b);
-      upper_bounds.push_back(ub_calc.ComputeUpperBound() - union_width * kSupportAdjust);
+      inrange_candidates.push_back(&cc_b);
     }
     if (inrange_candidates.empty()) continue;
-    // If the best upper bound across this whole range is worse than our current
-    // optimal, we can short circuit all the knapsack-solving.
-    if (*std::max_element(upper_bounds.begin(), upper_bounds.end()) < best_optimal) continue;
 
     solver.Reset(size_budget_mb_, &inrange_candidates);
-
     ab_max = cc_a.cdf_max_key();
 
-    int i = 0;
+    vector<int> chosen_indexes;
+    int j = 0;
     while (solver.ProcessNext()) {
-      // If this candidate's upper bound is worse than the optimal, we don't
-      // need to look at it.
-      const RowSetInfo& item = inrange_candidates[i];
-      double upper_bound = upper_bounds[i];
-      i++;
-      if (upper_bound < best_optimal) continue;
-
+      const RowSetInfo* item = inrange_candidates[j++];
       std::pair<int, double> best_with_this_item = solver.GetSolution();
       double best_value = best_with_this_item.second;
 
-      ab_max = std::max(item.cdf_max_key(), ab_max);
+      ab_max = std::max(item->cdf_max_key(), ab_max);
       DCHECK_GE(ab_max, ab_min);
       double solution = best_value - (ab_max - ab_min) * kSupportAdjust;
-      DCHECK_LE(solution, upper_bound + 0.0001);
-
-      if (solution > best_optimal) {
+      if (solution > best_solution->value) {
         solver.TracePath(best_with_this_item, &chosen_indexes);
-        best_optimal = solution;
+        best_solution->value = solution;
       }
     }
 
     // If we came up with a new solution, replace.
     if (!chosen_indexes.empty()) {
-      best_chosen.clear();
-      for (int i : chosen_indexes) {
-        best_chosen.insert(inrange_candidates[i].rowset());
+      best_solution->rowsets.clear();
+      for (int idx : chosen_indexes) {
+        best_solution->rowsets.insert(inrange_candidates[idx]->rowset());
       }
     }
   }
+}
+
+// See docs/design-docs/compaction-policy.md for an overview of the compaction
+// policy implemented in this function.
+Status BudgetedCompactionPolicy::PickRowSets(const RowSetTree &tree,
+                                             unordered_set<RowSet*>* picked,
+                                             double* quality,
+                                             std::vector<std::string>* log) {
+  vector<RowSetInfo> asc_min_key, asc_max_key;
+  SetupKnapsackInput(tree, &asc_min_key, &asc_max_key);
+  if (asc_max_key.empty()) {
+    if (log) {
+      LOG_STRING(INFO, log) << "No rowsets to compact";
+    }
+    // nothing to compact.
+    return Status::OK();
+  }
+
+  // The best set of rowsets chosen so far, and the value attained by that choice.
+  SolutionAndValue best_solution;
+
+  // The algorithm proceeds in two passes. The first is based on an approximation
+  // of the knapsack problem, and computes some upper and lower bounds. The second
+  // pass looks again over the input for any cases where the upper bound tells us
+  // there is a significant improvement potentially available, and only in those
+  // cases runs the actual knapsack solver.
+
+  // Both passes are structured as a loop over all pairs {cc_a, cc_b} such that
+  // cc_a.min_key() < cc_b.min_key(). Given any such pair, we know that a compaction
+  // including both of these pairs would result in an output rowset that spans
+  // the range [cc_a.min_key(), max(cc_a.max_key(), cc_b.max_key())]. This width
+  // is designated 'union_width' below.
+
+  // In particular, the order in which we consider 'cc_b' is such that, for the
+  // inner loop (a fixed 'cc_a'), the 'union_width' increases minimally to only
+  // include the new 'cc_b' and not also cover any other rowsets.
+  //
+  // For example:
+  //
+  //  |-----A----|
+  //      |-----B----|
+  //         |----C----|
+  //       |--------D-------|
+  //
+  // We process in the order: A, B, C, D.
+  //
+  // With this iteration order, each knapsack problem builds on the previous knapsack
+  // problem by adding just a single rowset, meaning that we can reuse the
+  // existing solution state to incrementally update the solution, rather than having
+  // to rebuild from scratch.
+
+
+  // Pass 1 (approximate)
+  // ------------------------------------------------------------
+  // First, run the whole algorithm using a fast approximation of the knapsack solver
+  // to come up with lower bounds and upper bounds. The approximation algorithm is a
+  // 2-approximation but in practice often yields results that are very close to optimal
+  // for the sort of problems we encounter in our use case.
+  //
+  // This pass calculates:
+  // 1) 'best_upper_bounds': for each 'cc_a', what's the upper bound for any solution
+  //     including this rowset. We use this later to avoid solving the knapsack for
+  //     cases where the upper bound is lower than our current best solution.
+  // 2) 'best_solution' and 'best_solution->value': the best approximate solution
+  //     found.
+  vector<double> best_upper_bounds;
+  RunApproximation(asc_min_key, asc_max_key, &best_upper_bounds, &best_solution);
+
+  // Pass 2 (precise)
+  // ------------------------------------------------------------
+  // Now that we've found an approximate solution and upper bounds, do another pass.
+  // In cases where the upper bound indicates we could do substantially better than
+  // our current best solution, we use the exact knapsack solver to find the improved
+  // solution.
+  RunExact(asc_min_key, asc_max_key, best_upper_bounds, &best_solution);
 
   // Log the input and output of the selection.
   if (VLOG_IS_ON(1) || log != nullptr) {
     LOG_STRING(INFO, log) << "Budgeted compaction selection:";
     for (RowSetInfo &cand : asc_min_key) {
       const char *checkbox = "[ ]";
-      if (ContainsKey(best_chosen, cand.rowset())) {
+      if (ContainsKey(best_solution.rowsets, cand.rowset())) {
         checkbox = "[x]";
       }
       LOG_STRING(INFO, log) << "  " << checkbox << " " << cand.ToString();
     }
-    LOG_STRING(INFO, log) << "Solution value: " << best_optimal;
+    LOG_STRING(INFO, log) << "Solution value: " << best_solution.value;
   }
 
-  *quality = best_optimal;
+  *quality = best_solution.value;
 
-  if (best_optimal <= 0) {
+  if (best_solution.value <= 0) {
     VLOG(1) << "Best compaction available makes things worse. Not compacting.";
     return Status::OK();
   }
 
-  picked->swap(best_chosen);
+  picked->swap(best_solution.rowsets);
   DumpCompactionSVG(asc_min_key, *picked);
 
   return Status::OK();

http://git-wip-us.apache.org/repos/asf/kudu/blob/c4e3ff67/src/kudu/tablet/compaction_policy.h
----------------------------------------------------------------------
diff --git a/src/kudu/tablet/compaction_policy.h b/src/kudu/tablet/compaction_policy.h
index 7f660ee..9b7fcb8 100644
--- a/src/kudu/tablet/compaction_policy.h
+++ b/src/kudu/tablet/compaction_policy.h
@@ -86,9 +86,42 @@ class BudgetedCompactionPolicy : public CompactionPolicy {
   virtual uint64_t target_rowset_size() const OVERRIDE;
 
  private:
+  struct SolutionAndValue {
+    std::unordered_set<RowSet*> rowsets;
+    double value = 0;
+  };
+
+  // Sets up the 'asc_min_key' and 'asc_max_key' vectors necessary
+  // for both the approximate and exact solutions below.
   void SetupKnapsackInput(const RowSetTree &tree,
-                          std::vector<RowSetInfo>* min_key,
-                          std::vector<RowSetInfo>* max_key);
+                          std::vector<RowSetInfo>* asc_min_key,
+                          std::vector<RowSetInfo>* asc_max_key);
+
+
+  // Runs the first pass approximate solution for the algorithm.
+  // Stores the best solution found in 'best_solution'.
+  //
+  // Sets best_upper_bounds[i] to the upper bound for any solution containing
+  // asc_min_key[i] as its left-most rowset.
+  void RunApproximation(
+      const std::vector<RowSetInfo>& asc_min_key,
+      const std::vector<RowSetInfo>& asc_max_key,
+      std::vector<double>* best_upper_bounds,
+      SolutionAndValue* best_solution);
+
+
+  // Runs the second pass of the algorithm.
+  //
+  // For each i in asc_min_key, first checks if best_upper_bounds[i] indicates that a
+  // solution containing asc_min_key[i] may be a better solution than the current
+  // 'best_solution' by at least the configured approximation ratio. If so, runs the full
+  // knapsack algorithm to determine the value of that solution and, if it is indeed
+  // better, replaces '*best_solution' with the new best solution.
+  void RunExact(
+      const std::vector<RowSetInfo>& asc_min_key,
+      const std::vector<RowSetInfo>& asc_max_key,
+      const std::vector<double>& best_upper_bounds,
+      SolutionAndValue* best_solution);
 
   size_t size_budget_mb_;
 };

