kudu-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From a...@apache.org
Subject [1/3] kudu git commit: Inlined dispatch for predicate evaluation
Date Thu, 08 Sep 2016 21:05:44 GMT
Repository: kudu
Updated Branches:
  refs/heads/master fdfcdba21 -> e33bac441


Inlined dispatch for predicate evaluation

In order to evaluate a predicate, the correct comparator must first be
determined. Batched evaluation, which calls
ColumnPredicate::Evaluate(), will make a function call to the correct
comparator for every row. The evaluation itself gets split into
batches, but each row in the batch still makes the function call,
which slows performance. To remediate this, this evaluation has been
templatized so there is only a single function call per batch.

Additionally, when decoder-level evaluation is enabled, rather than
occuring in batches, dispatch occurs for each cell at EvaluateCell,
which leads to poorer performance.  To remediate this, the dispatch
has been templatized in hopes that the dispatching and branching are
inlined.

This figure shows the performance of plain encoding for decoder-level
evaluation without this templating adjustment. The query selects a one
tenth the values out of a plain-encoded column containing 10M strings.
EvaluateCell (the Pushdown bar) in this implementation gets compiled
to a dispatched function call per row, which slows it down.  Evaluate
(the Normal Eval bar) also dispatches once per row.
https://raw.githubusercontent.com/anjuwong/kudu/695cbaa016a8e94f164105d84024ceaac4b62375/docs/images/SELECT_WHERE_EQUAL_without_inlining.png

Compare the above with the plot below, which is the result of the same
query, but with inlined dispatch. The comparator is known statically,
so calls to EvaluateCell will be inlined. Additionally, Evaluate only
dispatches once per batch.
https://raw.githubusercontent.com/anjuwong/kudu/695cbaa016a8e94f164105d84024ceaac4b62375/docs/images/SELECT_WHERE_EQUAL_with_inlining.png

Change-Id: Iccfac9bc899362b442337050795b5ca8c4101268
Reviewed-on: http://gerrit.cloudera.org:8080/4164
Tested-by: Kudu Jenkins
Reviewed-by: Todd Lipcon <todd@apache.org>


Project: http://git-wip-us.apache.org/repos/asf/kudu/repo
Commit: http://git-wip-us.apache.org/repos/asf/kudu/commit/ec80fdb3
Tree: http://git-wip-us.apache.org/repos/asf/kudu/tree/ec80fdb3
Diff: http://git-wip-us.apache.org/repos/asf/kudu/diff/ec80fdb3

Branch: refs/heads/master
Commit: ec80fdb37be44d380046a823b5e6d8e2241ec3da
Parents: fdfcdba
Author: Andrew Wong <andrew.wong@cloudera.com>
Authored: Fri Sep 2 15:11:41 2016 -0700
Committer: Todd Lipcon <todd@apache.org>
Committed: Thu Sep 8 20:46:42 2016 +0000

----------------------------------------------------------------------
 src/kudu/cfile/binary_plain_block.cc |  2 +-
 src/kudu/cfile/cfile_reader.cc       |  2 +-
 src/kudu/common/column_predicate.cc  | 62 ++++++++++++-------------------
 src/kudu/common/column_predicate.h   | 32 +++++++++++++++-
 4 files changed, 57 insertions(+), 41 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/kudu/blob/ec80fdb3/src/kudu/cfile/binary_plain_block.cc
----------------------------------------------------------------------
diff --git a/src/kudu/cfile/binary_plain_block.cc b/src/kudu/cfile/binary_plain_block.cc
index d1b4129..adef8c6 100644
--- a/src/kudu/cfile/binary_plain_block.cc
+++ b/src/kudu/cfile/binary_plain_block.cc
@@ -306,7 +306,7 @@ Status BinaryPlainBlockDecoder::CopyNextAndEval(size_t* n,
   return HandleBatch(n, dst, [&](size_t i, Slice elem, Slice* out, Arena* out_arena)
{
     if (!sel->TestBit(i)) {
       return;
-    } else if (ctx->pred()->EvaluateCell(static_cast<const void*>(&elem)))
{
+    } else if (ctx->pred()->EvaluateCell<BINARY>(static_cast<const void*>(&elem)))
{
       CHECK(out_arena->RelocateSlice(elem, out));
     } else {
       sel->ClearBit(i);

http://git-wip-us.apache.org/repos/asf/kudu/blob/ec80fdb3/src/kudu/cfile/cfile_reader.cc
----------------------------------------------------------------------
diff --git a/src/kudu/cfile/cfile_reader.cc b/src/kudu/cfile/cfile_reader.cc
index 420596b..67999f5 100644
--- a/src/kudu/cfile/cfile_reader.cc
+++ b/src/kudu/cfile/cfile_reader.cc
@@ -910,7 +910,7 @@ Status CFileIterator::Scan(ColumnMaterializationContext* ctx) {
     codewords_matching_pred_->SetAllFalse();
     for (size_t i = 0; i < nwords; i++) {
       Slice cur_string = dict_decoder_->string_at_index(i);
-      if (ctx->pred()->EvaluateCell(static_cast<const void *>(&cur_string)))
{
+      if (ctx->pred()->EvaluateCell<BINARY>(static_cast<const void *>(&cur_string)))
{
         BitmapSet(codewords_matching_pred_->mutable_bitmap(), i);
       }
     }

http://git-wip-us.apache.org/repos/asf/kudu/blob/ec80fdb3/src/kudu/common/column_predicate.cc
----------------------------------------------------------------------
diff --git a/src/kudu/common/column_predicate.cc b/src/kudu/common/column_predicate.cc
index ab8ef90..5dfedbe 100644
--- a/src/kudu/common/column_predicate.cc
+++ b/src/kudu/common/column_predicate.cc
@@ -265,40 +265,30 @@ void ApplyPredicate(const ColumnBlock& block, SelectionVector* sel,
P p) {
 }
 } // anonymous namespace
 
-void ColumnPredicate::Evaluate(const ColumnBlock& block, SelectionVector* sel) const
{
-  CHECK_NOTNULL(sel);
-  // The type-specific predicate is provided as a function template to
-  // ApplyPredicate in the hope that they are inlined.
-  //
-  // TODO: In the future we can improve this by also providing the type info as a
-  // template, so that the type-specific data comparisons can be inlined.
-  //
-  // Going a step further we could do runtime codegen to inline the
-  // lower/upper/equality bounds.
-
-  // TODO: equality predicates should use the bloomfilter if it's available.
-
+template <DataType PhysicalType>
+void ColumnPredicate::EvaluateForPhysicalType(const ColumnBlock& block,
+                                              SelectionVector* sel) const {
   switch (predicate_type()) {
     case PredicateType::Range: {
       if (lower_ == nullptr) {
         ApplyPredicate(block, sel, [this] (const void* cell) {
-          return column_.type_info()->Compare(cell, this->upper_) < 0;
+          return DataTypeTraits<PhysicalType>::Compare(cell, this->upper_) <
0;
         });
       } else if (upper_ == nullptr) {
         ApplyPredicate(block, sel, [this] (const void* cell) {
-          return column_.type_info()->Compare(cell, this->lower_) >= 0;
+          return DataTypeTraits<PhysicalType>::Compare(cell, this->lower_) >=
0;
         });
       } else {
         ApplyPredicate(block, sel, [this] (const void* cell) {
-          return column_.type_info()->Compare(cell, this->upper_) < 0 &&
-                 column_.type_info()->Compare(cell, this->lower_) >= 0;
+          return DataTypeTraits<PhysicalType>::Compare(cell, this->upper_) <
0 &&
+                 DataTypeTraits<PhysicalType>::Compare(cell, this->lower_) >=
0;
         });
       }
       return;
     };
     case PredicateType::Equality: {
       ApplyPredicate(block, sel, [this] (const void* cell) {
-        return column_.type_info()->Compare(cell, this->lower_) == 0;
+        return DataTypeTraits<PhysicalType>::Compare(cell, this->lower_) == 0;
       });
       return;
     };
@@ -318,26 +308,22 @@ void ColumnPredicate::Evaluate(const ColumnBlock& block, SelectionVector*
sel) c
   }
 }
 
-bool ColumnPredicate::EvaluateCell(const void* cell) const {
-  switch (predicate_type()) {
-    case PredicateType::Range: {
-      if (lower_ == nullptr) {
-        return column_.type_info()->Compare(cell, this->upper_) < 0;
-      } else if (upper_ == nullptr) {
-        return column_.type_info()->Compare(cell, this->lower_) >= 0;
-      } else {
-        return column_.type_info()->Compare(cell, this->upper_) < 0 &&
-               column_.type_info()->Compare(cell, this->lower_) >= 0;
-      }
-    };
-    case PredicateType::Equality: {
-      return column_.type_info()->Compare(cell, this->lower_) == 0;
-    };
-    case PredicateType::IsNotNull: {
-      return true;
-    };
-    default:
-      LOG(FATAL) << "unknown predicate type";
+void ColumnPredicate::Evaluate(const ColumnBlock& block, SelectionVector* sel) const
{
+  DCHECK(sel);
+  switch (block.type_info()->physical_type()) {
+    case BOOL: return EvaluateForPhysicalType<BOOL>(block, sel);
+    case INT8: return EvaluateForPhysicalType<INT8>(block, sel);
+    case INT16: return EvaluateForPhysicalType<INT16>(block, sel);
+    case INT32: return EvaluateForPhysicalType<INT32>(block, sel);
+    case INT64: return EvaluateForPhysicalType<INT64>(block, sel);
+    case UINT8: return EvaluateForPhysicalType<UINT8>(block, sel);
+    case UINT16: return EvaluateForPhysicalType<UINT16>(block, sel);
+    case UINT32: return EvaluateForPhysicalType<UINT32>(block, sel);
+    case UINT64: return EvaluateForPhysicalType<UINT64>(block, sel);
+    case FLOAT: return EvaluateForPhysicalType<FLOAT>(block, sel);
+    case DOUBLE: return EvaluateForPhysicalType<DOUBLE>(block, sel);
+    case BINARY: return EvaluateForPhysicalType<BINARY>(block, sel);
+    default: LOG(FATAL) << "unknown physical type: " << block.type_info()->physical_type();
   }
 }
 

http://git-wip-us.apache.org/repos/asf/kudu/blob/ec80fdb3/src/kudu/common/column_predicate.h
----------------------------------------------------------------------
diff --git a/src/kudu/common/column_predicate.h b/src/kudu/common/column_predicate.h
index 80b77c8..9a7a880 100644
--- a/src/kudu/common/column_predicate.h
+++ b/src/kudu/common/column_predicate.h
@@ -139,7 +139,31 @@ class ColumnPredicate {
   void Evaluate(const ColumnBlock& block, SelectionVector* sel) const;
 
   // Evaluate the predicate on a single cell.
-  bool EvaluateCell(const void *cell) const;
+  template <DataType PhysicalType>
+  bool EvaluateCell(const void* cell) const {
+    switch (predicate_type()) {
+      case PredicateType::None: {
+        return false;
+      };
+      case PredicateType::Range: {
+        if (lower_ == nullptr) {
+          return DataTypeTraits<PhysicalType>::Compare(cell, this->upper_) <
0;
+        } else if (upper_ == nullptr) {
+          return DataTypeTraits<PhysicalType>::Compare(cell, this->lower_) >=
0;
+        } else {
+          return DataTypeTraits<PhysicalType>::Compare(cell, this->upper_) <
0 &&
+                 DataTypeTraits<PhysicalType>::Compare(cell, this->lower_) >=
0;
+        }
+      };
+      case PredicateType::Equality: {
+        return DataTypeTraits<PhysicalType>::Compare(cell, this->lower_) == 0;
+      };
+      case PredicateType::IsNotNull: {
+        return true;
+      }
+    }
+    LOG(FATAL) << "unknown predicate type";
+  }
 
   // Print the predicate for debugging.
   std::string ToString() const;
@@ -190,6 +214,12 @@ class ColumnPredicate {
   // Merge another predicate into this Equality predicate.
   void MergeIntoEquality(const ColumnPredicate& other);
 
+  // Templated evaluation to inline the dispatch of comparator. Templating this
+  // allows dispatch to occur only once per batch.
+  template <DataType PhysicalType>
+  void EvaluateForPhysicalType(const ColumnBlock& block,
+                               SelectionVector* sel) const;
+
   // The type of this predicate.
   PredicateType predicate_type_;
 


Mime
View raw message