impala-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From ab...@apache.org
Subject [3/5] incubator-impala git commit: IMPALA-1430, IMPALA-4108: codegen all builtin aggregate functions
Date Wed, 09 Nov 2016 17:03:22 GMT
IMPALA-1430,IMPALA-4108: codegen all builtin aggregate functions

This change enables codegen for all builtin aggregate functions,
e.g. timestamp functions and group_concat.

There are several parts to the change:
* Adding support for generic UDAs. Previous the codegen code did not
  handle multiple input arguments or NULL return values.
* Defaulting to using the UDA interface when there is not a special
  codegen path (we have implementations of all builtin aggregate
  functions for the interpreted path).
* Remove all the logic to disable codegen for the special cases that now
  are supported.

Also fix the generation of code to get/set NULL bits since I needed
to add functionality there anyway.

Testing:
Add tests that check that codegen was enabled for builtin aggregate
functions. Also fix some gaps in the preexisting tests.

Also add tests for UDAs that check input/output nulls are handled
correctly, in anticipation of enabling codegen for arbitrary UDAs.
The tests are run with both codegen enabled and disabled. To avoid
flaky tests, we switch the UDF tests to use "unique_database".

Perf:
Ran local TPC-H and targeted perf. Spent a lot of time on TPC-H Q1,
since my original approach regressed it ~5%. In the end the problem was
to do with the ordering of loads/stores to the slot and null bit in the
generated code: the previous version of the code exploited some
properties of the particular aggregate function. I ended up replicating
this behaviour to avoid regressing perf.

Change-Id: Id9dc21d1d676505d3617e1e4f37557397c4fb260
Reviewed-on: http://gerrit.cloudera.org:8080/4655
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Internal Jenkins


Project: http://git-wip-us.apache.org/repos/asf/incubator-impala/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-impala/commit/d7246d64
Tree: http://git-wip-us.apache.org/repos/asf/incubator-impala/tree/d7246d64
Diff: http://git-wip-us.apache.org/repos/asf/incubator-impala/diff/d7246d64

Branch: refs/heads/master
Commit: d7246d64c777384f29fe0f824ee0036b58e8aa2d
Parents: 6775893
Author: Tim Armstrong <tarmstrong@cloudera.com>
Authored: Wed Sep 28 13:04:07 2016 -0700
Committer: Internal Jenkins <cloudera-hudson@gerrit.cloudera.org>
Committed: Wed Nov 9 03:27:12 2016 +0000

----------------------------------------------------------------------
 be/src/benchmarks/hash-benchmark.cc             |   2 +-
 be/src/codegen/codegen-anyval.cc                |  63 +-
 be/src/codegen/codegen-anyval.h                 |  55 +-
 be/src/codegen/gen_ir_descriptions.py           |   4 +-
 be/src/codegen/llvm-codegen-test.cc             |  16 +-
 be/src/codegen/llvm-codegen.cc                  |  19 +-
 be/src/codegen/llvm-codegen.h                   |  29 +-
 be/src/exec/aggregation-node.cc                 |  43 +-
 be/src/exec/exec-node.cc                        |   2 +-
 be/src/exec/hash-join-node.cc                   |  12 +-
 be/src/exec/hash-table.cc                       |  42 +-
 be/src/exec/hdfs-avro-scanner.cc                |  23 +-
 be/src/exec/hdfs-scanner.cc                     |  15 +-
 be/src/exec/old-hash-table.cc                   |  10 +-
 be/src/exec/partitioned-aggregation-node-ir.cc  |   2 +-
 be/src/exec/partitioned-aggregation-node.cc     | 611 ++++++++++---------
 be/src/exec/partitioned-aggregation-node.h      |  25 +-
 be/src/exec/partitioned-hash-join-node.cc       |  12 +-
 be/src/exec/text-converter.cc                   |  12 +-
 be/src/exprs/aggregate-functions-ir.cc          |  24 +-
 be/src/exprs/case-expr.cc                       |   2 +-
 be/src/exprs/compound-predicates.cc             |   7 +-
 be/src/exprs/expr-codegen-test.cc               |   2 +-
 be/src/exprs/expr.cc                            |   4 +-
 be/src/exprs/literal.cc                         |   4 +-
 be/src/exprs/null-literal.cc                    |   2 +-
 be/src/exprs/scalar-fn-call.cc                  |  12 +-
 be/src/exprs/slot-ref.cc                        |  48 +-
 be/src/runtime/descriptors.cc                   | 132 ++--
 be/src/runtime/descriptors.h                    |  40 +-
 be/src/runtime/tuple.cc                         |   5 +-
 be/src/runtime/types.h                          |  19 +-
 be/src/testutil/test-udas.cc                    |  36 ++
 be/src/testutil/test-udfs.cc                    |   6 +-
 be/src/util/tuple-row-compare.cc                |   2 +-
 .../queries/QueryTest/java-udf.test             | 104 ++--
 .../QueryTest/libs_with_same_filenames.test     |  11 +-
 .../queries/QueryTest/load-java-udfs.test       |  79 +--
 .../queries/QueryTest/uda-mem-limit.test        |   3 -
 .../functional-query/queries/QueryTest/uda.test |  28 +
 .../queries/QueryTest/udf-codegen-required.test |  10 +
 .../queries/QueryTest/udf-errors.test           |  37 +-
 .../queries/QueryTest/udf-mem-limit.test        |   5 -
 .../functional-query/queries/QueryTest/udf.test |   7 -
 tests/common/test_result_verifier.py            |  22 +
 tests/query_test/test_aggregation.py            | 124 +++-
 tests/query_test/test_udfs.py                   |  83 ++-
 47 files changed, 1046 insertions(+), 809 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/d7246d64/be/src/benchmarks/hash-benchmark.cc
----------------------------------------------------------------------
diff --git a/be/src/benchmarks/hash-benchmark.cc b/be/src/benchmarks/hash-benchmark.cc
index 5ab7f5c..4f6d7f1 100644
--- a/be/src/benchmarks/hash-benchmark.cc
+++ b/be/src/benchmarks/hash-benchmark.cc
@@ -343,7 +343,7 @@ Function* CodegenCrcHash(LlvmCodeGen* codegen, bool mixed) {
   prototype.AddArgument(
       LlvmCodeGen::NamedVariable("results", codegen->GetPtrType(TYPE_INT)));
 
-  LlvmCodeGen::LlvmBuilder builder(codegen->context());
+  LlvmBuilder builder(codegen->context());
   Value* args[3];
   Function* fn = prototype.GeneratePrototype(&builder, &args[0]);
 

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/d7246d64/be/src/codegen/codegen-anyval.cc
----------------------------------------------------------------------
diff --git a/be/src/codegen/codegen-anyval.cc b/be/src/codegen/codegen-anyval.cc
index a1ca6d9..bd27409 100644
--- a/be/src/codegen/codegen-anyval.cc
+++ b/be/src/codegen/codegen-anyval.cc
@@ -120,8 +120,7 @@ Type* CodegenAnyVal::GetUnloweredPtrType(LlvmCodeGen* cg, const ColumnType& type
   return GetUnloweredType(cg, type)->getPointerTo();
 }
 
-Value* CodegenAnyVal::CreateCall(
-    LlvmCodeGen* cg, LlvmCodeGen::LlvmBuilder* builder, Function* fn,
+Value* CodegenAnyVal::CreateCall(LlvmCodeGen* cg, LlvmBuilder* builder, Function* fn,
     ArrayRef<Value*> args, const char* name, Value* result_ptr) {
   if (fn->getReturnType()->isVoidTy()) {
     // Void return type indicates that this function returns a DecimalVal via the first
@@ -152,20 +151,15 @@ Value* CodegenAnyVal::CreateCall(
   }
 }
 
-CodegenAnyVal CodegenAnyVal::CreateCallWrapped(
-    LlvmCodeGen* cg, LlvmCodeGen::LlvmBuilder* builder, const ColumnType& type,
-    Function* fn, ArrayRef<Value*> args, const char* name) {
+CodegenAnyVal CodegenAnyVal::CreateCallWrapped(LlvmCodeGen* cg, LlvmBuilder* builder,
+    const ColumnType& type, Function* fn, ArrayRef<Value*> args, const char* name) {
   Value* v = CreateCall(cg, builder, fn, args, name);
   return CodegenAnyVal(cg, builder, type, v, name);
 }
 
-CodegenAnyVal::CodegenAnyVal(LlvmCodeGen* codegen, LlvmCodeGen::LlvmBuilder* builder,
-                             const ColumnType& type, Value* value, const char* name)
-  : type_(type),
-    value_(value),
-    name_(name),
-    codegen_(codegen),
-    builder_(builder) {
+CodegenAnyVal::CodegenAnyVal(LlvmCodeGen* codegen, LlvmBuilder* builder,
+    const ColumnType& type, Value* value, const char* name)
+  : type_(type), value_(value), name_(name), codegen_(codegen), builder_(builder) {
   Type* value_type = GetLoweredType(codegen, type);
   if (value_ == NULL) {
     // No Value* was specified, so allocate one on the stack and load it.
@@ -175,7 +169,7 @@ CodegenAnyVal::CodegenAnyVal(LlvmCodeGen* codegen, LlvmCodeGen::LlvmBuilder* bui
   DCHECK_EQ(value_->getType(), value_type);
 }
 
-Value* CodegenAnyVal::GetIsNull(const char* name) {
+Value* CodegenAnyVal::GetIsNull(const char* name) const {
   switch (type_.type) {
     case TYPE_BIGINT:
     case TYPE_DOUBLE: {
@@ -448,23 +442,28 @@ void CodegenAnyVal::SetDate(Value* date) {
   value_ = builder_->CreateInsertValue(value_, v, 0, name_);
 }
 
-Value* CodegenAnyVal::GetUnloweredPtr() {
-  Value* value_ptr = codegen_->CreateEntryBlockAlloca(*builder_, value_->getType());
-  builder_->CreateStore(value_, value_ptr);
-  return builder_->CreateBitCast(value_ptr, GetUnloweredPtrType(codegen_, type_));
+Value* CodegenAnyVal::GetLoweredPtr(const string& name) const {
+  Value* lowered_ptr =
+      codegen_->CreateEntryBlockAlloca(*builder_, value_->getType(), name.c_str());
+  builder_->CreateStore(GetLoweredValue(), lowered_ptr);
+  return lowered_ptr;
 }
 
-void CodegenAnyVal::SetFromRawPtr(Value* raw_ptr) {
-  Value* val_ptr =
-      builder_->CreateBitCast(raw_ptr, codegen_->GetPtrType(type_), "val_ptr");
-  Value* val = builder_->CreateLoad(val_ptr);
-  SetFromRawValue(val);
+Value* CodegenAnyVal::GetUnloweredPtr(const string& name) const {
+  // Get an unlowered pointer by creating a lowered pointer then bitcasting it.
+  // TODO: if the original value was unlowered, this generates roundabout code that
+  // lowers the value and casts it back. Generally LLVM's optimiser can reason
+  // about what's going on and undo our shenanigans to generate sane code, but it
+  // would be nice to just emit reasonable code in the first place.
+  return builder_->CreateBitCast(
+      GetLoweredPtr(), GetUnloweredPtrType(codegen_, type_), name);
 }
 
 void CodegenAnyVal::SetFromRawValue(Value* raw_val) {
   DCHECK_EQ(raw_val->getType(), codegen_->GetType(type_))
-      << endl << LlvmCodeGen::Print(raw_val)
-      << endl << type_ << " => " << LlvmCodeGen::Print(codegen_->GetType(type_));
+      << endl
+      << LlvmCodeGen::Print(raw_val) << endl
+      << type_ << " => " << LlvmCodeGen::Print(codegen_->GetType(type_));
   switch (type_.type) {
     case TYPE_STRING:
     case TYPE_VARCHAR: {
@@ -614,9 +613,7 @@ void CodegenAnyVal::WriteToSlot(const SlotDescriptor& slot_desc, Value* tuple,
 
   // Null block: set null bit
   builder_->SetInsertPoint(null_block);
-  Function* set_null_fn = slot_desc.GetUpdateNullFn(codegen_, true);
-  DCHECK(set_null_fn != NULL);
-  builder_->CreateCall(set_null_fn, tuple);
+  slot_desc.CodegenSetNullIndicator(codegen_, builder_, tuple, codegen_->true_value());
   builder_->CreateBr(insert_before);
 
   // Leave builder_ after conditional blocks
@@ -704,6 +701,14 @@ Value* CodegenAnyVal::Compare(CodegenAnyVal* other, const char* name) {
   return builder_->CreateCall(compare_fn, args, name);
 }
 
+void CodegenAnyVal::CodegenBranchIfNull(LlvmBuilder* builder, BasicBlock* null_block) {
+  Value* is_null = GetIsNull();
+  BasicBlock* not_null_block = BasicBlock::Create(
+      codegen_->context(), "not_null", builder->GetInsertBlock()->getParent());
+  builder->CreateCondBr(is_null, null_block, not_null_block);
+  builder->SetInsertPoint(not_null_block);
+}
+
 Value* CodegenAnyVal::GetHighBits(int num_bits, Value* v, const char* name) {
   DCHECK_EQ(v->getType()->getIntegerBitWidth(), num_bits * 2);
   Value* shifted = builder_->CreateAShr(v, num_bits);
@@ -762,8 +767,8 @@ Value* CodegenAnyVal::GetNullVal(LlvmCodeGen* codegen, Type* val_type) {
   return ConstantInt::get(val_type, 1);
 }
 
-CodegenAnyVal CodegenAnyVal::GetNonNullVal(LlvmCodeGen* codegen,
-    LlvmCodeGen::LlvmBuilder* builder, const ColumnType& type, const char* name) {
+CodegenAnyVal CodegenAnyVal::GetNonNullVal(LlvmCodeGen* codegen, LlvmBuilder* builder,
+    const ColumnType& type, const char* name) {
   Type* val_type = GetLoweredType(codegen, type);
   // All zeros => 'is_null' = false
   Value* value = Constant::getNullValue(val_type);

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/d7246d64/be/src/codegen/codegen-anyval.h
----------------------------------------------------------------------
diff --git a/be/src/codegen/codegen-anyval.h b/be/src/codegen/codegen-anyval.h
index 290ca03..c07f3eb 100644
--- a/be/src/codegen/codegen-anyval.h
+++ b/be/src/codegen/codegen-anyval.h
@@ -51,6 +51,8 @@ namespace impala {
 /// TYPE_DOUBLE/DoubleVal: { i8, double }
 /// TYPE_STRING/StringVal: { i64, i8* }
 /// TYPE_TIMESTAMP/TimestampVal: { i64, i64 }
+/// TYPE_DECIMAL/DecimalVal (isn't lowered):
+/// %"struct.impala_udf::DecimalVal" { {i8}, [15 x i8], {i128} }
 //
 /// TODO:
 /// - unit tests
@@ -78,14 +80,14 @@ class CodegenAnyVal {
   /// result will not be a pointer in this case).
   //
   /// 'name' optionally specifies the name of the returned value.
-  static llvm::Value* CreateCall(LlvmCodeGen* cg, LlvmCodeGen::LlvmBuilder* builder,
+  static llvm::Value* CreateCall(LlvmCodeGen* cg, LlvmBuilder* builder,
       llvm::Function* fn, llvm::ArrayRef<llvm::Value*> args, const char* name = "",
       llvm::Value* result_ptr = NULL);
 
   /// Same as above but wraps the result in a CodegenAnyVal.
-  static CodegenAnyVal CreateCallWrapped(LlvmCodeGen* cg,
-      LlvmCodeGen::LlvmBuilder* builder, const ColumnType& type, llvm::Function* fn,
-      llvm::ArrayRef<llvm::Value*> args, const char* name = "");
+  static CodegenAnyVal CreateCallWrapped(LlvmCodeGen* cg, LlvmBuilder* builder,
+      const ColumnType& type, llvm::Function* fn, llvm::ArrayRef<llvm::Value*> args,
+      const char* name = "");
 
   /// Returns the lowered AnyVal type associated with 'type'.
   /// E.g.: TYPE_BOOLEAN (which corresponds to a BooleanVal) => i16
@@ -116,8 +118,8 @@ class CodegenAnyVal {
   /// E.g.: TYPE_DOUBLE (lowered DoubleVal: { i8, double }) => { 0, 0 }
   /// This returns a CodegenAnyVal, rather than the unwrapped Value*, because the actual
   /// value still needs to be set.
-  static CodegenAnyVal GetNonNullVal(LlvmCodeGen* codegen,
-      LlvmCodeGen::LlvmBuilder* builder, const ColumnType& type, const char* name = "");
+  static CodegenAnyVal GetNonNullVal(LlvmCodeGen* codegen, LlvmBuilder* builder,
+      const ColumnType& type, const char* name = "");
 
   /// Creates a wrapper around a lowered *Val value.
   //
@@ -132,14 +134,14 @@ class CodegenAnyVal {
   /// must be of the correct lowered type.
   //
   /// If 'name' is specified, it will be used when generated instructions that set value_.
-  CodegenAnyVal(LlvmCodeGen* codegen, LlvmCodeGen::LlvmBuilder* builder,
-                const ColumnType& type, llvm::Value* value = NULL, const char* name = "");
+  CodegenAnyVal(LlvmCodeGen* codegen, LlvmBuilder* builder, const ColumnType& type,
+      llvm::Value* value = NULL, const char* name = "");
 
   /// Returns the current type-lowered value.
-  llvm::Value* value() { return value_; }
+  llvm::Value* GetLoweredValue() const { return value_; }
 
   /// Gets the 'is_null' field of the *Val.
-  llvm::Value* GetIsNull(const char* name = "is_null");
+  llvm::Value* GetIsNull(const char* name = "is_null") const;
 
   /// Get the 'val' field of the *Val. Do not call if this represents a StringVal or
   /// TimestampVal. If this represents a DecimalVal, returns 'val4', 'val8', or 'val16'
@@ -180,19 +182,18 @@ class CodegenAnyVal {
   void SetDate(llvm::Value* date);
   void SetTimeOfDay(llvm::Value* time_of_day);
 
-  /// Allocas and stores this value in an unlowered pointer, and returns the pointer. This
-  /// *Val should be non-null.
-  llvm::Value* GetUnloweredPtr();
+  /// Stores this value in an alloca allocation, and returns the pointer, which has the
+  /// lowered type. This *Val should be non-null. The output variable is called 'name'.
+  llvm::Value* GetLoweredPtr(const std::string& name = "") const;
+
+  /// Stores this value in an alloca allocation, and returns the pointer, which has the
+  /// unlowered type. This *Val should be non-null. The output variable is called 'name'.
+  llvm::Value* GetUnloweredPtr(const std::string& name = "") const;
 
   /// Set this *Val's value based on 'raw_val'. 'raw_val' should be a native type,
   /// StringValue, or TimestampValue.
   void SetFromRawValue(llvm::Value* raw_val);
 
-  /// Set this *Val's value based on void* 'raw_ptr'. 'raw_ptr' should be a pointer to a
-  /// native type, StringValue, or TimestampValue (i.e. the value returned by an
-  /// interpreted compute fn).
-  void SetFromRawPtr(llvm::Value* raw_ptr);
-
   /// Converts this *Val's value to a native type, StringValue, TimestampValue, etc.
   /// This should only be used if this *Val is not null.
   ///
@@ -235,9 +236,23 @@ class CodegenAnyVal {
   /// this < 'other', 0 if this == 'other', > 0 if this > 'other'.
   llvm::Value* Compare(CodegenAnyVal* other, const char* name = "result");
 
+  /// Generate code to branch to 'null_block' if this value is NULL. The branch terminates
+  /// the current BasicBlock, so a new BasicBlock for the non-NULL case is also created,
+  /// and builder's insert position is set to the start of the non-NULL block.
+  ///
+  /// This corresponds to the C++ code:
+  /// if (val.is_null) goto null_block;
+  ///
+  /// non_null_block:
+  ///   <-- Builder insert position after this function returns.
+  /// ...
+  /// null_block:
+  /// ...
+  void CodegenBranchIfNull(LlvmBuilder* builder, llvm::BasicBlock* null_block);
+
   /// Ctor for created an uninitialized CodegenAnYVal that can be assigned to later.
   CodegenAnyVal()
-    : type_(INVALID_TYPE), value_(NULL), name_(NULL), codegen_(NULL), builder_(NULL) { }
+    : type_(INVALID_TYPE), value_(NULL), name_(NULL), codegen_(NULL), builder_(NULL) {}
 
  private:
   ColumnType type_;
@@ -245,7 +260,7 @@ class CodegenAnyVal {
   const char* name_;
 
   LlvmCodeGen* codegen_;
-  LlvmCodeGen::LlvmBuilder* builder_;
+  LlvmBuilder* builder_;
 
   /// Helper function for getting the top (most significant) half of 'v'.
   /// 'v' should have width = 'num_bits' * 2 and be an integer type.

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/d7246d64/be/src/codegen/gen_ir_descriptions.py
----------------------------------------------------------------------
diff --git a/be/src/codegen/gen_ir_descriptions.py b/be/src/codegen/gen_ir_descriptions.py
index ddcd1db..09f3a20 100755
--- a/be/src/codegen/gen_ir_descriptions.py
+++ b/be/src/codegen/gen_ir_descriptions.py
@@ -59,8 +59,8 @@ ir_functions = [
    "_ZN6impala26PartitionedAggregationNode22ProcessBatchNoGroupingEPNS_8RowBatchE"],
   ["PART_AGG_NODE_PROCESS_BATCH_STREAMING",
    "_ZN6impala26PartitionedAggregationNode21ProcessBatchStreamingEbNS_13TPrefetchMode4typeEPNS_8RowBatchES4_PNS_12HashTableCtxEPi"],
-  ["PART_AGG_NODE_GET_EXPR_CTX",
-   "_ZNK6impala26PartitionedAggregationNode17GetAggExprContextEi"],
+  ["PART_AGG_NODE_GET_EXPR_CTXS",
+   "_ZNK6impala26PartitionedAggregationNode18GetAggExprContextsEi"],
   ["AVG_UPDATE_BIGINT",
    "_ZN6impala18AggregateFunctions9AvgUpdateIN10impala_udf9BigIntValEEEvPNS2_15FunctionContextERKT_PNS2_9StringValE"],
   ["AVG_UPDATE_DOUBLE",

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/d7246d64/be/src/codegen/llvm-codegen-test.cc
----------------------------------------------------------------------
diff --git a/be/src/codegen/llvm-codegen-test.cc b/be/src/codegen/llvm-codegen-test.cc
index e6f2f79..178baa2 100644
--- a/be/src/codegen/llvm-codegen-test.cc
+++ b/be/src/codegen/llvm-codegen-test.cc
@@ -124,7 +124,7 @@ TEST_F(LlvmCodeGenTest, BadIRFile) {
 // The random int in there is the address of jitted_counter
 Function* CodegenInnerLoop(LlvmCodeGen* codegen, int64_t* jitted_counter, int delta) {
   LLVMContext& context = codegen->context();
-  LlvmCodeGen::LlvmBuilder builder(context);
+  LlvmBuilder builder(context);
 
   LlvmCodeGen::FnPrototype fn_prototype(codegen, "JittedInnerLoop", codegen->void_type());
   Function* jitted_loop_call = fn_prototype.GeneratePrototype();
@@ -173,10 +173,10 @@ TEST_F(LlvmCodeGenTest, ReplaceFnCall) {
   ASSERT_OK(LlvmCodeGenTest::CreateFromFile(&pool, module_file.c_str(), &codegen));
   EXPECT_TRUE(codegen.get() != NULL);
 
-  Function* loop_call = codegen->GetFunction(loop_call_name);
+  Function* loop_call = codegen->GetFunction(loop_call_name, false);
   EXPECT_TRUE(loop_call != NULL);
   EXPECT_TRUE(loop_call->arg_empty());
-  Function* loop = codegen->GetFunction(loop_name);
+  Function* loop = codegen->GetFunction(loop_name, false);
   EXPECT_TRUE(loop != NULL);
   EXPECT_EQ(loop->arg_size(), 1);
 
@@ -261,7 +261,7 @@ Function* CodegenStringTest(LlvmCodeGen* codegen) {
 
   LlvmCodeGen::FnPrototype prototype(codegen, "StringTest", codegen->GetType(TYPE_INT));
   prototype.AddArgument(LlvmCodeGen::NamedVariable("str", string_val_ptr_type));
-  LlvmCodeGen::LlvmBuilder builder(codegen->context());
+  LlvmBuilder builder(codegen->context());
 
   Value* str;
   Function* interop_fn = prototype.GeneratePrototype(&builder, &str);
@@ -340,7 +340,7 @@ TEST_F(LlvmCodeGenTest, MemcpyTest) {
   prototype.AddArgument(LlvmCodeGen::NamedVariable("src", codegen->ptr_type()));
   prototype.AddArgument(LlvmCodeGen::NamedVariable("n", codegen->GetType(TYPE_INT)));
 
-  LlvmCodeGen::LlvmBuilder builder(codegen->context());
+  LlvmBuilder builder(codegen->context());
 
   char src[] = "abcd";
   char dst[] = "aaaa";
@@ -396,9 +396,9 @@ TEST_F(LlvmCodeGenTest, HashTest) {
 
     // Create a codegen'd function that hashes all the types and returns the results.
     // The tuple/values to hash are baked into the codegen for simplicity.
-    LlvmCodeGen::FnPrototype prototype(codegen.get(), "HashTest",
-        codegen->GetType(TYPE_INT));
-    LlvmCodeGen::LlvmBuilder builder(codegen->context());
+    LlvmCodeGen::FnPrototype prototype(
+        codegen.get(), "HashTest", codegen->GetType(TYPE_INT));
+    LlvmBuilder builder(codegen->context());
 
     // Test both byte-size specific hash functions and the generic loop hash function
     Function* fn_fixed = prototype.GeneratePrototype(&builder, NULL);

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/d7246d64/be/src/codegen/llvm-codegen.cc
----------------------------------------------------------------------
diff --git a/be/src/codegen/llvm-codegen.cc b/be/src/codegen/llvm-codegen.cc
index f629d35..f6e7e97 100644
--- a/be/src/codegen/llvm-codegen.cc
+++ b/be/src/codegen/llvm-codegen.cc
@@ -671,7 +671,7 @@ Status LlvmCodeGen::MaterializeFunction(Function *fn) {
   return MaterializeFunctionHelper(fn);
 }
 
-Function* LlvmCodeGen::GetFunction(const string& symbol) {
+Function* LlvmCodeGen::GetFunction(const string& symbol, bool clone) {
   Function* fn = module_->getFunction(symbol.c_str());
   if (fn == NULL) {
     LOG(ERROR) << "Unable to locate function " << symbol;
@@ -679,6 +679,7 @@ Function* LlvmCodeGen::GetFunction(const string& symbol) {
   }
   Status status = MaterializeFunction(fn);
   if (UNLIKELY(!status.ok())) return NULL;
+  if (clone) return CloneFunction(fn);
   return fn;
 }
 
@@ -1335,15 +1336,15 @@ Function* LlvmCodeGen::GetHashFunction(int num_bytes) {
       Value* ptr = builder.CreateBitCast(data, GetPtrType(TYPE_BIGINT));
       int i = 0;
       while (num_bytes >= 8) {
-        Value* index[] = { GetIntConstant(TYPE_INT, i++) };
-        Value* d = builder.CreateLoad(builder.CreateGEP(ptr, index));
+        Value* index[] = {GetIntConstant(TYPE_INT, i++)};
+        Value* d = builder.CreateLoad(builder.CreateInBoundsGEP(ptr, index));
         result_64 = builder.CreateCall(crc64_fn, ArrayRef<Value*>({result_64, d}));
         num_bytes -= 8;
       }
       result = builder.CreateTrunc(result_64, GetType(TYPE_INT));
-      Value* index[] = { GetIntConstant(TYPE_INT, i * 8) };
+      Value* index[] = {GetIntConstant(TYPE_INT, i * 8)};
       // Update data to past the 8-byte chunks
-      data = builder.CreateGEP(data, index);
+      data = builder.CreateInBoundsGEP(data, index);
     }
 
     if (num_bytes >= 4) {
@@ -1351,8 +1352,8 @@ Function* LlvmCodeGen::GetHashFunction(int num_bytes) {
       Value* ptr = builder.CreateBitCast(data, GetPtrType(TYPE_INT));
       Value* d = builder.CreateLoad(ptr);
       result = builder.CreateCall(crc32_fn, ArrayRef<Value*>({result, d}));
-      Value* index[] = { GetIntConstant(TYPE_INT, 4) };
-      data = builder.CreateGEP(data, index);
+      Value* index[] = {GetIntConstant(TYPE_INT, 4)};
+      data = builder.CreateInBoundsGEP(data, index);
       num_bytes -= 4;
     }
 
@@ -1361,8 +1362,8 @@ Function* LlvmCodeGen::GetHashFunction(int num_bytes) {
       Value* ptr = builder.CreateBitCast(data, GetPtrType(TYPE_SMALLINT));
       Value* d = builder.CreateLoad(ptr);
       result = builder.CreateCall(crc16_fn, ArrayRef<Value*>({result, d}));
-      Value* index[] = { GetIntConstant(TYPE_INT, 2) };
-      data = builder.CreateGEP(data, index);
+      Value* index[] = {GetIntConstant(TYPE_INT, 2)};
+      data = builder.CreateInBoundsGEP(data, index);
       num_bytes -= 2;
     }
 

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/d7246d64/be/src/codegen/llvm-codegen.h
----------------------------------------------------------------------
diff --git a/be/src/codegen/llvm-codegen.h b/be/src/codegen/llvm-codegen.h
index 095f939..21f9f71 100644
--- a/be/src/codegen/llvm-codegen.h
+++ b/be/src/codegen/llvm-codegen.h
@@ -71,11 +71,15 @@ namespace llvm {
 
 namespace impala {
 
-class CodegenAnyVal;
 class CodegenSymbolEmitter;
 class SubExprElimination;
 class TupleDescriptor;
 
+/// Define builder subclass in case we want to change the template arguments later
+class LlvmBuilder : public llvm::IRBuilder<> {
+  using llvm::IRBuilder<>::IRBuilder;
+};
+
 /// LLVM code generator.  This is the top level object to generate jitted code.
 //
 /// LLVM provides a c++ IR builder interface so IR does not need to be written
@@ -157,9 +161,6 @@ class LlvmCodeGen {
   /// If false, only output IR for functions which were generated.
   std::string GetIR(bool full_module) const;
 
-  /// Typedef builder in case we want to change the template arguments later
-  typedef llvm::IRBuilder<> LlvmBuilder;
-
   /// Utility struct that wraps a variable name and llvm type.
   struct NamedVariable {
     std::string name;
@@ -330,18 +331,22 @@ class LlvmCodeGen {
   /// recursively. Returns NULL if there is any error.
   ///
   /// If 'clone' is true, a clone of the function will be returned. Clones should be used
-  /// iff the caller will modify the returned function. This avoids clobbering the
-  /// function in case other users need it, but we don't clone if we can avoid it to
-  /// reduce compilation time.
+  /// iff the caller will modify the returned function so that the original unmodified
+  /// function remains available. Avoid cloning if possible to reduce compilation time.
   ///
   /// TODO: Return Status instead.
   llvm::Function* GetFunction(IRFunction::Type ir_type, bool clone);
 
   /// Return the function with the symbol name 'symbol' from the module. The returned
-  /// function and its callee will be recursively materialized. The returned function
-  /// isn't cloned. Returns NULL if there is any error.
+  /// function and its callee will be recursively materialized. Returns NULL if there is
+  /// any error.
+  ///
+  /// If 'clone' is true, a clone of the function will be returned. Clones should be used
+  /// iff the caller will modify the returned function so that the original unmodified
+  /// function remains available. Avoid cloning if possible to reduce compilation time.
+  ///
   /// TODO: Return Status instead.
-  llvm::Function* GetFunction(const string& symbol);
+  llvm::Function* GetFunction(const string& symbol, bool clone);
 
   /// Returns the hash function with signature:
   ///   int32_t Hash(int8_t* data, int len, int32_t seed);
@@ -442,8 +447,8 @@ class LlvmCodeGen {
   /// Codegens IR to load array[idx] and returns the loaded value. 'array' should be a
   /// C-style array (e.g. i32*) or an IR array (e.g. [10 x i32]). This function does not
   /// do bounds checking.
-  llvm::Value* CodegenArrayAt(LlvmBuilder*, llvm::Value* array, int idx,
-      const char* name = "");
+  llvm::Value* CodegenArrayAt(
+      LlvmBuilder*, llvm::Value* array, int idx, const char* name = "");
 
   /// Loads a module at 'file' and links it to the module associated with
   /// this LlvmCodeGen object. The module must be on the local filesystem.

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/d7246d64/be/src/exec/aggregation-node.cc
----------------------------------------------------------------------
diff --git a/be/src/exec/aggregation-node.cc b/be/src/exec/aggregation-node.cc
index 67928ca..0607aa3 100644
--- a/be/src/exec/aggregation-node.cc
+++ b/be/src/exec/aggregation-node.cc
@@ -550,6 +550,10 @@ llvm::Function* AggregationNode::CodegenUpdateSlot(LlvmCodeGen* codegen,
       codegen->GetPtrType(FunctionContextImpl::LLVM_FUNCTIONCONTEXT_NAME);
   PointerType* expr_ctx_ptr_type = codegen->GetPtrType(ExprContext::LLVM_CLASS_NAME);
   StructType* tuple_struct = intermediate_tuple_desc_->GetLlvmStruct(codegen);
+  if (tuple_struct == NULL) {
+    VLOG_QUERY << "Could not codegen UpdateSlot(): could not generate tuple struct.";
+    return NULL;
+  }
   PointerType* tuple_ptr_type = codegen->GetPtrType(tuple_struct);
   PointerType* tuple_row_ptr_type = codegen->GetPtrType(TupleRow::LLVM_CLASS_NAME);
 
@@ -560,7 +564,7 @@ llvm::Function* AggregationNode::CodegenUpdateSlot(LlvmCodeGen* codegen,
   prototype.AddArgument(LlvmCodeGen::NamedVariable("agg_tuple", tuple_ptr_type));
   prototype.AddArgument(LlvmCodeGen::NamedVariable("row", tuple_row_ptr_type));
 
-  LlvmCodeGen::LlvmBuilder builder(codegen->context());
+  LlvmBuilder builder(codegen->context());
   Value* args[4];
   Function* fn = prototype.GeneratePrototype(&builder, &args[0]);
   Value* fn_ctx_arg = args[0];
@@ -588,8 +592,8 @@ llvm::Function* AggregationNode::CodegenUpdateSlot(LlvmCodeGen* codegen,
 
   if (slot_desc->is_nullable()) {
     // Dst is NULL, just update dst slot to src slot and clear null bit
-    Function* clear_null_fn = slot_desc->GetUpdateNullFn(codegen, false);
-    builder.CreateCall(clear_null_fn, ArrayRef<Value*>({agg_tuple_arg}));
+    slot_desc->CodegenSetNullIndicator(
+        codegen, &builder, agg_tuple_arg, codegen->false_value());
   }
 
   // Update the slot
@@ -630,32 +634,23 @@ llvm::Function* AggregationNode::CodegenUpdateSlot(LlvmCodeGen* codegen,
 
       // Create pointer to src_anyval to pass to HllUpdate() function. We must use the
       // unlowered type.
-      Value* src_lowered_ptr = codegen->CreateEntryBlockAlloca(
-          fn, LlvmCodeGen::NamedVariable("src_lowered_ptr", src.value()->getType()));
-      builder.CreateStore(src.value(), src_lowered_ptr);
-      Type* unlowered_ptr_type =
-          CodegenAnyVal::GetUnloweredType(codegen, input_expr->type())->getPointerTo();
-      Value* src_unlowered_ptr =
-          builder.CreateBitCast(src_lowered_ptr, unlowered_ptr_type, "src_unlowered_ptr");
+      Value* src_unlowered_ptr = src.GetUnloweredPtr("src_unlowered_ptr");
 
       // Create StringVal* intermediate argument from dst_value
-      CodegenAnyVal dst_stringval = CodegenAnyVal::GetNonNullVal(
-          codegen, &builder, TYPE_STRING, "dst_stringval");
+      CodegenAnyVal dst_stringval =
+          CodegenAnyVal::GetNonNullVal(codegen, &builder, TYPE_STRING, "dst_stringval");
       dst_stringval.SetFromRawValue(dst_value);
       // Create pointer to dst_stringval to pass to HllUpdate() function. We must use
       // the unlowered type.
-      Value* dst_lowered_ptr = codegen->CreateEntryBlockAlloca(
-          fn, LlvmCodeGen::NamedVariable("dst_lowered_ptr",
-                                         dst_stringval.value()->getType()));
-      builder.CreateStore(dst_stringval.value(), dst_lowered_ptr);
-      unlowered_ptr_type =
+      Value* dst_lowered_ptr = dst_stringval.GetLoweredPtr("dst_lowered_ptr");
+      Type* dst_unlowered_ptr_type =
           codegen->GetPtrType(CodegenAnyVal::GetUnloweredType(codegen, TYPE_STRING));
-      Value* dst_unlowered_ptr =
-          builder.CreateBitCast(dst_lowered_ptr, unlowered_ptr_type, "dst_unlowered_ptr");
+      Value* dst_unlowered_ptr = builder.CreateBitCast(
+          dst_lowered_ptr, dst_unlowered_ptr_type, "dst_unlowered_ptr");
 
       // Call 'hll_fn'
-      builder.CreateCall(hll_fn,
-          ArrayRef<Value*>({fn_ctx_arg, src_unlowered_ptr, dst_unlowered_ptr}));
+      builder.CreateCall(
+          hll_fn, ArrayRef<Value*>({fn_ctx_arg, src_unlowered_ptr, dst_unlowered_ptr}));
 
       // Convert StringVal intermediate 'dst_arg' back to StringValue
       Value* anyval_result = builder.CreateLoad(dst_lowered_ptr, "anyval_result");
@@ -765,13 +760,17 @@ Function* AggregationNode::CodegenUpdateTuple(LlvmCodeGen* codegen) {
   //     ExprContext** expr_ctx, Tuple* tuple, TupleRow* row)
   // This signature needs to match the non-codegen'd signature exactly.
   StructType* tuple_struct = intermediate_tuple_desc_->GetLlvmStruct(codegen);
+  if (tuple_struct == NULL) {
+    VLOG_QUERY << "Could not codegen UpdateSlot(): could not generate tuple struct.";
+    return NULL;
+  }
   PointerType* tuple_ptr = PointerType::get(tuple_struct, 0);
   LlvmCodeGen::FnPrototype prototype(codegen, "UpdateTuple", codegen->void_type());
   prototype.AddArgument(LlvmCodeGen::NamedVariable("this_ptr", agg_node_ptr_type));
   prototype.AddArgument(LlvmCodeGen::NamedVariable("agg_tuple", agg_tuple_ptr_type));
   prototype.AddArgument(LlvmCodeGen::NamedVariable("tuple_row", tuple_row_ptr_type));
 
-  LlvmCodeGen::LlvmBuilder builder(codegen->context());
+  LlvmBuilder builder(codegen->context());
   Value* args[3];
   Function* fn = prototype.GeneratePrototype(&builder, &args[0]);
 

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/d7246d64/be/src/exec/exec-node.cc
----------------------------------------------------------------------
diff --git a/be/src/exec/exec-node.cc b/be/src/exec/exec-node.cc
index 510a7d9..255217f 100644
--- a/be/src/exec/exec-node.cc
+++ b/be/src/exec/exec-node.cc
@@ -519,7 +519,7 @@ Status ExecNode::CodegenEvalConjuncts(LlvmCodeGen* codegen,
       LlvmCodeGen::NamedVariable("num_ctxs", codegen->GetType(TYPE_INT)));
   prototype.AddArgument(LlvmCodeGen::NamedVariable("row", tuple_row_ptr_type));
 
-  LlvmCodeGen::LlvmBuilder builder(codegen->context());
+  LlvmBuilder builder(codegen->context());
   Value* args[3];
   *fn = prototype.GeneratePrototype(&builder, args);
   Value* ctxs_arg = args[0];

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/d7246d64/be/src/exec/hash-join-node.cc
----------------------------------------------------------------------
diff --git a/be/src/exec/hash-join-node.cc b/be/src/exec/hash-join-node.cc
index f2ac928..7c335be 100644
--- a/be/src/exec/hash-join-node.cc
+++ b/be/src/exec/hash-join-node.cc
@@ -548,7 +548,7 @@ Function* HashJoinNode::CodegenCreateOutputRow(LlvmCodeGen* codegen) {
   prototype.AddArgument(LlvmCodeGen::NamedVariable("build_arg", tuple_row_ptr_type));
 
   LLVMContext& context = codegen->context();
-  LlvmCodeGen::LlvmBuilder builder(context);
+  LlvmBuilder builder(context);
   Value* args[4];
   Function* fn = prototype.GeneratePrototype(&builder, args);
   Value* out_row_arg = builder.CreateBitCast(args[1], tuple_row_working_type, "out");
@@ -560,8 +560,9 @@ Function* HashJoinNode::CodegenCreateOutputRow(LlvmCodeGen* codegen) {
 
   // Copy probe row
   codegen->CodegenMemcpy(&builder, out_row_arg, probe_row_arg, probe_tuple_row_size_);
-  Value* build_row_idx[] = { codegen->GetIntConstant(TYPE_INT, num_probe_tuples) };
-  Value* build_row_dst = builder.CreateGEP(out_row_arg, build_row_idx, "build_dst_ptr");
+  Value* build_row_idx[] = {codegen->GetIntConstant(TYPE_INT, num_probe_tuples)};
+  Value* build_row_dst =
+      builder.CreateInBoundsGEP(out_row_arg, build_row_idx, "build_dst_ptr");
 
   // Copy build row.
   BasicBlock* build_not_null_block = BasicBlock::Create(context, "build_not_null", fn);
@@ -578,9 +579,8 @@ Function* HashJoinNode::CodegenCreateOutputRow(LlvmCodeGen* codegen) {
     // to work.
     builder.SetInsertPoint(build_null_block);
     for (int i = 0; i < num_build_tuples; ++i) {
-      Value* array_idx[] =
-          { codegen->GetIntConstant(TYPE_INT, i + num_probe_tuples) };
-      Value* dst = builder.CreateGEP(out_row_arg, array_idx, "dst_tuple_ptr");
+      Value* array_idx[] = {codegen->GetIntConstant(TYPE_INT, i + num_probe_tuples)};
+      Value* dst = builder.CreateInBoundsGEP(out_row_arg, array_idx, "dst_tuple_ptr");
       builder.CreateStore(codegen->null_ptr_value(), dst);
     }
     builder.CreateRetVoid();

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/d7246d64/be/src/exec/hash-table.cc
----------------------------------------------------------------------
diff --git a/be/src/exec/hash-table.cc b/be/src/exec/hash-table.cc
index 74fc534..8d45ec7 100644
--- a/be/src/exec/hash-table.cc
+++ b/be/src/exec/hash-table.cc
@@ -567,8 +567,8 @@ string HashTable::PrintStats() const {
 // Helper function to store a value into the results buffer if the expr
 // evaluated to NULL.  We don't want (NULL, 1) to hash to the same as (0,1) so
 // we'll pick a more random value.
-static void CodegenAssignNullValue(LlvmCodeGen* codegen,
-    LlvmCodeGen::LlvmBuilder* builder, Value* dst, const ColumnType& type) {
+static void CodegenAssignNullValue(
+    LlvmCodeGen* codegen, LlvmBuilder* builder, Value* dst, const ColumnType& type) {
   uint64_t fnv_seed = HashUtil::FNV_SEED;
 
   if (type.type == TYPE_STRING || type.type == TYPE_VARCHAR) {
@@ -715,11 +715,11 @@ Status HashTableCtx::CodegenEvalRow(LlvmCodeGen* codegen, bool build, Function**
   prototype.AddArgument(LlvmCodeGen::NamedVariable("this_ptr", this_ptr_type));
   prototype.AddArgument(LlvmCodeGen::NamedVariable("row", tuple_row_ptr_type));
   prototype.AddArgument(LlvmCodeGen::NamedVariable("expr_values", codegen->ptr_type()));
-  prototype.AddArgument(LlvmCodeGen::NamedVariable("expr_values_null",
-      codegen->ptr_type()));
+  prototype.AddArgument(
+      LlvmCodeGen::NamedVariable("expr_values_null", codegen->ptr_type()));
 
   LLVMContext& context = codegen->context();
-  LlvmCodeGen::LlvmBuilder builder(context);
+  LlvmBuilder builder(context);
   Value* args[4];
   *fn = prototype.GeneratePrototype(&builder, args);
   Value* this_ptr = args[0];
@@ -740,10 +740,10 @@ Status HashTableCtx::CodegenEvalRow(LlvmCodeGen* codegen, bool build, Function**
     // vector of exprs
     // Convert result buffer to llvm ptr type
     int offset = expr_values_cache_.expr_values_offsets(i);
-    Value* loc = builder.CreateGEP(
+    Value* loc = builder.CreateInBoundsGEP(
         NULL, expr_values, codegen->GetIntConstant(TYPE_INT, offset), "loc_addr");
-    Value* llvm_loc = builder.CreatePointerCast(loc,
-        codegen->GetPtrType(ctxs[i]->root()->type()), "loc");
+    Value* llvm_loc = builder.CreatePointerCast(
+        loc, codegen->GetPtrType(ctxs[i]->root()->type()), "loc");
 
     BasicBlock* null_block = BasicBlock::Create(context, "null", *fn);
     BasicBlock* not_null_block = BasicBlock::Create(context, "not_null", *fn);
@@ -768,7 +768,7 @@ Status HashTableCtx::CodegenEvalRow(LlvmCodeGen* codegen, bool build, Function**
 
     // Set null-byte result
     Value* null_byte = builder.CreateZExt(is_null, codegen->GetType(TYPE_TINYINT));
-    Value* llvm_null_byte_loc = builder.CreateGEP(
+    Value* llvm_null_byte_loc = builder.CreateInBoundsGEP(
         NULL, expr_values_null, codegen->GetIntConstant(TYPE_INT, i), "null_byte_loc");
     builder.CreateStore(null_byte, llvm_null_byte_loc);
     builder.CreateCondBr(is_null, null_block, not_null_block);
@@ -863,7 +863,7 @@ Status HashTableCtx::CodegenHashRow(LlvmCodeGen* codegen, bool use_murmur, Funct
       LlvmCodeGen::NamedVariable("expr_values_null", codegen->ptr_type()));
 
   LLVMContext& context = codegen->context();
-  LlvmCodeGen::LlvmBuilder builder(context);
+  LlvmBuilder builder(context);
   Value* args[3];
   *fn = prototype.GeneratePrototype(&builder, args);
   Value* this_arg = args[0];
@@ -910,7 +910,7 @@ Status HashTableCtx::CodegenHashRow(LlvmCodeGen* codegen, bool use_murmur, Funct
       Value* str_null_result = NULL;
 
       int offset = expr_values_cache_.expr_values_offsets(i);
-      Value* llvm_loc = builder.CreateGEP(
+      Value* llvm_loc = builder.CreateInBoundsGEP(
           NULL, expr_values, codegen->GetIntConstant(TYPE_INT, offset), "loc_addr");
 
       // If the hash table stores nulls, we need to check if the stringval
@@ -920,11 +920,11 @@ Status HashTableCtx::CodegenHashRow(LlvmCodeGen* codegen, bool use_murmur, Funct
         not_null_block = BasicBlock::Create(context, "not_null", *fn);
         continue_block = BasicBlock::Create(context, "continue", *fn);
 
-        Value* llvm_null_byte_loc = builder.CreateGEP(NULL, expr_values_null,
+        Value* llvm_null_byte_loc = builder.CreateInBoundsGEP(NULL, expr_values_null,
             codegen->GetIntConstant(TYPE_INT, i), "null_byte_loc");
         Value* null_byte = builder.CreateLoad(llvm_null_byte_loc, "null_byte");
-        Value* is_null = builder.CreateICmpNE(null_byte,
-            codegen->GetIntConstant(TYPE_TINYINT, 0), "is_null");
+        Value* is_null = builder.CreateICmpNE(
+            null_byte, codegen->GetIntConstant(TYPE_TINYINT, 0), "is_null");
         builder.CreateCondBr(is_null, null_block, not_null_block);
 
         // For null, we just want to call the hash function on the portion of
@@ -1071,7 +1071,7 @@ Status HashTableCtx::CodegenEquals(LlvmCodeGen* codegen, bool force_null_equalit
       LlvmCodeGen::NamedVariable("expr_values_null", codegen->ptr_type()));
 
   LLVMContext& context = codegen->context();
-  LlvmCodeGen::LlvmBuilder builder(context);
+  LlvmBuilder builder(context);
   Value* args[4];
   *fn = prototype.GeneratePrototype(&builder, args);
   Value* this_ptr = args[0];
@@ -1116,19 +1116,19 @@ Status HashTableCtx::CodegenEquals(LlvmCodeGen* codegen, bool force_null_equalit
     // We consider null values equal if we are comparing build rows or if the join
     // predicate is <=>
     if (force_null_equality || finds_nulls_[i]) {
-      Value* llvm_null_byte_loc = builder.CreateGEP(
+      Value* llvm_null_byte_loc = builder.CreateInBoundsGEP(
           NULL, expr_values_null, codegen->GetIntConstant(TYPE_INT, i), "null_byte_loc");
       Value* null_byte = builder.CreateLoad(llvm_null_byte_loc);
-      row_is_null = builder.CreateICmpNE(null_byte,
-          codegen->GetIntConstant(TYPE_TINYINT, 0));
+      row_is_null =
+          builder.CreateICmpNE(null_byte, codegen->GetIntConstant(TYPE_TINYINT, 0));
     }
 
     // Get llvm value for row_val from 'expr_values'
     int offset = expr_values_cache_.expr_values_offsets(i);
-    Value* loc = builder.CreateGEP(
+    Value* loc = builder.CreateInBoundsGEP(
         NULL, expr_values, codegen->GetIntConstant(TYPE_INT, offset), "loc");
-    Value* row_val = builder.CreatePointerCast(loc,
-        codegen->GetPtrType(build_expr_ctxs_[i]->root()->type()), "row_val");
+    Value* row_val = builder.CreatePointerCast(
+        loc, codegen->GetPtrType(build_expr_ctxs_[i]->root()->type()), "row_val");
 
     // Branch for GetValue() returning NULL
     builder.CreateCondBr(is_null, null_block, not_null_block);

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/d7246d64/be/src/exec/hdfs-avro-scanner.cc
----------------------------------------------------------------------
diff --git a/be/src/exec/hdfs-avro-scanner.cc b/be/src/exec/hdfs-avro-scanner.cc
index a073cef..c217b65 100644
--- a/be/src/exec/hdfs-avro-scanner.cc
+++ b/be/src/exec/hdfs-avro-scanner.cc
@@ -776,10 +776,10 @@ void HdfsAvroScanner::SetStatusValueOverflow(TErrorCode::type error_code, int64_
 // bail_out:           ; preds = %read_field11, %end_field3, %read_field2, %end_field,
 //   ret i1 false      ;         %read_field, %entry
 // }
-Status HdfsAvroScanner::CodegenMaterializeTuple(HdfsScanNodeBase* node,
-    LlvmCodeGen* codegen, Function** materialize_tuple_fn) {
+Status HdfsAvroScanner::CodegenMaterializeTuple(
+    HdfsScanNodeBase* node, LlvmCodeGen* codegen, Function** materialize_tuple_fn) {
   LLVMContext& context = codegen->context();
-  LlvmCodeGen::LlvmBuilder builder(context);
+  LlvmBuilder builder(context);
 
   Type* this_type = codegen->GetType(HdfsAvroScanner::LLVM_CLASS_NAME);
   DCHECK(this_type != NULL);
@@ -853,8 +853,7 @@ Status HdfsAvroScanner::CodegenReadRecord(
   }
   DCHECK_EQ(record.schema->type, AVRO_RECORD);
   LLVMContext& context = codegen->context();
-  LlvmCodeGen::LlvmBuilder* builder =
-      reinterpret_cast<LlvmCodeGen::LlvmBuilder*>(void_builder);
+  LlvmBuilder* builder = reinterpret_cast<LlvmBuilder*>(void_builder);
 
   // Codegen logic for parsing each field and, if necessary, populating a slot with the
   // result.
@@ -911,9 +910,8 @@ Status HdfsAvroScanner::CodegenReadRecord(
       // Write null field IR
       builder->SetInsertPoint(null_block);
       if (slot_idx != HdfsScanNode::SKIP_COLUMN) {
-        Function* set_null_fn = slot_desc->GetUpdateNullFn(codegen, true);
-        DCHECK(set_null_fn != NULL);
-        builder->CreateCall(set_null_fn, ArrayRef<Value*>({tuple_val}));
+        slot_desc->CodegenSetNullIndicator(
+            codegen, builder, tuple_val, codegen->true_value());
       }
       // LLVM requires all basic blocks to end with a terminating instruction
       builder->CreateBr(end_field_block);
@@ -944,11 +942,10 @@ Status HdfsAvroScanner::CodegenReadRecord(
 }
 
 Status HdfsAvroScanner::CodegenReadScalar(const AvroSchemaElement& element,
-    SlotDescriptor* slot_desc, LlvmCodeGen* codegen, void* void_builder,
-    Value* this_val, Value* pool_val, Value* tuple_val, Value* data_val,
-    Value* data_end_val, Value** ret_val) {
-  LlvmCodeGen::LlvmBuilder* builder =
-      reinterpret_cast<LlvmCodeGen::LlvmBuilder*>(void_builder);
+    SlotDescriptor* slot_desc, LlvmCodeGen* codegen, void* void_builder, Value* this_val,
+    Value* pool_val, Value* tuple_val, Value* data_val, Value* data_end_val,
+    Value** ret_val) {
+  LlvmBuilder* builder = reinterpret_cast<LlvmBuilder*>(void_builder);
   Function* read_field_fn;
   switch (element.schema->type) {
     case AVRO_BOOLEAN:

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/d7246d64/be/src/exec/hdfs-scanner.cc
----------------------------------------------------------------------
diff --git a/be/src/exec/hdfs-scanner.cc b/be/src/exec/hdfs-scanner.cc
index 3885522..390085f 100644
--- a/be/src/exec/hdfs-scanner.cc
+++ b/be/src/exec/hdfs-scanner.cc
@@ -380,7 +380,7 @@ Status HdfsScanner::CodegenWriteCompleteTuple(HdfsScanNodeBase* node,
   prototype.AddArgument(LlvmCodeGen::NamedVariable("error_in_row", uint8_ptr_type));
 
   LLVMContext& context = codegen->context();
-  LlvmCodeGen::LlvmBuilder builder(context);
+  LlvmBuilder builder(context);
   Value* args[8];
   Function* fn = prototype.GeneratePrototype(&builder, &args[0]);
 
@@ -411,8 +411,8 @@ Status HdfsScanner::CodegenWriteCompleteTuple(HdfsScanNodeBase* node,
   // Put tuple in tuple_row
   Value* tuple_row_typed =
       builder.CreateBitCast(tuple_row_arg, PointerType::get(tuple_ptr_type, 0));
-  Value* tuple_row_idxs[] = { codegen->GetIntConstant(TYPE_INT, node->tuple_idx()) };
-  Value* tuple_in_row_addr = builder.CreateGEP(tuple_row_typed, tuple_row_idxs);
+  Value* tuple_row_idxs[] = {codegen->GetIntConstant(TYPE_INT, node->tuple_idx())};
+  Value* tuple_in_row_addr = builder.CreateInBoundsGEP(tuple_row_typed, tuple_row_idxs);
   builder.CreateStore(tuple_arg, tuple_in_row_addr);
   builder.CreateBr(parse_block);
 
@@ -447,11 +447,12 @@ Status HdfsScanner::CodegenWriteCompleteTuple(HdfsScanNodeBase* node,
         codegen->GetIntConstant(TYPE_INT, 1),
       };
       Value* error_idxs[] = {
-        codegen->GetIntConstant(TYPE_INT, slot_idx),
+          codegen->GetIntConstant(TYPE_INT, slot_idx),
       };
-      Value* data_ptr = builder.CreateGEP(fields_arg, data_idxs, "data_ptr");
-      Value* len_ptr = builder.CreateGEP(fields_arg, len_idxs, "len_ptr");
-      Value* error_ptr = builder.CreateGEP(errors_arg, error_idxs, "slot_error_ptr");
+      Value* data_ptr = builder.CreateInBoundsGEP(fields_arg, data_idxs, "data_ptr");
+      Value* len_ptr = builder.CreateInBoundsGEP(fields_arg, len_idxs, "len_ptr");
+      Value* error_ptr =
+          builder.CreateInBoundsGEP(errors_arg, error_idxs, "slot_error_ptr");
       Value* data = builder.CreateLoad(data_ptr, "data");
       Value* len = builder.CreateLoad(len_ptr, "len");
 

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/d7246d64/be/src/exec/old-hash-table.cc
----------------------------------------------------------------------
diff --git a/be/src/exec/old-hash-table.cc b/be/src/exec/old-hash-table.cc
index f51bc74..44bf7e4 100644
--- a/be/src/exec/old-hash-table.cc
+++ b/be/src/exec/old-hash-table.cc
@@ -182,8 +182,8 @@ int OldHashTable::AddBloomFilters() {
 // Helper function to store a value into the results buffer if the expr
 // evaluated to NULL.  We don't want (NULL, 1) to hash to the same as (0,1) so
 // we'll pick a more random value.
-static void CodegenAssignNullValue(LlvmCodeGen* codegen,
-    LlvmCodeGen::LlvmBuilder* builder, Value* dst, const ColumnType& type) {
+static void CodegenAssignNullValue(
+    LlvmCodeGen* codegen, LlvmBuilder* builder, Value* dst, const ColumnType& type) {
   uint64_t fnv_seed = HashUtil::FNV_SEED;
 
   if (type.type == TYPE_STRING || type.type == TYPE_VARCHAR) {
@@ -288,7 +288,7 @@ Function* OldHashTable::CodegenEvalTupleRow(LlvmCodeGen* codegen, bool build) {
   prototype.AddArgument(LlvmCodeGen::NamedVariable("row", tuple_row_ptr_type));
 
   LLVMContext& context = codegen->context();
-  LlvmCodeGen::LlvmBuilder builder(context);
+  LlvmBuilder builder(context);
   Value* args[2];
   Function* fn = prototype.GeneratePrototype(&builder, args);
 
@@ -429,7 +429,7 @@ Function* OldHashTable::CodegenHashCurrentRow(LlvmCodeGen* codegen) {
   prototype.AddArgument(LlvmCodeGen::NamedVariable("this_ptr", this_ptr_type));
 
   LLVMContext& context = codegen->context();
-  LlvmCodeGen::LlvmBuilder builder(context);
+  LlvmBuilder builder(context);
   Value* this_arg;
   Function* fn = prototype.GeneratePrototype(&builder, &this_arg);
 
@@ -612,7 +612,7 @@ Function* OldHashTable::CodegenEquals(LlvmCodeGen* codegen) {
   prototype.AddArgument(LlvmCodeGen::NamedVariable("row", tuple_row_ptr_type));
 
   LLVMContext& context = codegen->context();
-  LlvmCodeGen::LlvmBuilder builder(context);
+  LlvmBuilder builder(context);
   Value* args[2];
   Function* fn = prototype.GeneratePrototype(&builder, args);
   Value* row = args[1];

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/d7246d64/be/src/exec/partitioned-aggregation-node-ir.cc
----------------------------------------------------------------------
diff --git a/be/src/exec/partitioned-aggregation-node-ir.cc b/be/src/exec/partitioned-aggregation-node-ir.cc
index 83362e7..b696131 100644
--- a/be/src/exec/partitioned-aggregation-node-ir.cc
+++ b/be/src/exec/partitioned-aggregation-node-ir.cc
@@ -26,7 +26,7 @@
 
 using namespace impala;
 
-ExprContext* PartitionedAggregationNode::GetAggExprContext(int i) const {
+ExprContext* const* PartitionedAggregationNode::GetAggExprContexts(int i) const {
   return agg_expr_ctxs_[i];
 }
 


Mime
View raw message