impala-commits mailing list archives

From tarmstr...@apache.org
Subject incubator-impala git commit: IMPALA-3674: Lazy materialization of LLVM module bitcode.
Date Thu, 21 Jul 2016 01:30:31 GMT
Repository: incubator-impala
Updated Branches:
  refs/heads/master 947305741 -> 276376aca


IMPALA-3674: Lazy materialization of LLVM module bitcode.

Previously, each fragment using dynamic code generation would
parse the bitcode module and populate the LLVM data structures
for all the functions and their bodies in the bitcode module.
This is wasteful as we may only use a few of the functions
parsed. We rely on dead code elimination to delete most of the
unused functions so we don't waste time compiling them, but the
cost of parsing them is still paid.

This change implements lazy materialization of the functions'
bodies. On the initial parse of the bitcode module, we just
create the Function objects for each function in the module.
The functions' bodies will be materialized on demand from the
bitcode module when they are actually referenced in the query.
This ensures that the prepare time during codegen is proportional
to the number of IR functions referenced by the query instead
of being proportional to the total number of IR functions in
the module.
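
The on-demand scheme described above can be sketched with a toy model.
This is only an illustration under stated assumptions: it does not use
the LLVM API, and every name in it (ToyFunction, ToyModule, Declare,
num_materialized) is hypothetical. It shows the one property that
matters here: the number of bodies parsed depends on what is reachable
from the requested function, not on the size of the module.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a lazily loaded IR module. NOT the LLVM API; all names
// here are hypothetical and exist only for this sketch.
struct ToyFunction {
  std::string name;
  std::vector<std::string> callees;  // known only once the body is parsed
  bool materialized = false;
};

class ToyModule {
 public:
  // Initial parse: create the function object and remember where its body
  // would come from, without parsing the body itself.
  void Declare(const std::string& name, std::vector<std::string> callees_in_body) {
    functions_[name].name = name;
    bodies_[name] = std::move(callees_in_body);
  }

  // Returns the function with its body parsed; callees are materialized
  // recursively. Returns nullptr if the symbol is unknown.
  ToyFunction* GetFunction(const std::string& name) {
    auto it = functions_.find(name);
    if (it == functions_.end()) return nullptr;
    Materialize(&it->second);
    return &it->second;
  }

  int num_materialized() const { return num_materialized_; }
  int num_declared() const { return static_cast<int>(functions_.size()); }

 private:
  void Materialize(ToyFunction* fn) {
    if (fn->materialized) return;  // already parsed; also stops call cycles
    auto body = bodies_.find(fn->name);
    assert(body != bodies_.end());
    fn->materialized = true;
    fn->callees = body->second;  // "parse" the body
    ++num_materialized_;
    for (const std::string& callee : fn->callees) {
      auto it = functions_.find(callee);
      // Unknown callees model external declarations: nothing to materialize.
      if (it != functions_.end()) Materialize(&it->second);
    }
  }

  std::map<std::string, ToyFunction> functions_;
  std::map<std::string, std::vector<std::string>> bodies_;
  int num_materialized_ = 0;
};
```

For example, after declaring A (calls B), B, and C (calls A), requesting
A parses the bodies of A and B only; C stays unparsed even though it
refers to A.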

This change also stops cross-compiling BufferedTupleStream::GetTupleRow()
as there isn't much benefit in doing so. In addition, it moves the ctors
and dtors of LikePredicate to the header file to avoid an unnecessary
alias in the IR module.

For TPCH-Q2, a fragment which only codegens 9 functions used to spend
146ms in codegen. That now drops to 35ms, a 76% reduction.

      CodeGen:(Total: 146.041ms, non-child: 146.041ms, % non-child: 100.00%)
         - CodegenTime: 0.000ns
         - CompileTime: 2.003ms
         - LoadTime: 0.000ns
         - ModuleBitcodeSize: 2.12 MB (2225304)
         - NumFunctions: 9 (9)
         - NumInstructions: 129 (129)
         - OptimizationTime: 29.019ms
         - PrepareTime: 114.651ms

      CodeGen:(Total: 35.288ms, non-child: 35.288ms, % non-child: 100.00%)
         - CodegenTime: 0.000ns
         - CompileTime: 1.880ms
         - LoadTime: 0.000ns
         - ModuleBitcodeSize: 2.12 MB (2221276)
         - NumFunctions: 9 (9)
         - NumInstructions: 129 (129)
         - OptimizationTime: 5.101ms
         - PrepareTime: 28.044ms
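
The quoted reduction can be checked against the two profiles above. The
helper below is a hypothetical one written for this note, not part of
the change. Applied to the counters, it gives about 75.8% for the total
(146.041ms to 35.288ms, which rounds to the 76% quoted), about 75.5% for
PrepareTime (114.651ms to 28.044ms) and about 82.4% for OptimizationTime
(29.019ms to 5.101ms).

```cpp
#include <cassert>

// Percentage reduction going from 'before' to 'after'. Both values must be
// in the same unit (milliseconds in the profiles above).
double PercentReduction(double before, double after) {
  assert(before > 0.0);
  return (before - after) / before * 100.0;
}
```

PrepareTime accounts for roughly 86.6ms of the 110.8ms saved overall,
which matches the intent of the change.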

Change-Id: I6ed7862fc5e86005ecea83fa2ceb489e737d66b2
Reviewed-on: http://gerrit.cloudera.org:8080/3220
Reviewed-by: Michael Ho <kwho@cloudera.com>
Tested-by: Internal Jenkins


Project: http://git-wip-us.apache.org/repos/asf/incubator-impala/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-impala/commit/276376ac
Tree: http://git-wip-us.apache.org/repos/asf/incubator-impala/tree/276376ac
Diff: http://git-wip-us.apache.org/repos/asf/incubator-impala/diff/276376ac

Branch: refs/heads/master
Commit: 276376acac848b73c415b7c8fb95b64be451b967
Parents: 9473057
Author: Michael Ho <kwho@cloudera.com>
Authored: Thu Mar 24 16:35:44 2016 -0700
Committer: Tim Armstrong <tarmstrong@cloudera.com>
Committed: Wed Jul 20 18:30:25 2016 -0700

----------------------------------------------------------------------
 be/src/codegen/llvm-codegen-test.cc             |  17 +-
 be/src/codegen/llvm-codegen.cc                  | 194 +++++++++++++------
 be/src/codegen/llvm-codegen.h                   | 106 +++++++---
 be/src/exec/partitioned-aggregation-node.cc     |   2 +-
 be/src/exprs/expr-codegen-test.cc               |   2 +-
 be/src/exprs/like-predicate.cc                  |   7 -
 be/src/exprs/like-predicate.h                   |   6 +-
 be/src/exprs/scalar-fn-call.cc                  |  55 +++---
 be/src/runtime/buffered-tuple-stream.cc         |  30 +++
 be/src/runtime/buffered-tuple-stream.inline.h   |  30 ---
 be/src/testutil/test-udfs.cc                    |  19 +-
 be/src/util/symbols-util-test.cc                |   2 +
 .../functional-query/queries/QueryTest/udf.test |  14 ++
 tests/query_test/test_udfs.py                   |  10 +
 14 files changed, 326 insertions(+), 168 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/276376ac/be/src/codegen/llvm-codegen-test.cc
----------------------------------------------------------------------
diff --git a/be/src/codegen/llvm-codegen-test.cc b/be/src/codegen/llvm-codegen-test.cc
index baf9921..7367c27 100644
--- a/be/src/codegen/llvm-codegen-test.cc
+++ b/be/src/codegen/llvm-codegen-test.cc
@@ -158,8 +158,8 @@ Function* CodegenInnerLoop(LlvmCodeGen* codegen, int64_t* jitted_counter, int de
 //   6. Run the loop and make sure the updated is called.
 TEST_F(LlvmCodeGenTest, ReplaceFnCall) {
   ObjectPool pool;
-  const char* loop_call_name = "DefaultImplementation";
-  const char* loop_name = "TestLoop";
+  const string loop_call_name("_Z21DefaultImplementationv");
+  const string loop_name("_Z8TestLoopi");
   typedef void (*TestLoopFn)(int);
 
   string module_file;
@@ -170,16 +170,11 @@ TEST_F(LlvmCodeGenTest, ReplaceFnCall) {
   ASSERT_OK(LlvmCodeGenTest::CreateFromFile(&pool, module_file.c_str(), &codegen));
   EXPECT_TRUE(codegen.get() != NULL);
 
-  vector<Function*> functions;
-  codegen->GetFunctions(&functions);
-  EXPECT_EQ(functions.size(), 3);
-
-  Function* loop_call = functions[0];
-  Function* loop = functions[1];
-
-  EXPECT_TRUE(loop_call->getName().find(loop_call_name) != string::npos);
+  Function* loop_call = codegen->GetFunction(loop_call_name);
+  EXPECT_TRUE(loop_call != NULL);
   EXPECT_TRUE(loop_call->arg_empty());
-  EXPECT_TRUE(loop->getName().find(loop_name) != string::npos);
+  Function* loop = codegen->GetFunction(loop_name);
+  EXPECT_TRUE(loop != NULL);
   EXPECT_EQ(loop->arg_size(), 1);
 
   // Part 2: Generate a new inner loop function.

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/276376ac/be/src/codegen/llvm-codegen.cc
----------------------------------------------------------------------
diff --git a/be/src/codegen/llvm-codegen.cc b/be/src/codegen/llvm-codegen.cc
index 2bdbbcc..c47038e 100644
--- a/be/src/codegen/llvm-codegen.cc
+++ b/be/src/codegen/llvm-codegen.cc
@@ -28,9 +28,11 @@
 #include <llvm/Bitcode/ReaderWriter.h>
 #include <llvm/ExecutionEngine/ExecutionEngine.h>
 #include <llvm/ExecutionEngine/MCJIT.h>
+#include <llvm/IR/Constants.h>
 #include <llvm/IR/DataLayout.h>
-#include "llvm/IR/Function.h"
-#include "llvm/IR/InstIterator.h"
+#include <llvm/IR/Function.h>
+#include <llvm/IR/GlobalVariable.h>
+#include <llvm/IR/InstIterator.h>
 #include <llvm/IR/LegacyPassManager.h>
 #include <llvm/IR/NoFolder.h>
 #include <llvm/IR/Verifier.h>
@@ -73,7 +75,6 @@ using std::unique_ptr;
 
 DEFINE_bool(print_llvm_ir_instruction_count, false,
     "if true, prints the instruction counts of all JIT'd functions");
-
 DEFINE_bool(disable_optimization_passes, false,
     "if true, disables llvm optimization passes (used for testing)");
 DEFINE_bool(dump_ir, false, "if true, output IR after optimization passes");
@@ -164,15 +165,30 @@ Status LlvmCodeGen::CreateFromFile(ObjectPool* pool,
   return (*codegen)->Init(std::move(loaded_module));
 }
 
-Status LlvmCodeGen::CreateFromMemory(ObjectPool* pool, MemoryBufferRef module_ir,
-    const string& module_name, const string& id, scoped_ptr<LlvmCodeGen>* codegen) {
+Status LlvmCodeGen::CreateFromMemory(ObjectPool* pool, const string& id,
+    scoped_ptr<LlvmCodeGen>* codegen) {
   codegen->reset(new LlvmCodeGen(pool, id));
   SCOPED_TIMER((*codegen)->profile_.total_time_counter());
 
-  unique_ptr<Module> loaded_module;
-  RETURN_IF_ERROR(
-      (*codegen)->LoadModuleFromMemory(module_ir, module_name, &loaded_module));
+  // Select the appropriate IR version. We cannot use LLVM IR with SSE4.2 instructions on
+  // a machine without SSE4.2 support.
+  StringRef module_ir;
+  string module_name;
+  if (CpuInfo::IsSupported(CpuInfo::SSE4_2)) {
+    module_ir = StringRef(reinterpret_cast<const char*>(impala_sse_llvm_ir),
+        impala_sse_llvm_ir_len);
+    module_name = "Impala IR with SSE 4.2 support";
+  } else {
+    module_ir = StringRef(reinterpret_cast<const char*>(impala_no_sse_llvm_ir),
+        impala_no_sse_llvm_ir_len);
+    module_name = "Impala IR with no SSE 4.2 support";
+  }
 
+  unique_ptr<MemoryBuffer> module_ir_buf(
+      MemoryBuffer::getMemBuffer(module_ir, "", false));
+  unique_ptr<Module> loaded_module;
+  RETURN_IF_ERROR((*codegen)->LoadModuleFromMemory(std::move(module_ir_buf),
+      module_name, &loaded_module));
   return (*codegen)->Init(std::move(loaded_module));
 }
 
@@ -192,15 +208,16 @@ Status LlvmCodeGen::LoadModuleFromFile(const string& file, unique_ptr<Module>* m
   }
 
   COUNTER_ADD(module_bitcode_size_, file_buffer->getBufferSize());
-  return LoadModuleFromMemory(file_buffer->getMemBufferRef(), file, module);
+  return LoadModuleFromMemory(std::move(file_buffer), file, module);
 }
 
-Status LlvmCodeGen::LoadModuleFromMemory(MemoryBufferRef module_ir, string module_name,
-    unique_ptr<Module>* module) {
+Status LlvmCodeGen::LoadModuleFromMemory(unique_ptr<MemoryBuffer> module_ir_buf,
+    string module_name, unique_ptr<Module>* module) {
   DCHECK(!module_name.empty());
   SCOPED_TIMER(prepare_module_timer_);
-  ErrorOr<unique_ptr<Module>> tmp_module =
-      parseBitcodeFile(module_ir, context());
+  ErrorOr<unique_ptr<Module>> tmp_module(NULL);
+  COUNTER_ADD(module_bitcode_size_, module_ir_buf->getMemBufferRef().getBufferSize());
+  tmp_module = getLazyBitcodeModule(std::move(module_ir_buf), context(), false);
   if (!tmp_module) {
     stringstream ss;
     ss << "Could not parse module " << module_name << ": " << tmp_module.getError();
@@ -214,7 +231,6 @@ Status LlvmCodeGen::LoadModuleFromMemory(MemoryBufferRef module_ir, string modul
   StripGlobalCtorsDtors((*module).get());
 
   (*module)->setModuleIdentifier(module_name);
-  COUNTER_ADD(module_bitcode_size_, module_ir.getBufferSize());
   return Status::OK();
 }
 
@@ -248,24 +264,7 @@ void LlvmCodeGen::StripGlobalCtorsDtors(llvm::Module* module) {
 
 Status LlvmCodeGen::CreateImpalaCodegen(
     ObjectPool* pool, const string& id, scoped_ptr<LlvmCodeGen>* codegen_ret) {
-  // Select the appropriate IR version.  We cannot use LLVM IR with sse instructions on
-  // a machine without sse support (loading the module will fail regardless of whether
-  // those instructions are run or not).
-  StringRef module_ir;
-  string module_name;
-  if (CpuInfo::IsSupported(CpuInfo::SSE4_2)) {
-    module_ir = StringRef(reinterpret_cast<const char*>(impala_sse_llvm_ir),
-        impala_sse_llvm_ir_len);
-    module_name = "Impala IR with SSE support";
-  } else {
-    module_ir = StringRef(reinterpret_cast<const char*>(impala_no_sse_llvm_ir),
-        impala_no_sse_llvm_ir_len);
-    module_name = "Impala IR with no SSE support";
-  }
-  unique_ptr<MemoryBuffer> module_ir_buf(
-      MemoryBuffer::getMemBuffer(module_ir, "", false));
-  RETURN_IF_ERROR(CreateFromMemory(pool, module_ir_buf->getMemBufferRef(), module_name, id,
-      codegen_ret));
+  RETURN_IF_ERROR(CreateFromMemory(pool, id, codegen_ret));
   LlvmCodeGen* codegen = codegen_ret->get();
 
   // Parse module for cross compiled functions and types
@@ -287,9 +286,12 @@ Status LlvmCodeGen::CreateImpalaCodegen(
     return Status("Could not create llvm struct type for StringVal");
   }
 
-  // Parse functions from module
+  // Fills 'functions' with all the cross-compiled functions that are defined in
+  // the module.
   vector<Function*> functions;
-  codegen->GetFunctions(&functions);
+  for (Function& fn: codegen->module_->functions()) {
+    if (fn.isMaterializable()) functions.push_back(&fn);
+  }
   int parsed_functions = 0;
   for (int i = 0; i < functions.size(); ++i) {
     string fn_name = functions[i]->getName();
@@ -524,18 +526,55 @@ void LlvmCodeGen::CreateIfElseBlocks(Function* fn, const string& if_name,
   *else_block = BasicBlock::Create(context(), else_name, fn, insert_before);
 }
 
-Function* LlvmCodeGen::GetLibCFunction(FnPrototype* prototype) {
-  if (external_functions_.find(prototype->name()) != external_functions_.end()) {
-    return external_functions_[prototype->name()];
+Status LlvmCodeGen::MaterializeFunctionHelper(Function *fn) {
+  DCHECK(!is_compiled_);
+  if (fn->isIntrinsic() || !fn->isMaterializable()) return Status::OK();
+
+  std::error_code err = module_->materialize(fn);
+  if (UNLIKELY(err)) {
+    return Status(Substitute("Failed to materialize $0: $1",
+        fn->getName().str(), err.message()));
   }
-  Function* func = prototype->GeneratePrototype();
-  external_functions_[prototype->name()] = func;
-  return func;
+
+  // Materialized functions are marked as not materializable by LLVM.
+  DCHECK(!fn->isMaterializable());
+  for (inst_iterator iter = inst_begin(fn); iter != inst_end(fn); ++iter) {
+    Instruction* instr = &*iter;
+    Function* called_fn = NULL;
+    if (isa<CallInst>(instr)) {
+      CallInst* call_instr = reinterpret_cast<CallInst*>(instr);
+      called_fn = call_instr->getCalledFunction();
+    } else if (isa<InvokeInst>(instr)) {
+      InvokeInst* invoke_instr = reinterpret_cast<InvokeInst*>(instr);
+      called_fn = invoke_instr->getCalledFunction();
+    }
+    if (called_fn != NULL) MaterializeFunctionHelper(called_fn);
+  }
+  return Status::OK();
 }
 
-Function* LlvmCodeGen::GetFunction(IRFunction::Type function, bool clone) {
-  DCHECK(loaded_functions_[function] != NULL);
-  Function* fn = loaded_functions_[function];
+Status LlvmCodeGen::MaterializeFunction(Function *fn) {
+  SCOPED_TIMER(profile_.total_time_counter());
+  SCOPED_TIMER(prepare_module_timer_);
+  return MaterializeFunctionHelper(fn);
+}
+
+Function* LlvmCodeGen::GetFunction(const string& symbol) {
+  Function* fn = module_->getFunction(symbol.c_str());
+  if (fn == NULL) {
+    LOG(ERROR) << "Unable to locate function " << symbol;
+    return NULL;
+  }
+  Status status = MaterializeFunction(fn);
+  if (UNLIKELY(!status.ok())) return NULL;
+  return fn;
+}
+
+Function* LlvmCodeGen::GetFunction(IRFunction::Type ir_type, bool clone) {
+  DCHECK(loaded_functions_[ir_type] != NULL);
+  Function* fn = loaded_functions_[ir_type];
+  Status status = MaterializeFunction(fn);
+  if (UNLIKELY(!status.ok())) return NULL;
   if (clone) return CloneFunction(fn);
   return fn;
 }
@@ -590,14 +629,13 @@ bool LlvmCodeGen::VerifyFunction(Function* fn) {
   return true;
 }
 
-LlvmCodeGen::FnPrototype::FnPrototype(
-    LlvmCodeGen* gen, const string& name, Type* ret_type) :
-  codegen_(gen), name_(name), ret_type_(ret_type) {
+LlvmCodeGen::FnPrototype::FnPrototype(LlvmCodeGen* codegen, const string& name,
+    Type* ret_type) : codegen_(codegen), name_(name), ret_type_(ret_type) {
   DCHECK(!codegen_->is_compiled_) << "Not valid to add additional functions";
 }
 
 Function* LlvmCodeGen::FnPrototype::GeneratePrototype(
-      LlvmBuilder* builder, Value** params) {
+    LlvmBuilder* builder, Value** params, bool print_ir) {
   vector<Type*> arguments;
   for (int i = 0; i < args_.size(); ++i) {
     arguments.push_back(args_[i].type);
@@ -605,7 +643,7 @@ Function* LlvmCodeGen::FnPrototype::GeneratePrototype(
   FunctionType* prototype = FunctionType::get(ret_type_, arguments, false);
 
   Function* fn = Function::Create(
-      prototype, Function::ExternalLinkage, name_, codegen_->module_);
+      prototype, GlobalValue::ExternalLinkage, name_, codegen_->module_);
   DCHECK(fn != NULL);
 
   // Name the arguments
@@ -621,7 +659,7 @@ Function* LlvmCodeGen::FnPrototype::GeneratePrototype(
     builder->SetInsertPoint(entry_block);
   }
 
-  codegen_->codegend_functions_.push_back(fn);
+  if (print_ir) codegen_->codegend_functions_.push_back(fn);
   return fn;
 }
 
@@ -686,6 +724,9 @@ void LlvmCodeGen::FindCallSites(Function* caller, const string& target_name,
 Function* LlvmCodeGen::CloneFunction(Function* fn) {
   DCHECK(!is_compiled_);
   ValueToValueMapTy dummy_vmap;
+  // Verifies that 'fn' has been materialized already. Callers are expected to use
+  // GetFunction() to obtain the Function object.
+  DCHECK(!fn->isMaterializable());
   // CloneFunction() automatically gives the new function a unique name
   Function* fn_clone = llvm::CloneFunction(fn, dummy_vmap, false);
   fn_clone->copyAttributesFrom(fn);
@@ -694,7 +735,9 @@ Function* LlvmCodeGen::CloneFunction(Function* fn) {
 }
 
 Function* LlvmCodeGen::FinalizeFunction(Function* function) {
-  function->addFnAttr(llvm::Attribute::AlwaysInline);
+  if (LIKELY(!function->hasFnAttribute(llvm::Attribute::NoInline))) {
+    function->addFnAttr(llvm::Attribute::AlwaysInline);
+  }
 
   if (!VerifyFunction(function)) {
     function->eraseFromParent(); // deletes function
@@ -704,6 +747,42 @@ Function* LlvmCodeGen::FinalizeFunction(Function* function) {
   return function;
 }
 
+Status LlvmCodeGen::MaterializeModule(Module* module) {
+  std::error_code err = module->materializeAll();
+  if (UNLIKELY(err)) {
+    stringstream err_msg;
+    err_msg << "Failed to complete materialization of module " << module->getName().str()
+        << ": " << err.message();
+    return Status(err_msg.str());
+  }
+  return Status::OK();
+}
+
+// It's okay to call this function even if the module has been materialized.
+Status LlvmCodeGen::FinalizeLazyMaterialization() {
+  SCOPED_TIMER(prepare_module_timer_);
+  for (Function& fn: module_->functions()) {
+    if (fn.isMaterializable()) {
+      DCHECK(!module_->isMaterialized());
+      // Unmaterialized functions can still have their declarations around. LLVM asserts
+      // these unmaterialized functions' linkage types are external / external weak.
+      fn.setLinkage(Function::ExternalLinkage);
+      // DCE may claim the personality function is still referenced by unmaterialized
+      // functions when it is deleted by DCE. Similarly, LLVM may complain if comdats
+      // reference unmaterialized functions but their definition cannot be found.
+      // Since the unmaterialized functions are not used anyway, just remove their
+      // personality functions and comdats.
+      fn.setPersonalityFn(NULL);
+      fn.setComdat(NULL);
+      fn.setIsMaterializable(false);
+    }
+  }
+  // All unused functions are now not materializable so it should be quick to call
+  // materializeAll(). We need to call this function in order to destroy the
+  // materializer so that DCE will not assert fail.
+  return MaterializeModule(module_);
+}
+
 Status LlvmCodeGen::FinalizeModule() {
   DCHECK(!is_compiled_);
   is_compiled_ = true;
@@ -726,6 +805,7 @@ Status LlvmCodeGen::FinalizeModule() {
   // if the codegen object is created but no functions are successfully codegen'd.
   if (fns_to_jit_compile_.empty()) return Status::OK();
 
+  RETURN_IF_ERROR(FinalizeLazyMaterialization());
   if (optimizations_enabled_ && !FLAGS_disable_optimization_passes) OptimizeModule();
 
   if (FLAGS_opt_module_dir.size() != 0) {
@@ -874,7 +954,7 @@ void LlvmCodeGen::CodegenDebugTrace(LlvmBuilder* builder, const char* str,
   debug_strings_.push_back(Substitute("LLVM Trace: $0", str));
   str = debug_strings_.back().c_str();
 
-  Function* printf = module()->getFunction("printf");
+  Function* printf = module_->getFunction("printf");
   DCHECK(printf != NULL);
 
   // Call printf by turning 'str' into a constant ptr value
@@ -886,15 +966,9 @@ void LlvmCodeGen::CodegenDebugTrace(LlvmBuilder* builder, const char* str,
   builder->CreateCall(printf, calling_args);
 }
 
-void LlvmCodeGen::GetFunctions(vector<Function*>* functions) {
-  for (Function& fn: module_->functions()) {
-    if (!fn.empty()) functions->push_back(&fn);
-  }
-}
-
 void LlvmCodeGen::GetSymbols(unordered_set<string>* symbols) {
   for (const Function& fn: module_->functions()) {
-    if (!fn.empty()) symbols->insert(fn.getName());
+    if (fn.isMaterializable()) symbols->insert(fn.getName());
   }
 }
 
@@ -982,7 +1056,7 @@ Status LlvmCodeGen::LoadIntrinsics() {
   // Load memcpy
   {
     Type* types[] = { ptr_type(), ptr_type(), GetType(TYPE_INT) };
-    Function* fn = Intrinsic::getDeclaration(module(), Intrinsic::memcpy, types);
+    Function* fn = Intrinsic::getDeclaration(module_, Intrinsic::memcpy, types);
     if (fn == NULL) {
       return Status("Could not find memcpy intrinsic.");
     }
@@ -1004,7 +1078,7 @@ Status LlvmCodeGen::LoadIntrinsics() {
 
   for (int i = 0; i < num_intrinsics; ++i) {
     Intrinsic::ID id = non_overloaded_intrinsics[i].id;
-    Function* fn = Intrinsic::getDeclaration(module(), id);
+    Function* fn = Intrinsic::getDeclaration(module_, id);
     if (fn == NULL) {
       stringstream ss;
       ss << "Could not find " << non_overloaded_intrinsics[i].error << " intrinsic";

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/276376ac/be/src/codegen/llvm-codegen.h
----------------------------------------------------------------------
diff --git a/be/src/codegen/llvm-codegen.h b/be/src/codegen/llvm-codegen.h
index e84ea74..adfcc90 100644
--- a/be/src/codegen/llvm-codegen.h
+++ b/be/src/codegen/llvm-codegen.h
@@ -103,10 +103,19 @@ class SubExprElimination;
 /// functions from across modules.
 //
 /// LLVM has a nontrivial memory management scheme and objects will take
-/// ownership of others.  The document is pretty good about being explicit with this
+/// ownership of others. The document is pretty good about being explicit with this
 /// but it is not very intuitive.
 /// TODO: look into diagnostic output and debuggability
 /// TODO: confirm that the multi-threaded usage is correct
+//
+/// llvm::Function objects in the module are materialized lazily to save the cost of
+/// parsing IR of functions which are dead code. An unmaterialized function is similar
+/// to a function declaration which only contains the function signature and needs to
+/// be materialized before optimization and compilation happen if it's not dead code.
+/// Materializing a function means parsing the bitcode to populate the basic blocks and
+/// instructions attached to the function object. Functions reachable by the function
+/// are also materialized recursively.
+//
 class LlvmCodeGen {
  public:
   /// This function must be called once per process before any llvm API calls are
@@ -164,7 +173,7 @@ class LlvmCodeGen {
    public:
     /// Create a function prototype object, specifying the name of the function and
     /// the return type.
-    FnPrototype(LlvmCodeGen*, const std::string& name, llvm::Type* ret_type);
+    FnPrototype(LlvmCodeGen* codegen, const std::string& name, llvm::Type* ret_type);
 
     /// Returns name of function
     const std::string& name() const { return name_; }
@@ -178,13 +187,18 @@ class LlvmCodeGen {
     }
 
     /// Generate LLVM function prototype.
-    /// If a non-null builder is passed, this function will also create the entry block
+    /// If a non-null 'builder' is passed, this function will also create the entry block
     /// and set the builder's insert point to there.
-    /// If params is non-null, this function will also return the arguments
-    /// values (params[0] is the first arg, etc).
-    /// In that case, params should be preallocated to be number of arguments
+    ///
+    /// If 'params' is non-null, this function will also return the arguments values
+    /// (params[0] is the first arg, etc). In that case, 'params' should be preallocated
+    /// to be number of arguments
+    ///
+    /// If 'print_ir' is true, the generated llvm::Function's IR will be printed when
+    /// GetIR() is called. Avoid doing so for IR function prototypes generated for
+    /// externally defined native function.
     llvm::Function* GeneratePrototype(LlvmBuilder* builder = NULL,
-        llvm::Value** params = NULL);
+        llvm::Value** params = NULL, bool print_ir = true);
 
    private:
     friend class LlvmCodeGen;
@@ -227,9 +241,6 @@ class LlvmCodeGen {
   /// Returns execution engine interface
   llvm::ExecutionEngine* execution_engine() { return execution_engine_.get(); }
 
-  /// Returns the underlying llvm module
-  llvm::Module* module() { return module_; }
-
   /// Register a expr function with unique id.  It can be subsequently retrieved via
   /// GetRegisteredExprFn with that id.
   void RegisterExprFn(int64_t id, llvm::Function* function) {
@@ -307,17 +318,23 @@ class LlvmCodeGen {
     return str;
   }
 
-  /// Returns the libc function, adding it to the module if it has not already been.
-  llvm::Function* GetLibCFunction(FnPrototype* prototype);
-
-  /// Returns the cross compiled function. IRFunction::Type is an enum which is generated
-  /// by gen_ir_descriptions.py.
+  /// Returns the cross compiled function. 'ir_type' is an enum which is generated
+  /// by gen_ir_descriptions.py. The returned function and its callee will be materialized
+  /// recursively. Returns NULL if there is any error.
   ///
   /// If 'clone' is true, a clone of the function will be returned. Clones should be used
   /// iff the caller will modify the returned function. This avoids clobbering the
   /// function in case other users need it, but we don't clone if we can avoid it to
   /// reduce compilation time.
-  llvm::Function* GetFunction(IRFunction::Type, bool clone);
+  ///
+  /// TODO: Return Status instead.
+  llvm::Function* GetFunction(IRFunction::Type ir_type, bool clone);
+
+  /// Return the function with the symbol name 'symbol' from the module. The returned
+  /// function and its callee will be recursively materialized. The returned function
+  /// isn't cloned. Returns NULL if there is any error.
+  /// TODO: Return Status instead.
+  llvm::Function* GetFunction(const string& symbol);
 
   /// Returns the hash function with signature:
   ///   int32_t Hash(int8_t* data, int len, int32_t seed);
@@ -379,10 +396,6 @@ class LlvmCodeGen {
   llvm::Type* void_type() { return void_type_; }
   llvm::Type* i128_type() { return llvm::Type::getIntNTy(context(), 128); }
 
-  /// Fills 'functions' with all the functions that are defined in the module.
-  /// Note: this does not include functions that are just declared
-  void GetFunctions(std::vector<llvm::Function*>* functions);
-
   /// Fils in 'symbols' with all the symbols in the module.
   void GetSymbols(boost::unordered_set<std::string>* symbols);
 
@@ -425,21 +438,26 @@ class LlvmCodeGen {
   /// Initializes the jitter and execution engine with the given module.
   Status Init(std::unique_ptr<llvm::Module> module);
 
-  /// Creates a LlvmCodeGen instance initialized with the module bitcode from 'module_ir'.
-  /// 'codegen' will contain the created object on success.
-  static Status CreateFromMemory(ObjectPool* pool, llvm::MemoryBufferRef module_ir,
-      const std::string& module_name, const std::string& id,
+  /// Creates a LlvmCodeGen instance initialized with the module bitcode in memory.
+  /// 'codegen' will contain the created object on success. Note that the functions
+  /// are not materialized. Getting a reference to the function via GetFunction()
+  /// will materialize the function and its callees recursively.
+  static Status CreateFromMemory(ObjectPool* pool, const std::string& id,
       boost::scoped_ptr<LlvmCodeGen>* codegen);
 
-  /// Loads an LLVM module. 'file' should be the local path to the LLVM bitcode
-  /// file. The caller is responsible for cleaning up module.
+  /// Loads an LLVM module from 'file' which is the local path to the LLVM bitcode file.
+  /// The functions in the module are not materialized. Getting a reference to the
+  /// function via GetFunction() will materialize the function and its callees
+  /// recursively. The caller is responsible for cleaning up the module.
   Status LoadModuleFromFile(const string& file, std::unique_ptr<llvm::Module>* module);
 
-  /// Loads an LLVM module. 'module_ir' should be a reference to a memory buffer containing
-  /// LLVM bitcode. module_name is the name of the module to use when reporting errors.
-  /// The caller is responsible for cleaning up module.
-  Status LoadModuleFromMemory(llvm::MemoryBufferRef module_ir, std::string module_name,
-      std::unique_ptr<llvm::Module>* module);
+  /// Loads an LLVM module. 'module_ir_buf' is the memory buffer containing LLVM bitcode.
+  /// 'module_name' is the name of the module to use when reporting errors.
+  /// The caller is responsible for cleaning up 'module'. The functions in the module
+  /// aren't materialized. Getting a reference to the function via GetFunction() will
+  /// materialize the function and its callees recursively.
+  Status LoadModuleFromMemory(std::unique_ptr<llvm::MemoryBuffer> module_ir_buf,
+      std::string module_name, std::unique_ptr<llvm::Module>* module);
 
   /// Strip global constructors and destructors from an LLVM module. We never run them
   /// anyway (they must be explicitly invoked) so it is dead code.
@@ -487,6 +505,32 @@ class LlvmCodeGen {
   static std::string cpu_name_;
   static std::vector<std::string> cpu_attrs_;
 
+  /// This is the workhorse for materializing function 'fn'. It's invoked by
+  /// MaterializeFunction(). Calls LLVM to materialize 'fn' if it's materializable
+  /// (i.e. the function has a definition in the module and it's not materialized yet).
+  /// This function parses the bitcode of 'fn' to populate basic blocks, instructions
+  /// and other data structures attached to the function object. Return error status
+  /// for any error.
+  Status MaterializeFunctionHelper(llvm::Function* fn);
+
+  /// Entry point for materializing function 'fn'. Invokes MaterializeFunctionHelper()
+  /// to do the actual work. Return error status for any error.
+  Status MaterializeFunction(llvm::Function* fn);
+
+  /// Materialize the given module by materializing all its unmaterialized functions
+  /// and deleting the module's materializer. Returns error status for any error.
+  Status MaterializeModule(llvm::Module* module);
+
+  /// With lazy materialization, functions which haven't been materialized when the module
+  /// is finalized must be dead code or referenced only by global variables (e.g. boost
+  /// library functions or virtual function (e.g. IfExpr::GetBooleanVal())), in which case
+  /// the function is not inlined so the native version can be used and the IR version is
+  /// dead code. Mark them as not materializable, change their linkage types to external
+  /// (so linking will happen to the native version) and strip their personality functions
+  /// and comdats. DCE may complain if the above are not done. Return error status if
+  /// there is error in materializing the module.
+  Status FinalizeLazyMaterialization();
+
   /// ID used for debugging (can be e.g. the fragment instance ID)
   std::string id_;
 

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/276376ac/be/src/exec/partitioned-aggregation-node.cc
----------------------------------------------------------------------
diff --git a/be/src/exec/partitioned-aggregation-node.cc b/be/src/exec/partitioned-aggregation-node.cc
index 33ddfc4..6be56f2 100644
--- a/be/src/exec/partitioned-aggregation-node.cc
+++ b/be/src/exec/partitioned-aggregation-node.cc
@@ -1588,7 +1588,7 @@ Status PartitionedAggregationNode::CodegenUpdateSlot(
       const string& symbol = evaluator->is_merge() ?
                              evaluator->merge_symbol() : evaluator->update_symbol();
       const ColumnType& dst_type = evaluator->intermediate_type();
-      Function* ir_fn = codegen->module()->getFunction(symbol);
+      Function* ir_fn = codegen->GetFunction(symbol);
       DCHECK(ir_fn != NULL);
 
       // Clone and replace constants.

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/276376ac/be/src/exprs/expr-codegen-test.cc
----------------------------------------------------------------------
diff --git a/be/src/exprs/expr-codegen-test.cc b/be/src/exprs/expr-codegen-test.cc
index c660bfd..3e5a2af 100644
--- a/be/src/exprs/expr-codegen-test.cc
+++ b/be/src/exprs/expr-codegen-test.cc
@@ -249,7 +249,7 @@ TEST_F(ExprCodegenTest, TestInlineConstants) {
   test_udf_file << getenv("IMPALA_HOME") << "/be/build/latest/exprs/expr-codegen-test.ll";
   scoped_ptr<LlvmCodeGen> codegen;
   ASSERT_OK(LlvmCodeGen::CreateFromFile(&pool, test_udf_file.str(), "test", &codegen));
-  Function* fn = codegen->module()->getFunction(TEST_GET_CONSTANT_SYMBOL);
+  Function* fn = codegen->GetFunction(TEST_GET_CONSTANT_SYMBOL);
   ASSERT_TRUE(fn != NULL);
 
   // Function verification should fail because we haven't inlined GetConstant() calls

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/276376ac/be/src/exprs/like-predicate.cc
----------------------------------------------------------------------
diff --git a/be/src/exprs/like-predicate.cc b/be/src/exprs/like-predicate.cc
index fd73089..d008925 100644
--- a/be/src/exprs/like-predicate.cc
+++ b/be/src/exprs/like-predicate.cc
@@ -45,13 +45,6 @@ static const RE2 STARTS_WITH_RE(
 // A regex to match any regex pattern which is equivalent to a constant string match.
 static const RE2 EQUALS_RE("\\^([^\\.\\^\\{\\[\\(\\|\\)\\]\\}\\+\\*\\?\\$\\\\]*)\\$");
 
-LikePredicate::LikePredicate(const TExprNode& node)
-  : Predicate(node) {
-}
-
-LikePredicate::~LikePredicate() {
-}
-
 void LikePredicate::LikePrepare(FunctionContext* context,
     FunctionContext::FunctionStateScope scope) {
   LikePrepareInternal(context, scope, true);

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/276376ac/be/src/exprs/like-predicate.h
----------------------------------------------------------------------
diff --git a/be/src/exprs/like-predicate.h b/be/src/exprs/like-predicate.h
index 90c78ec..91e16cd 100644
--- a/be/src/exprs/like-predicate.h
+++ b/be/src/exprs/like-predicate.h
@@ -34,11 +34,13 @@ namespace impala {
 /// This class handles the Like, Regexp, and Rlike predicates and uses the udf interface.
 class LikePredicate: public Predicate {
  public:
-  ~LikePredicate();
+  ~LikePredicate() { }
 
  protected:
   friend class Expr;
-  LikePredicate(const TExprNode& node);
+
+  LikePredicate(const TExprNode& node)
+      : Predicate(node) { }
 
  private:
   typedef impala_udf::BooleanVal (*LikePredicateFunction) (impala_udf::FunctionContext*,

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/276376ac/be/src/exprs/scalar-fn-call.cc
----------------------------------------------------------------------
diff --git a/be/src/exprs/scalar-fn-call.cc b/be/src/exprs/scalar-fn-call.cc
index 9a35ba1..0dfd240 100644
--- a/be/src/exprs/scalar-fn-call.cc
+++ b/be/src/exprs/scalar-fn-call.cc
@@ -423,54 +423,61 @@ Status ScalarFnCall::GetUdf(RuntimeState* state, llvm::Function** udf) {
     if (!status.ok() && fn_.binary_type == TFunctionBinaryType::BUILTIN) {
       // Builtins symbols should exist unless there is a version mismatch.
       status.AddDetail(ErrorMsg(TErrorCode::MISSING_BUILTIN,
-              fn_.name.function_name, fn_.scalar_fn.symbol).msg());
+          fn_.name.function_name, fn_.scalar_fn.symbol).msg());
     }
     RETURN_IF_ERROR(status);
     DCHECK(fn_ptr != NULL);
 
-    // Convert UDF function pointer to llvm::Function*
-    // First generate the llvm::FunctionType* corresponding to the UDF.
-    llvm::Type* return_type = CodegenAnyVal::GetLoweredType(codegen, type());
-    vector<llvm::Type*> arg_types;
+    // Per the x64 ABI, DecimalVals are returned via a DecimalVal* output argument.
+    // So, the return type is void.
+    bool is_decimal = type().type == TYPE_DECIMAL;
+    llvm::Type* return_type = is_decimal ? codegen->void_type() :
+        CodegenAnyVal::GetLoweredType(codegen, type());
 
-    if (type().type == TYPE_DECIMAL) {
+    // Convert UDF function pointer to llvm::Function*. Start by creating a function
+    // prototype for it.
+    LlvmCodeGen::FnPrototype prototype(codegen, fn_.scalar_fn.symbol, return_type);
+
+    if (is_decimal) {
       // Per the x64 ABI, DecimalVals are returned via a DecimalVal* output argument
-      return_type = codegen->void_type();
-      arg_types.push_back(
-          codegen->GetPtrType(CodegenAnyVal::GetUnloweredType(codegen, type())));
+      llvm::Type* output_type =
+          codegen->GetPtrType(CodegenAnyVal::GetUnloweredType(codegen, type()));
+      prototype.AddArgument("output", output_type);
     }
 
-    arg_types.push_back(codegen->GetPtrType("class.impala_udf::FunctionContext"));
+    // The "FunctionContext*" argument.
+    prototype.AddArgument("ctx",
+        codegen->GetPtrType("class.impala_udf::FunctionContext"));
+
+    // The "fixed" arguments for the UDF function.
     for (int i = 0; i < NumFixedArgs(); ++i) {
+      stringstream arg_name;
+      arg_name << "fixed_arg_" << i;
       llvm::Type* arg_type = codegen->GetPtrType(
           CodegenAnyVal::GetUnloweredType(codegen, children_[i]->type()));
-      arg_types.push_back(arg_type);
+      prototype.AddArgument(arg_name.str(), arg_type);
     }
-
+    // The varargs for the UDF function, if there are any.
     if (vararg_start_idx_ >= 0) {
       llvm::Type* vararg_type = CodegenAnyVal::GetUnloweredPtrType(
           codegen, children_[vararg_start_idx_]->type());
-      arg_types.push_back(codegen->GetType(TYPE_INT));
-      arg_types.push_back(vararg_type);
+      prototype.AddArgument("num_var_arg", codegen->GetType(TYPE_INT));
+      prototype.AddArgument("var_arg", vararg_type);
     }
-    llvm::FunctionType* udf_type = llvm::FunctionType::get(return_type, arg_types, false);
 
     // Create a llvm::Function* with the generated type. This is only a function
     // declaration, not a definition, since we do not create any basic blocks or
     // instructions in it.
-    *udf = llvm::Function::Create(
-        udf_type, llvm::GlobalValue::ExternalLinkage,
-        fn_.scalar_fn.symbol, codegen->module());
+    *udf = prototype.GeneratePrototype(NULL, NULL, false);
 
-    // Associate the dynamically loaded function pointer with the Function* we
-    // defined. This tells LLVM where the compiled function definition is located in
-    // memory.
+    // Associate the dynamically loaded function pointer with the Function* we defined.
+    // This tells LLVM where the compiled function definition is located in memory.
     codegen->execution_engine()->addGlobalMapping(*udf, fn_ptr);
   } else if (fn_.binary_type == TFunctionBinaryType::BUILTIN) {
     // In this path, we're running a builtin with the UDF interface. The IR is
     // in the llvm module.
     DCHECK(state->codegen_enabled());
-    *udf = codegen->module()->getFunction(fn_.scalar_fn.symbol);
+    *udf = codegen->GetFunction(fn_.scalar_fn.symbol);
     if (*udf == NULL) {
       // Builtins symbols should exist unless there is a version mismatch.
       stringstream ss;
@@ -490,7 +497,7 @@ Status ScalarFnCall::GetUdf(RuntimeState* state, llvm::Function** udf) {
   } else {
     // We're running an IR UDF.
     DCHECK_EQ(fn_.binary_type, TFunctionBinaryType::IR);
-    *udf = codegen->module()->getFunction(fn_.scalar_fn.symbol);
+    *udf = codegen->GetFunction(fn_.scalar_fn.symbol);
     if (*udf == NULL) {
       stringstream ss;
       ss << "Unable to locate function " << fn_.scalar_fn.symbol
@@ -515,7 +522,7 @@ Status ScalarFnCall::GetFunction(RuntimeState* state, const string& symbol, void
     DCHECK_EQ(fn_.binary_type, TFunctionBinaryType::IR);
     LlvmCodeGen* codegen;
     RETURN_IF_ERROR(state->GetCodegen(&codegen));
-    llvm::Function* ir_fn = codegen->module()->getFunction(symbol);
+    llvm::Function* ir_fn = codegen->GetFunction(symbol);
     if (ir_fn == NULL) {
       stringstream ss;
       ss << "Unable to locate function " << symbol

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/276376ac/be/src/runtime/buffered-tuple-stream.cc
----------------------------------------------------------------------
diff --git a/be/src/runtime/buffered-tuple-stream.cc b/be/src/runtime/buffered-tuple-stream.cc
index 96e2d0c..050134c 100644
--- a/be/src/runtime/buffered-tuple-stream.cc
+++ b/be/src/runtime/buffered-tuple-stream.cc
@@ -855,3 +855,33 @@ bool BufferedTupleStream::CopyCollections(const Tuple* tuple,
   }
   return true;
 }
+
+void BufferedTupleStream::GetTupleRow(const RowIdx& idx, TupleRow* row) const {
+  DCHECK(row != NULL);
+  DCHECK(!closed_);
+  DCHECK(is_pinned());
+  DCHECK(!delete_on_read_);
+  DCHECK_EQ(blocks_.size(), block_start_idx_.size());
+  DCHECK_LT(idx.block(), blocks_.size());
+
+  uint8_t* data = block_start_idx_[idx.block()] + idx.offset();
+  if (has_nullable_tuple_) {
+    // Stitch together the tuples from the block and the NULL ones.
+    const int tuples_per_row = desc_.tuple_descriptors().size();
+    uint32_t tuple_idx = idx.idx() * tuples_per_row;
+    for (int i = 0; i < tuples_per_row; ++i) {
+      const uint8_t* null_word = block_start_idx_[idx.block()] + (tuple_idx >> 3);
+      const uint32_t null_pos = tuple_idx & 7;
+      const bool is_not_null = ((*null_word & (1 << (7 - null_pos))) == 0);
+      row->SetTuple(i, reinterpret_cast<Tuple*>(
+          reinterpret_cast<uint64_t>(data) * is_not_null));
+      data += desc_.tuple_descriptors()[i]->byte_size() * is_not_null;
+      ++tuple_idx;
+    }
+  } else {
+    for (int i = 0; i < desc_.tuple_descriptors().size(); ++i) {
+      row->SetTuple(i, reinterpret_cast<Tuple*>(data));
+      data += desc_.tuple_descriptors()[i]->byte_size();
+    }
+  }
+}
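
The nullable-tuple branch of GetTupleRow() above reads one NULL-indicator bit per
tuple (most significant bit first within each byte) and uses the resulting bool as a
multiplier instead of branching. The following is a minimal standalone sketch of that
technique. IsNotNull(), SelectOrNull() and StitchRow() are hypothetical helpers, and
the indicator bytes and tuple data are passed as separate pointers for simplicity; it
also assumes that converting the integer 0 to a pointer yields a null pointer, which
holds on mainstream platforms.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Returns true if the tuple at 'tuple_idx' is non-NULL. A set bit means NULL;
// bits are stored most-significant-bit first within each indicator byte.
inline bool IsNotNull(const uint8_t* null_indicators, uint32_t tuple_idx) {
  const uint8_t* null_word = null_indicators + (tuple_idx >> 3);  // Byte index.
  const uint32_t null_pos = tuple_idx & 7;                        // Bit in byte.
  return (*null_word & (1 << (7 - null_pos))) == 0;
}

// Branchless select: yields 'data' if 'is_not_null' is true, otherwise a null
// pointer, by multiplying the pointer's integer value by 0 or 1.
inline const uint8_t* SelectOrNull(const uint8_t* data, bool is_not_null) {
  return reinterpret_cast<const uint8_t*>(
      reinterpret_cast<uintptr_t>(data) * static_cast<uintptr_t>(is_not_null));
}

// Builds the tuple pointers for row 'row_idx'. NULL tuples occupy no space in
// 'data', so the cursor only advances past tuples that are present.
std::vector<const uint8_t*> StitchRow(const uint8_t* null_indicators,
    const uint8_t* data, uint32_t row_idx, const std::vector<int>& tuple_sizes) {
  std::vector<const uint8_t*> row;
  uint32_t tuple_idx = row_idx * static_cast<uint32_t>(tuple_sizes.size());
  for (int size : tuple_sizes) {
    const bool is_not_null = IsNotNull(null_indicators, tuple_idx);
    row.push_back(SelectOrNull(data, is_not_null));
    data += size * is_not_null;
    ++tuple_idx;
  }
  return row;
}
```

For example, with the single indicator byte 0x40 and tuple sizes {8, 4, 4}, the second
tuple of row 0 is NULL and the third tuple starts 8 bytes into the data.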

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/276376ac/be/src/runtime/buffered-tuple-stream.inline.h
----------------------------------------------------------------------
diff --git a/be/src/runtime/buffered-tuple-stream.inline.h b/be/src/runtime/buffered-tuple-stream.inline.h
index 3a170c4..d7c72cd 100644
--- a/be/src/runtime/buffered-tuple-stream.inline.h
+++ b/be/src/runtime/buffered-tuple-stream.inline.h
@@ -55,36 +55,6 @@ inline uint8_t* BufferedTupleStream::AllocateRow(int fixed_size, int varlen_size
   return fixed_data;
 }
 
-inline void BufferedTupleStream::GetTupleRow(const RowIdx& idx, TupleRow* row) const {
-  DCHECK(row != NULL);
-  DCHECK(!closed_);
-  DCHECK(is_pinned());
-  DCHECK(!delete_on_read_);
-  DCHECK_EQ(blocks_.size(), block_start_idx_.size());
-  DCHECK_LT(idx.block(), blocks_.size());
-
-  uint8_t* data = block_start_idx_[idx.block()] + idx.offset();
-  if (has_nullable_tuple_) {
-    // Stitch together the tuples from the block and the NULL ones.
-    const int tuples_per_row = desc_.tuple_descriptors().size();
-    uint32_t tuple_idx = idx.idx() * tuples_per_row;
-    for (int i = 0; i < tuples_per_row; ++i) {
-      const uint8_t* null_word = block_start_idx_[idx.block()] + (tuple_idx >> 3);
-      const uint32_t null_pos = tuple_idx & 7;
-      const bool is_not_null = ((*null_word & (1 << (7 - null_pos))) == 0);
-      row->SetTuple(i, reinterpret_cast<Tuple*>(
-          reinterpret_cast<uint64_t>(data) * is_not_null));
-      data += desc_.tuple_descriptors()[i]->byte_size() * is_not_null;
-      ++tuple_idx;
-    }
-  } else {
-    for (int i = 0; i < desc_.tuple_descriptors().size(); ++i) {
-      row->SetTuple(i, reinterpret_cast<Tuple*>(data));
-      data += desc_.tuple_descriptors()[i]->byte_size();
-    }
-  }
-}
-
 }
 
 #endif

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/276376ac/be/src/testutil/test-udfs.cc
----------------------------------------------------------------------
diff --git a/be/src/testutil/test-udfs.cc b/be/src/testutil/test-udfs.cc
index cf2a5e6..fc57b21 100644
--- a/be/src/testutil/test-udfs.cc
+++ b/be/src/testutil/test-udfs.cc
@@ -123,7 +123,7 @@ DecimalVal VarSum(FunctionContext* context, int n, const DecimalVal* args) {
   return DecimalVal(result);
 }
 
-DoubleVal VarSumMultiply(FunctionContext* context,
+DoubleVal __attribute__((noinline)) VarSumMultiply(FunctionContext* context,
     const DoubleVal& d, int n, const IntVal* args) {
   if (d.is_null) return DoubleVal::null();
 
@@ -138,6 +138,23 @@ DoubleVal VarSumMultiply(FunctionContext* context,
   return DoubleVal(result * d.val);
 }
 
+// Call the non-inlined function in the same module to make sure linking works correctly.
+DoubleVal VarSumMultiply2(FunctionContext* context,
+    const DoubleVal& d, int n, const IntVal* args) {
+  return VarSumMultiply(context, d, n, args);
+}
+
+// Call a function defined in Impalad proper to make sure linking works correctly.
+extern "C" StringVal
+    _ZN6impala15StringFunctions5LowerEPN10impala_udf15FunctionContextERKNS1_9StringValE(
+        FunctionContext* context, const StringVal& str);
+
+StringVal ToLower(FunctionContext* context, const StringVal& str) {
+  return
+      _ZN6impala15StringFunctions5LowerEPN10impala_udf15FunctionContextERKNS1_9StringValE(
+          context, str);
+}
+
 BooleanVal TestError(FunctionContext* context) {
   context->SetError("test UDF error");
   context->SetError("this shouldn't show up");

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/276376ac/be/src/util/symbols-util-test.cc
----------------------------------------------------------------------
diff --git a/be/src/util/symbols-util-test.cc b/be/src/util/symbols-util-test.cc
index 9a24fd1..ffe6bf0 100644
--- a/be/src/util/symbols-util-test.cc
+++ b/be/src/util/symbols-util-test.cc
@@ -18,6 +18,7 @@
 #include <gtest/gtest.h>
 
 #include "codegen/llvm-codegen.h"
+#include "util/cpu-info.h"
 #include "util/symbols-util.h"
 
 #include "common/names.h"
@@ -324,6 +325,7 @@ TEST(SymbolsUtil, ManglingPrepareOrClose) {
 
 int main(int argc, char **argv) {
   ::testing::InitGoogleTest(&argc, argv);
+  impala::CpuInfo::Init();
   impala::LlvmCodeGen::InitializeLlvm();
   return RUN_ALL_TESTS();
 }

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/276376ac/testdata/workloads/functional-query/queries/QueryTest/udf.test
----------------------------------------------------------------------
diff --git a/testdata/workloads/functional-query/queries/QueryTest/udf.test b/testdata/workloads/functional-query/queries/QueryTest/udf.test
index 3c930a8..645c321 100644
--- a/testdata/workloads/functional-query/queries/QueryTest/udf.test
+++ b/testdata/workloads/functional-query/queries/QueryTest/udf.test
@@ -388,6 +388,20 @@ double
 105
 ====
 ---- QUERY
+select var_sum_multiply2(5.0, 1, 2, 3, 4, 5, 6)
+---- TYPES
+double
+---- RESULTS
+105
+====
+---- QUERY
+select to_lower("HELLO")
+---- TYPES
+string
+---- RESULTS
+'hello'
+====
+---- QUERY
 select tinyint_col, int_col, var_sum_multiply(2, tinyint_col, int_col)
 from functional.alltypestiny
 ---- TYPES

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/276376ac/tests/query_test/test_udfs.py
----------------------------------------------------------------------
diff --git a/tests/query_test/test_udfs.py b/tests/query_test/test_udfs.py
index e64a9e0..4013ab8 100644
--- a/tests/query_test/test_udfs.py
+++ b/tests/query_test/test_udfs.py
@@ -341,6 +341,8 @@ drop function if exists {database}.var_sum(double...);
 drop function if exists {database}.var_sum(string...);
 drop function if exists {database}.var_sum(decimal(4,2)...);
 drop function if exists {database}.var_sum_multiply(double, int...);
+drop function if exists {database}.var_sum_multiply2(double, int...);
+drop function if exists {database}.to_lower(string);
 drop function if exists {database}.constant_timestamp();
 drop function if exists {database}.validate_arg_type(string);
 drop function if exists {database}.count_rows();
@@ -426,6 +428,14 @@ create function {database}.var_sum_multiply(double, int...) returns double
 location '{location}'
 symbol='_Z14VarSumMultiplyPN10impala_udf15FunctionContextERKNS_9DoubleValEiPKNS_6IntValE';
 
+create function {database}.var_sum_multiply2(double, int...) returns double
+location '{location}'
+symbol='_Z15VarSumMultiply2PN10impala_udf15FunctionContextERKNS_9DoubleValEiPKNS_6IntValE';
+
+create function {database}.to_lower(string) returns string
+location '{location}'
+symbol='_Z7ToLowerPN10impala_udf15FunctionContextERKNS_9StringValE';
+
 create function {database}.constant_timestamp() returns timestamp
 location '{location}' symbol='ConstantTimestamp';
 


