impala-dev mailing list archives

From "Michael Ho (Code Review)" <ger...@cloudera.org>
Subject [Impala-CR](cdh5-trunk) IMPALA-3674: Lazy materialization of LLVM module bitcode.
Date Fri, 08 Jul 2016 20:40:30 GMT
Hello Tim Armstrong,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/3220

to look at the new patch set (#8).

Change subject: IMPALA-3674: Lazy materialization of LLVM module bitcode.
......................................................................

IMPALA-3674: Lazy materialization of LLVM module bitcode.

Previously, each fragment using dynamic code generation would
parse the bitcode module and populate the LLVM data structures
for all the functions and their bodies in the bitcode module.
This was wasteful, as a query may use only a few of the functions
parsed. We relied on dead code elimination to delete most of the
unused functions so that we would not waste time compiling them.

This change implements lazy materialization of the functions'
bodies. On the initial parse of the bitcode module, we just
create the Function objects for each function in the module.
The functions' bodies will be materialized on demand from the
bitcode module when they are actually referenced in the query.
This ensures that the prepare time during codegen is proportional
to the number of IR functions referenced by the query instead
of being proportional to the total number of IR functions in
the module.
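
The idea above can be illustrated with a small standalone sketch. It is
not Impala's LlvmCodeGen and does not use the LLVM API: LazyModule,
LazyFunction, Declare() and Materialize() are hypothetical names made up
for illustration. The sketch also assumes that materializing a function
pulls in the functions it calls, which this message does not spell out.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a function in an IR module (hypothetical, not LLVM).
struct LazyFunction {
  std::string name;
  std::vector<std::string> callees;  // functions referenced by the body
  bool materialized = false;         // true once the body has been loaded
};

// Toy stand-in for a lazily loaded bitcode module (hypothetical).
class LazyModule {
 public:
  // Initial parse: record the function and what it references, but do
  // no work for the body itself.
  void Declare(const std::string& name, std::vector<std::string> callees) {
    LazyFunction& fn = functions_[name];
    fn.name = name;
    fn.callees = std::move(callees);
    fn.materialized = false;
  }

  // Materializes 'name' and, transitively, the functions it references.
  // Returns the number of bodies materialized by this call, so the cost
  // tracks the functions actually used rather than the module size.
  int Materialize(const std::string& name) {
    auto it = functions_.find(name);
    assert(it != functions_.end() && "function was never declared");
    LazyFunction& fn = it->second;
    if (fn.materialized) return 0;
    fn.materialized = true;  // mark first so that call cycles terminate
    int count = 1;
    for (const std::string& callee : fn.callees) count += Materialize(callee);
    return count;
  }

  int NumDeclared() const { return static_cast<int>(functions_.size()); }

  int NumMaterialized() const {
    int n = 0;
    for (const auto& entry : functions_) {
      if (entry.second.materialized) ++n;
    }
    return n;
  }

 private:
  std::map<std::string, LazyFunction> functions_;
};
```

In this sketch, declaring three functions and materializing one that
calls a second leaves the third untouched, which mirrors why prepare time
no longer depends on the total number of IR functions in the module.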

This change also stops cross-compiling BufferedTupleStream::GetTupleRow()
as there isn't much benefit in doing so. In addition, it moves the ctors
and dtors of LikePredicate to the header file to avoid an unnecessary
alias in the IR module.

For TPCH-Q2, a fragment which codegens only 9 functions used to spend
146ms in codegen. It now spends 35ms, a 76% reduction. The CodeGen
profile counters before and after this change:

      CodeGen:(Total: 146.041ms, non-child: 146.041ms, % non-child: 100.00%)
         - CodegenTime: 0.000ns
         - CompileTime: 2.003ms
         - LoadTime: 0.000ns
         - ModuleBitcodeSize: 2.12 MB (2225304)
         - NumFunctions: 9 (9)
         - NumInstructions: 129 (129)
         - OptimizationTime: 29.019ms
         - PrepareTime: 114.651ms

      CodeGen:(Total: 35.288ms, non-child: 35.288ms, % non-child: 100.00%)
         - CodegenTime: 0.000ns
         - CompileTime: 1.880ms
         - LoadTime: 0.000ns
         - ModuleBitcodeSize: 2.12 MB (2221276)
         - NumFunctions: 9 (9)
         - NumInstructions: 129 (129)
         - OptimizationTime: 5.101ms
         - PrepareTime: 28.044ms
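
The reduction quoted for TPCH-Q2 can be re-derived from the Total
counters in the two profiles. The helper below is a minimal sketch;
PercentReduction() is a hypothetical name and not part of Impala.

```cpp
#include <cassert>

// Returns the percentage by which 'after_ms' is smaller than 'before_ms'.
// Hypothetical helper for illustration only.
double PercentReduction(double before_ms, double after_ms) {
  assert(before_ms > 0.0);
  return 100.0 * (before_ms - after_ms) / before_ms;
}
```

Calling PercentReduction(146.041, 35.288) gives roughly 76%. The same
helper can be applied to the PrepareTime and OptimizationTime counters
to see where the saving comes from.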

A single-node run of TPCH(15) also shows improvements for some short-running queries:

+----------+-----------------------+---------+------------+------------+----------------+
| Workload | File Format           | Avg (s) | Delta(Avg) | GeoMean(s) | Delta(GeoMean) |
+----------+-----------------------+---------+------------+------------+----------------+
| TPCH(15) | parquet / none / none | 6.78    | -0.81%     | 4.52       | -3.66%         |
+----------+-----------------------+---------+------------+------------+----------------+

+----------+----------+-----------------------+--------+-------------+------------+------------+----------------+-------------+-------+
| Workload | Query    | File Format           | Avg(s) | Base Avg(s) | Delta(Avg) | StdDev(%)  | Base StdDev(%) | Num Clients | Iters |
+----------+----------+-----------------------+--------+-------------+------------+------------+----------------+-------------+-------+
| TPCH(15) | TPCH-Q18 | parquet / none / none | 10.49  | 9.89        |   +6.13%   |   1.09%    |   1.71%        | 1           | 10    |
| TPCH(15) | TPCH-Q7  | parquet / none / none | 11.57  | 11.23       |   +3.02%   |   2.32%    |   1.23%        | 1           | 10    |
| TPCH(15) | TPCH-Q12 | parquet / none / none | 3.25   | 3.20        |   +1.62%   |   2.84%    |   2.36%        | 1           | 10    |
| TPCH(15) | TPCH-Q9  | parquet / none / none | 9.81   | 9.70        |   +1.09%   |   1.17%    |   1.01%        | 1           | 10    |
| TPCH(15) | TPCH-Q17 | parquet / none / none | 11.17  | 11.15       |   +0.20%   |   9.31%    |   9.45%        | 1           | 10    |
| TPCH(15) | TPCH-Q21 | parquet / none / none | 17.02  | 17.04       |   -0.08%   |   1.77%    |   2.13%        | 1           | 10    |
| TPCH(15) | TPCH-Q15 | parquet / none / none | 3.73   | 3.73        |   -0.16%   |   3.43%    |   2.54%        | 1           | 10    |
| TPCH(15) | TPCH-Q19 | parquet / none / none | 34.33  | 34.39       |   -0.18%   |   2.23%    |   2.60%        | 1           | 10    |
| TPCH(15) | TPCH-Q13 | parquet / none / none | 7.26   | 7.31        |   -0.67%   |   1.53%    |   2.65%        | 1           | 10    |
| TPCH(15) | TPCH-Q1  | parquet / none / none | 8.62   | 8.71        |   -1.05%   |   1.12%    |   0.47%        | 1           | 10    |
| TPCH(15) | TPCH-Q4  | parquet / none / none | 2.52   | 2.55        |   -1.30%   |   2.96%    |   1.63%        | 1           | 10    |
| TPCH(15) | TPCH-Q14 | parquet / none / none | 2.75   | 2.80        |   -1.87%   |   3.23%    |   3.04%        | 1           | 10    |
| TPCH(15) | TPCH-Q8  | parquet / none / none | 4.68   | 4.84        |   -3.38%   |   2.38%    |   2.52%        | 1           | 10    |
| TPCH(15) | TPCH-Q5  | parquet / none / none | 3.40   | 3.52        |   -3.55%   |   1.36%    |   1.13%        | 1           | 10    |
| TPCH(15) | TPCH-Q3  | parquet / none / none | 3.29   | 3.45        |   -4.52%   |   1.37%    |   1.84%        | 1           | 10    |
| TPCH(15) | TPCH-Q16 | parquet / none / none | 1.61   | 1.72        |   -5.88%   |   2.18%    |   3.73%        | 1           | 10    |
| TPCH(15) | TPCH-Q10 | parquet / none / none | 4.47   | 4.76        |   -6.14%   |   3.96%    |   1.98%        | 1           | 10    |
| TPCH(15) | TPCH-Q22 | parquet / none / none | 1.95   | 2.10        |   -6.93%   |   2.41%    |   1.57%        | 1           | 10    |
| TPCH(15) | TPCH-Q20 | parquet / none / none | 2.82   | 3.08        |   -8.39%   |   2.71%    |   2.21%        | 1           | 10    |
| TPCH(15) | TPCH-Q11 | parquet / none / none | 1.15   | 1.26        |   -8.81%   |   2.58%    |   5.01%        | 1           | 10    |
| TPCH(15) | TPCH-Q6  | parquet / none / none | 1.74   | 1.93        |   -9.67%   |   1.66%    |   2.53%        | 1           | 10    |
| TPCH(15) | TPCH-Q2  | parquet / none / none | 1.53   | 2.03        | I -24.56%  | * 14.70% * |   2.57%        | 1           | 10    |
+----------+----------+-----------------------+--------+-------------+------------+------------+----------------+-------------+-------+

Change-Id: I6ed7862fc5e86005ecea83fa2ceb489e737d66b2
---
M be/src/codegen/llvm-codegen-test.cc
M be/src/codegen/llvm-codegen.cc
M be/src/codegen/llvm-codegen.h
M be/src/exec/partitioned-aggregation-node.cc
M be/src/exprs/expr-codegen-test.cc
M be/src/exprs/like-predicate.cc
M be/src/exprs/like-predicate.h
M be/src/exprs/scalar-fn-call.cc
M be/src/runtime/buffered-tuple-stream.cc
M be/src/runtime/buffered-tuple-stream.inline.h
M be/src/testutil/test-udfs.cc
M be/src/util/symbols-util-test.cc
M testdata/workloads/functional-query/queries/QueryTest/udf.test
M tests/query_test/test_udfs.py
14 files changed, 275 insertions(+), 109 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala refs/changes/20/3220/8
-- 
To view, visit http://gerrit.cloudera.org:8080/3220
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6ed7862fc5e86005ecea83fa2ceb489e737d66b2
Gerrit-PatchSet: 8
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Michael Ho <kwho@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dhecht@cloudera.com>
Gerrit-Reviewer: Michael Ho <kwho@cloudera.com>
Gerrit-Reviewer: Mostafa Mokhtar <mmokhtar@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstrong@cloudera.com>
