impala-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Todd Lipcon <t...@apache.org>
Subject Fwd: [kudu-CR] codegen: fix regression in inlining after LLVM 4.0 upgrade
Date Wed, 07 Jun 2017 00:15:35 GMT
FYI Impala folks: you might hit this issue when you upgrade to llvm 4.0 as
well. Something to be aware of
---------- Forwarded message ----------
From: "Todd Lipcon (Code Review)" <gerrit@cloudera.org>
Date: Jun 6, 2017 5:06 PM
Subject: [kudu-CR] codegen: fix regression in inlining after LLVM 4.0
upgrade
To: "Dan Burkert" <danburkert@apache.org>, "David Ribeiro Alves" <
davidralves@gmail.com>, "Binglin Chang" <decstery@gmail.com>, <
reviews@kudu.incubator.apache.org>
Cc: "Todd Lipcon" <todd@apache.org>

Hello Dan Burkert,

I'd like you to do a code review.  Please visit

    http://gerrit.cloudera.org:8080/7099

to review the following change.

Change subject: codegen: fix regression in inlining after LLVM 4.0 upgrade
......................................................................

codegen: fix regression in inlining after LLVM 4.0 upgrade

With the upgrade to LLVM 4.0, the performance of our generated code
regressed significantly. After looking at the generated assembly from
codegen-test, I could see that there were many 'call' instructions that
didn't appear in the earlier output, which led me to suspect the
inliner.

After adding a 'module->dump()' call and looking at the output, I could
see that the calls were to utility functions like BitmapTest() which
should be inlined due to be very small. However, the function was marked
'noinline' in precompiled.ll. After a bit of Googling I came across a
thread[1] in which someone else noticed that all of their functions had
unexpected 'noinline' attributes after upgrading to 4.0.

After following the breadcrumbs, I found some advice to change the way
in which we emit precompiled.ll to disable the built-in LLVM passes
which were responsible for adding this attribute.

During my investigation, I also found that we should have been passing
the optimization and size levels into the inlining pass, so I threw
that change in for good measure.

This fixed the perf regression. I benchmarked using:
  memrowset-test --roundtrip_num_rows=10000000 \
                 --gtest_filter=\*InsertCount\* \
                 --gtest_repeat=10

Before this fix:

  I0606 16:55:51.448117 48991 memrowset-test.cc:446] Time spent Scanning
rows where all are committed: real 0.502s        user 0.502s     sys 0.000s
  I0606 16:55:57.937391 48991 memrowset-test.cc:446] Time spent Scanning
rows where all are committed: real 0.500s        user 0.499s     sys 0.000s
  I0606 16:56:04.330037 48991 memrowset-test.cc:446] Time spent Scanning
rows where all are committed: real 0.498s        user 0.499s     sys 0.000s
  I0606 16:56:10.706786 48991 memrowset-test.cc:446] Time spent Scanning
rows where all are committed: real 0.504s        user 0.503s     sys 0.001s
  I0606 16:56:17.094764 48991 memrowset-test.cc:446] Time spent Scanning
rows where all are committed: real 0.510s        user 0.510s     sys 0.000s
  I0606 16:56:23.474200 48991 memrowset-test.cc:446] Time spent Scanning
rows where all are committed: real 0.502s        user 0.502s     sys 0.000s
  I0606 16:56:29.845799 48991 memrowset-test.cc:446] Time spent Scanning
rows where all are committed: real 0.503s        user 0.503s     sys 0.000s
  I0606 16:56:36.221843 48991 memrowset-test.cc:446] Time spent Scanning
rows where all are committed: real 0.502s        user 0.503s     sys 0.000s
  I0606 16:56:42.599604 48991 memrowset-test.cc:446] Time spent Scanning
rows where all are committed: real 0.502s        user 0.503s     sys 0.000s
  I0606 16:56:48.962482 48991 memrowset-test.cc:446] Time spent Scanning
rows where all are committed: real 0.503s        user 0.502s     sys 0.000s

After this fix:

  I0606 16:53:51.573541 48403 memrowset-test.cc:446] Time spent Scanning
rows where all are committed: real 0.302s   user 0.302s     sys 0.000s
  I0606 16:53:57.922485 48403 memrowset-test.cc:446] Time spent Scanning
rows where all are committed: real 0.301s   user 0.301s     sys 0.000s
  I0606 16:54:04.186511 48403 memrowset-test.cc:446] Time spent Scanning
rows where all are committed: real 0.310s   user 0.310s     sys 0.000s
  I0606 16:54:10.425207 48403 memrowset-test.cc:446] Time spent Scanning
rows where all are committed: real 0.313s   user 0.313s     sys 0.000s
  I0606 16:54:16.648202 48403 memrowset-test.cc:446] Time spent Scanning
rows where all are committed: real 0.316s   user 0.316s     sys 0.000s
  I0606 16:54:22.868655 48403 memrowset-test.cc:446] Time spent Scanning
rows where all are committed: real 0.311s   user 0.312s     sys 0.000s
  I0606 16:54:29.081784 48403 memrowset-test.cc:446] Time spent Scanning
rows where all are committed: real 0.313s   user 0.314s     sys 0.000s
  I0606 16:54:35.317623 48403 memrowset-test.cc:446] Time spent Scanning
rows where all are committed: real 0.314s   user 0.312s     sys 0.003s
  I0606 16:54:41.550542 48403 memrowset-test.cc:446] Time spent Scanning
rows where all are committed: real 0.316s   user 0.315s     sys 0.000s
  I0606 16:54:47.761406 48403 memrowset-test.cc:446] Time spent Scanning
rows where all are committed: real 0.314s   user 0.314s     sys 0.000s

I also manually inspected the assembly from codegen-test and compared vs
a version compiled with LLVM 3.9 and found that it now has fewer
instructions than before.

[1] http://lists.llvm.org/pipermail/llvm-dev/2017-May/113175.html

Change-Id: I7e449df80e5cd02b9ce2dbf4daa5cf8151007dcf
---
M src/kudu/codegen/CMakeLists.txt
M src/kudu/codegen/module_builder.cc
2 files changed, 11 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/99/7099/1
--
To view, visit http://gerrit.cloudera.org:8080/7099
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I7e449df80e5cd02b9ce2dbf4daa5cf8151007dcf
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <todd@apache.org>
Gerrit-Reviewer: Dan Burkert <danburkert@apache.org>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message