impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Tim Armstrong (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-5573: Add decimal codegen in text scanner
Date Wed, 16 Aug 2017 18:52:00 GMT
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-5573: Add decimal codegen in text scanner
......................................................................


Patch Set 3:

(8 comments)

Code change looks good, just a few cleanup requests.

http://gerrit.cloudera.org:8080/#/c/7683/2//COMMIT_MSG
Commit Message:

Line 14: StringToDecimal out of LLVM IR.
Can you add a line break to separate the paragraphs to make it more readable.


PS2, Line 15: libUtil.so
nit: libUtil, since we use static linking too.


Line 19: 
Did you do an experiment to see how much scanning sped up for a larger table? tpch.lineitem
has a few decimal columns. I often copy the data a few times for scanner tests like this to
get good measurements.

  create table biglineitem as select * from lineitem;
  insert into biglineitem select * from lineitem;
  insert into biglineitem select * from lineitem;

I think you'll see the biggest speedup if you have a condition in the where clause, because
codegen of that condition also gets disabled. E.g.

  select l_quantity, l_extendedprice, l_discount, l_tax from biglineitem where l_quantity
> 100.0;


http://gerrit.cloudera.org:8080/#/c/7683/2/be/src/codegen/llvm-codegen.cc
File be/src/codegen/llvm-codegen.cc:

Line 134: 
> I'm sorry it works without this. I have no idea how I came to that conclusi
Good to know :) I was worried something subtle was going on. There are some places in the
code where we resort to hacks to force modules to be linked into the output binary: https://github.com/apache/incubator-impala/blob/294d42adc117046f975665834af03ddaa53ec27e/be/src/exprs/scalar-expr-evaluator.cc#L428

Once the functions are linked in LLVM by default falls back to resolving functions in the
binary's dynamic symbol table.


Line 134: 
> Without this it crashes with "Program used external function ... which coul
Maybe when you rebuilt Impala you hadn't updated the CMakeLists.txt, or the change hadn't
taken affect. That would explain this.


http://gerrit.cloudera.org:8080/#/c/7683/2/be/src/exec/hdfs-scanner-ir.cc
File be/src/exec/hdfs-scanner-ir.cc:

Line 131:   if (*result != ParseResult::PARSE_SUCCESS) *result = ParseResult::PARSE_FAILURE;
> The IR generated by Clang still reads decimalblah and we will need a bitcas
Ah ok, makes sense.

I think it would also be ok to return Decimal*Value and bitcast unconditionally, but this
works for me.


http://gerrit.cloudera.org:8080/#/c/7683/2/be/src/exec/text-converter.cc
File be/src/exec/text-converter.cc:

Line 234:           case 16:
> Done. This is copied from TextConverter::WriteSlot and there is an unnecess
Oh interesting :) Not sure what that was there.


http://gerrit.cloudera.org:8080/#/c/7683/3/be/src/exec/text-converter.cc
File be/src/exec/text-converter.cc:

PS3, Line 292: proper_slot
maybe "cast_slot". It's unclear what "proper" means.


-- 
To view, visit http://gerrit.cloudera.org:8080/7683
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia65820e969d59094dc92d912a5279fa90f6b179d
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tianyi Wang <twang@cloudera.com>
Gerrit-Reviewer: Tianyi Wang <twang@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-HasComments: Yes

Mime
View raw message