impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Alex Behm (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-1430,IMPALA-4108: codegen all builtin aggregate functions
Date Thu, 13 Oct 2016 00:47:36 GMT
Alex Behm has posted comments on this change.

Change subject: IMPALA-1430,IMPALA-4108: codegen all builtin aggregate functions
......................................................................


Patch Set 2:

(13 comments)

Need to spend more time with the meat of this patch.

http://gerrit.cloudera.org:8080/#/c/4655/2/be/src/exec/partitioned-aggregation-node.cc
File be/src/exec/partitioned-aggregation-node.cc:

Line 1064:     // To minimize branching on the UpdateTuple path, initialize the result value
The details of how this saves branches was somewhat nebulous to me. Seems worth mentioning
that we do this trick here because later in codegen we want to ignore the dst slot's null
indicator to get rid of a branch, although the dst slot is nullable.

When reading this I was confused about where these UpdateTuple() and Update() functions come
from. Do you mean the UDA Update() function?


Line 1597:     Value* expr_ctx_ptr = builder.CreateGEP(
inbounds?


Line 1658:     // properties. First, for these builtins, null input values can be skipped.
Second,
NULL


http://gerrit.cloudera.org:8080/#/c/4655/2/be/src/exec/partitioned-aggregation-node.h
File be/src/exec/partitioned-aggregation-node.h:

Line 654:   void CodegenBranchIfNull(LlvmCodeGen* codegen, LlvmBuilder* builder,
Does it make sense to put this into LlvmCodegen? Seems generally useful.


http://gerrit.cloudera.org:8080/#/c/4655/2/be/src/exprs/aggregate-functions-ir.cc
File be/src/exprs/aggregate-functions-ir.cc:

Line 630: static inline StringVal ALWAYS_INLINE DefaultStringConcatDelim() {
typo: not


http://gerrit.cloudera.org:8080/#/c/4655/2/be/src/runtime/descriptors.cc
File be/src/runtime/descriptors.cc:

Line 580:       builder->CreateGEP(tuple_bytes, null_byte_offset, "null_byte_ptr");
inbounds? (and elsewhere)


Line 619:       result = builder->CreateOr(null_byte, mask, "null_bit_cleared");
null_bit_set


Line 624:   } else {
Why would we not require is_null to be a ConstantInt? Do we exercise the code in the else
block anywhere?


http://gerrit.cloudera.org:8080/#/c/4655/2/be/src/runtime/descriptors.h
File be/src/runtime/descriptors.h:

Line 139:   /// represented as a 1-bit integer indicating whether this slot is null in 'tuple'.
1-byte integer?


Line 144:   /// NULL bit in the given 'tuple' to the 'value' in null. 'tuple' is a pointer
typo: value 'is_null'


Line 145:   /// to the tuple, and 'null' is an boolean value represented as a 1-bit integer.
some typos in this sentence, should be 'is_null'

1-byte integer?


http://gerrit.cloudera.org:8080/#/c/4655/2/tests/query_test/test_aggregation.py
File tests/query_test/test_aggregation.py:

Line 31: agg_functions = ['sum', 'count', 'min', 'max', 'avg']
add ndv if easy enough? alternatively file a JIRA for adding more coverage (and the extended
coverage in my comments below)


Line 33: data_types = ['int', 'bool', 'double', 'bigint', 'tinyint',
maybe you can squeeze string into this patch if it's easy enough

we should leave decimal for later


-- 
To view, visit http://gerrit.cloudera.org:8080/4655
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id9dc21d1d676505d3617e1e4f37557397c4fb260
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.behm@cloudera.com>
Gerrit-HasComments: Yes

Mime
View raw message