Taras Bobrovytsky has posted comments on this change. ( http://gerrit.cloudera.org:8080/6023
)
Change subject: IMPALA4848: Add WIDTH_BUCKET() function
......................................................................
Patch Set 7:
(2 comments)
http://gerrit.cloudera.org:8080/#/c/6023/7/be/src/exprs/mathfunctionsir.cc
File be/src/exprs/mathfunctionsir.cc:
http://gerrit.cloudera.org:8080/#/c/6023/7/be/src/exprs/mathfunctionsir.cc@479
PS7, Line 479: result.val = num_buckets.val;
I think it's clearer and simpler to write:
result.val = num_buckets.val + 1;
http://gerrit.cloudera.org:8080/#/c/6023/7/be/src/exprs/mathfunctionsir.cc@516
PS7, Line 516: int256_t x = ConvertToInt256(buckets.value()) * ConvertToInt256(width_size.value());
converting to int256 every time is going to be slow. If we care about speed (and I think we
do), it would be a good idea to implement this without having to convert to int256.
One approach that I'm thinking is to construct an array of Decimals with precision and scale
the same as the input expression. Each entry indicates the boundaries of the buckets.
For example, we are calling WidthBucketImpl(... min_range=3, max_range=13, num_buckets=2)
we would construct an array that looks like [3, 8, 13]. This would be a one time expensive
operation that hopefully gets codegened away. For each value, we would have to do a binary
search to find where it fits in the array. (For example, 10 is between 8 and 13, so it goes
to bucket 2). If the array is small enough, we can simply iterate over it.
There may be other better approaches, but it looks to me that converting to int256 (and then
doing int 256 multiplications and divisions) every time even if we are dealing with tiny decimals
is not the right way to go.

To view, visit http://gerrit.cloudera.org:8080/6023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
GerritProject: ImpalaASF
GerritBranch: master
GerritMessageType: comment
GerritChangeId: I081bc916b1bef7b929ca161a9aade3b54c6b858f
GerritChangeNumber: 6023
GerritPatchSet: 7
GerritOwner: anujphadke <aphadke@cloudera.com>
GerritReviewer: Alex Behm <alex.behm@cloudera.com>
GerritReviewer: Michael Brown <mikeb@cloudera.com>
GerritReviewer: Taras Bobrovytsky <tbobrovytsky@cloudera.com>
GerritReviewer: anujphadke <aphadke@cloudera.com>
GerritCommentDate: Mon, 09 Oct 2017 21:38:23 +0000
GerritHasComments: Yes
