impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Taras Bobrovytsky (Code Review)" <>
Subject [Impala-ASF-CR] IMPALA-4848: Add WIDTH BUCKET() function
Date Mon, 09 Oct 2017 21:38:23 GMT
Taras Bobrovytsky has posted comments on this change. (

Change subject: IMPALA-4848: Add WIDTH_BUCKET() function

Patch Set 7:

File be/src/exprs/
PS7, Line 479:     result.val = num_buckets.val;
I think it's clearer and simpler to write:
result.val = num_buckets.val + 1;
PS7, Line 516:   int256_t x = ConvertToInt256(buckets.value()) * ConvertToInt256(width_size.value());
converting to int256 every time is going to be slow. If we care about speed (and I think we
do), it would be a good idea to implement this without having to convert to int256.

One approach that I'm thinking is to construct an array of Decimals with precision and scale
the same as the input expression. Each entry indicates the boundaries of the buckets.

For example, we are calling WidthBucketImpl(... min_range=3, max_range=13, num_buckets=2)
we would construct an array that looks like [3, 8, 13]. This would be a one time expensive
operation that hopefully gets codegened away. For each value, we would have to do a binary
search to find where it fits in the array. (For example, 10 is between 8 and 13, so it goes
to bucket 2). If the array is small enough, we can simply iterate over it.

There may be other better approaches, but it looks to me that converting to int256 (and then
doing int 256 multiplications and divisions) every time even if we are dealing with tiny decimals
is not the right way to go.

To view, visit
To unsubscribe, visit

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I081bc916b1bef7b929ca161a9aade3b54c6b858f
Gerrit-Change-Number: 6023
Gerrit-PatchSet: 7
Gerrit-Owner: anujphadke <>
Gerrit-Reviewer: Alex Behm <>
Gerrit-Reviewer: Michael Brown <>
Gerrit-Reviewer: Taras Bobrovytsky <>
Gerrit-Reviewer: anujphadke <>
Gerrit-Comment-Date: Mon, 09 Oct 2017 21:38:23 +0000
Gerrit-HasComments: Yes

  • Unnamed multipart/alternative (inline, 8-Bit, 0 bytes)
View raw message