impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Jim Apple (Code Review)" <>
Subject [Impala-ASF-CR] IMPALA-4123: Fast bit unpacking
Date Tue, 04 Oct 2016 23:44:05 GMT
Jim Apple has posted comments on this change.

Change subject: IMPALA-4123: Fast bit unpacking

Patch Set 3:

File be/src/benchmarks/

Line 296:       }
> We need to do this to start the reader reading from the start of the buffer
Should offset advance in this loop, then?
File be/src/benchmarks/

Line 270: 
Why the blank line here?

PS3, Line 277: NUM_VALUES

PS3, Line 305: * d
const uint8_t * const data_end ...

Line 334:       data[i] = static_cast<uint8_t>(i);

PS3, Line 336: (
I think you can even elide the parens
File be/src/util/

PS3, Line 29: compute_mask
ComputeMask, and in an anonymous namespace

PS3, Line 33: UnpackSubset
With a comment here, if I'm going to use it in PackUnpack.

PS3, Line 37: with memory that is
this phrase can be omitted

PS3, Line 39: int
also unsigned?

PS3, Line 39: int
unsigned? DCHECK <= something?

PS3, Line 39: num_values

PS3, Line 45: int
const, here and elsewhere

PS3, Line 51: & compute_mask(bit_width)
AKA mask; already hoisted

PS3, Line 58: becase

PS3, Line 58: the input buffer size
"the input buffer size (num_values)"

PS3, Line 60:     // Size buffer exactly so that ASAN can detect buffer overruns.
I think manual canaray checks might be called for here

PS3, Line 65: ASSERT_EQ
Can you add more " << error message " to your ASSERTs?
File be/src/util/bit-packing.h:

Line 36: /// little-endian order). E.g. the values 1, 2, 3, 4 packed with bit width 4 results
> We don't do this anywhere else, seems unnecessarily verbose.
Yeah, the verbosity is frustrating. However, cstdint doesn't necessarily pt these in the global
namespace, and stdint.h is deprecated. What color do you think we should paint this bikeshed?
Maybe a using declaration at the top?

PS2, Line 57: Type>
> Added some explanation to the class comment for the magic number. It works 
So, this boundary condition happens for any multiple of 8, and you chose 32 to amortize the
loop some more?

Did you do any testing of 16 and 64 to see how they compare?
File be/src/util/bit-packing.h:

PS3, Line 55: after the last byte of 'in' that was read
But the last byte of 'in' that was read might only have been partially used, right?
File be/src/util/bit-packing.inline.h:

PS2, Line 108:              \
             :     if (end_bit_offset < load_bit_width) {          
> Reworded this. I did some experiments early on looking at the assembly and 
Did you try always_inline and template parameters? If yes, what else did you try?
File be/src/util/bit-packing.inline.h:

PS3, Line 56:  == 0
can be omitted if the branches are swapped

Line 56:   const int64_t max_input_values = BIT_WIDTH == 0 ? num_values : (in_bytes * CHAR_BIT)
long line

PS3, Line 90: early
            : // with
I think you accidentally a word

PS3, Line 93: LOAD_TYPE
I think "LOAD_TYPE" could be "LoadType" to more clearly match the lexical standard for other
type names

PS3, Line 93: i
If 'i' is a compile-time constant, maybe call it "I" or "NTH"?

PS3, Line 96: load_bit_width
I think constexprs should get the static const treatment of all caps

PS3, Line 106: shifted
Can you add a comment describing what this is for?

PS3, Line 108: 32
How is this 32 derived?

PS3, Line 109: end_bit_offset
so, this case al the bits we need are in one word?

PS3, Line 119: Make non-negative to avoid spurious compiler warning
Maybe give it an unsigned type?

PS3, Line 124: Special-case impossible BIT_WIDTH 32 to avoid spurious compiler warning
this could use a bit more explanation

PS3, Line 143: define
Check not already defined?

To view, visit
To unsubscribe, visit

Gerrit-MessageType: comment
Gerrit-Change-Id: I12db69409483d208cd4c0f41c27a78aeb6cd3622
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <>
Gerrit-Reviewer: Jim Apple <>
Gerrit-Reviewer: Tim Armstrong <>
Gerrit-HasComments: Yes

View raw message