impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Jim Apple (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-4123: Fast bit unpacking
Date Tue, 04 Oct 2016 23:44:05 GMT
Jim Apple has posted comments on this change.

Change subject: IMPALA-4123: Fast bit unpacking
......................................................................


Patch Set 3:

(34 comments)

http://gerrit.cloudera.org:8080/#/c/4494/2/be/src/benchmarks/bit-packing-benchmark.cc
File be/src/benchmarks/bit-packing-benchmark.cc:

Line 296:       }
> We need to do this to start the reader reading from the start of the buffer
Should offset advance in this loop, then?


http://gerrit.cloudera.org:8080/#/c/4494/3/be/src/benchmarks/bit-packing-benchmark.cc
File be/src/benchmarks/bit-packing-benchmark.cc:

Line 270: 
Why the blank line here?


PS3, Line 277: NUM_VALUES
Maybe NUM_OUT_VALUES


PS3, Line 305: * d
const uint8_t * const data_end ...


Line 334:       data[i] = static_cast<uint8_t>(i);
std::iota


PS3, Line 336: (
I think you can even elide the parens


http://gerrit.cloudera.org:8080/#/c/4494/3/be/src/util/bit-packing-test.cc
File be/src/util/bit-packing-test.cc:

PS3, Line 29: compute_mask
ComputeMask, and in an anonymous namespace


PS3, Line 33: UnpackSubset
With a comment here, if I'm going to use it in PackUnpack.


PS3, Line 37: with memory that is
this phrase can be omitted


PS3, Line 39: int
also unsigned?


PS3, Line 39: int
unsigned? DCHECK <= something?


PS3, Line 39: num_values
num_in_values?


PS3, Line 45: int
const, here and elsewhere


PS3, Line 51: & compute_mask(bit_width)
AKA mask; already hoisted


PS3, Line 58: becase
because


PS3, Line 58: the input buffer size
"the input buffer size (num_values)"


PS3, Line 60:     // Size buffer exactly so that ASAN can detect buffer overruns.
I think manual canaray checks might be called for here


PS3, Line 65: ASSERT_EQ
Can you add more " << error message " to your ASSERTs?


http://gerrit.cloudera.org:8080/#/c/4494/2/be/src/util/bit-packing.h
File be/src/util/bit-packing.h:

Line 36: /// little-endian order). E.g. the values 1, 2, 3, 4 packed with bit width 4 results
> We don't do this anywhere else, seems unnecessarily verbose.
Yeah, the verbosity is frustrating. However, cstdint doesn't necessarily pt these in the global
namespace, and stdint.h is deprecated. What color do you think we should paint this bikeshed?
Maybe a using declaration at the top?


PS2, Line 57: Type>
> Added some explanation to the class comment for the magic number. It works 
So, this boundary condition happens for any multiple of 8, and you chose 32 to amortize the
loop some more?

Did you do any testing of 16 and 64 to see how they compare?


http://gerrit.cloudera.org:8080/#/c/4494/3/be/src/util/bit-packing.h
File be/src/util/bit-packing.h:

PS3, Line 55: after the last byte of 'in' that was read
But the last byte of 'in' that was read might only have been partially used, right?


http://gerrit.cloudera.org:8080/#/c/4494/2/be/src/util/bit-packing.inline.h
File be/src/util/bit-packing.inline.h:

PS2, Line 108:              \
             :     if (end_bit_offset < load_bit_width) {          
> Reworded this. I did some experiments early on looking at the assembly and 
Did you try always_inline and template parameters? If yes, what else did you try?


http://gerrit.cloudera.org:8080/#/c/4494/3/be/src/util/bit-packing.inline.h
File be/src/util/bit-packing.inline.h:

PS3, Line 56:  == 0
can be omitted if the branches are swapped


Line 56:   const int64_t max_input_values = BIT_WIDTH == 0 ? num_values : (in_bytes * CHAR_BIT)
/ BIT_WIDTH;
long line


PS3, Line 90: early
            : // with
I think you accidentally a word


PS3, Line 93: LOAD_TYPE
I think "LOAD_TYPE" could be "LoadType" to more clearly match the lexical standard for other
type names


PS3, Line 93: i
If 'i' is a compile-time constant, maybe call it "I" or "NTH"?


PS3, Line 96: load_bit_width
I think constexprs should get the static const treatment of all caps


PS3, Line 106: shifted
Can you add a comment describing what this is for?


PS3, Line 108: 32
How is this 32 derived?


PS3, Line 109: end_bit_offset
so, this case al the bits we need are in one word?


PS3, Line 119: Make non-negative to avoid spurious compiler warning
Maybe give it an unsigned type?


PS3, Line 124: Special-case impossible BIT_WIDTH 32 to avoid spurious compiler warning
this could use a bit more explanation


PS3, Line 143: define
Check not already defined?


-- 
To view, visit http://gerrit.cloudera.org:8080/4494
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I12db69409483d208cd4c0f41c27a78aeb6cd3622
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-Reviewer: Jim Apple <jbapple@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-HasComments: Yes

Mime
View raw message