impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Tim Armstrong (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-4123: Fast bit unpacking
Date Wed, 05 Oct 2016 23:07:40 GMT
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4123: Fast bit unpacking
......................................................................


Patch Set 2:

(33 comments)

http://gerrit.cloudera.org:8080/#/c/4494/2/be/src/benchmarks/bit-packing-benchmark.cc
File be/src/benchmarks/bit-packing-benchmark.cc:

Line 296:         reader.Reset(p->data, p->data_len);
> Should offset advance in this loop, then?
The code was a bit obfuscated. The loop only really retries once, so there's no need to advance
offset. Changed it to be more obvious. Offset is advanced once per value read by virtue of
the offset calculation above.


http://gerrit.cloudera.org:8080/#/c/4494/3/be/src/benchmarks/bit-packing-benchmark.cc
File be/src/benchmarks/bit-packing-benchmark.cc:

Line 270: #include "common/names.h"
> Why the blank line here?
We separate out common/names.h like this most places because it needs to come after all the
other includes.


PS3, Line 277: 
> Maybe NUM_OUT_VALUES
Done


PS3, Line 305: * p
> const uint8_t * const data_end ...
Done


Line 334:     for (int i = 0; i < data_len; ++i) {
> std::iota
Done


PS3, Line 336: 
> I think you can even elide the parens
Done


http://gerrit.cloudera.org:8080/#/c/4494/3/be/src/util/bit-packing-test.cc
File be/src/util/bit-packing-test.cc:

PS3, Line 29: compute_mask
> ComputeMask, and in an anonymous namespace
Done


PS3, Line 33: UnpackSubset
> With a comment here, if I'm going to use it in PackUnpack.
Done


PS3, Line 37: with memory that is
> this phrase can be omitted
Done


PS3, Line 39: int
> unsigned? DCHECK <= something?
I don't think there's a particular limit on the number of values we could use here.


PS3, Line 39: num_values
> num_in_values?
Done


PS3, Line 39: int
> also unsigned?
I think int should be ok - I normally prefer using signed types by default even for variables
that logically should always be positive because it's easier to reason about the number going
negative compared with underflowing (also easier to catch the out-of-range value with dchecks).


PS3, Line 45: int
> const, here and elsewhere
Done


PS3, Line 51: & compute_mask(bit_width)
> AKA mask; already hoisted
Done


PS3, Line 58: becase
> because
Done


PS3, Line 58: the input buffer size
> "the input buffer size (num_values)"
Done


PS3, Line 60:     // Size buffer exactly so that ASAN can detect buffer overruns.
> I think manual canaray checks might be called for here
We rely on ASAN to detect read overruns anyway (which I was more concerned about since they're
tricky to avoid in the code), so it doesn't seem worth it to me on balance to add the extra
test code.


PS3, Line 65: ASSERT_EQ
> Can you add more " << error message " to your ASSERTs?
Done


http://gerrit.cloudera.org:8080/#/c/4494/2/be/src/util/bit-packing.h
File be/src/util/bit-packing.h:

Line 36:   static const std::pair<const uint8_t*, int64_t> UnpackValues(int bit_width,
> Yeah, the verbosity is frustrating. However, cstdint doesn't necessarily pt
It seems like most implementations of cstdint put the typedefs in the global namespace anyway,
so I think we're ok for now. If that turns out to be a problem I think we should create a
wrapper header with the using declarations.


PS2, Line 57: Unpack32Values
> So, this boundary condition happens for any multiple of 8, and you chose 32
A batch size of 32 is the smallest that works for all supported bit widths. Otherwise you
have to choose different batch sizes for different bit widths (e.g 32 for bit width 31, 8
for bit width 3, etc), which gets more complicated.

Lemire's bitpacking libraries use 32 everywhere (https://github.com/lemire/) so it seemed
like a reasonable choice - he obviously put some effort into validating the performance.

I could benchmark different numbers but I think at this point I'd be optimising for the microbenchmark
rather than a real workload.


http://gerrit.cloudera.org:8080/#/c/4494/3/be/src/util/bit-packing.h
File be/src/util/bit-packing.h:

PS3, Line 55: IT_WIDTH from 'in' to 'out'.
> But the last byte of 'in' that was read might only have been partially used
That's correct. I deliberately made the decision that these functions shouldn't deal with
resuming reads in the middle of bytes, because it makes things way more complicated (and probably
slower). I expect that callers will either read in batches or do their own buffering if they
need to read a value at a time.

Expanded on the comment to make this explicit and explain what the caller can do.


http://gerrit.cloudera.org:8080/#/c/4494/3/be/src/util/bit-packing.inline.h
File be/src/util/bit-packing.inline.h:

PS3, Line 56: >(in,
> can be omitted if the branches are swapped
Done


Line 56:     case 21: return UnpackValues<OutType, 21>(in, in_bytes, num_values, out);
> long line
Done


PS3, Line 90: 
            :   if (r
> I think you accidentally a word
I ended up retrying the template function thing and it works ok this time around. I think
something else was confounding it before (perhaps using #pragma unroll instead of manually
unrolling the loop).


PS3, Line 93: t
> If 'i' is a compile-time constant, maybe call it "I" or "NTH"?
It's now a template arg, so renamed too.


PS3, Line 93: _values, 
> I think "LOAD_TYPE" could be "LoadType" to more clearly match the lexical s
Done


PS3, Line 96: 
> I think constexprs should get the static const treatment of all caps
Done


PS3, Line 106: 
> Can you add a comment describing what this is for?
renamed to lower_bits, since it seemed like a better name after writing the comment.


PS3, Line 108: me
> How is this 32 derived?
Mainly because 32-bit values are the largest we support. This changed a bit with the switch
to a template. Now we just return LoadType, then the caller can truncate it.


PS3, Line 109: as possible an
> so, this case al the bits we need are in one word?
Yes, good point, added that to the comment.


PS3, Line 119: expr int first_bit_offset = first_bit % load_bit_wid
> Maybe give it an unsigned type?
I changed the types to unsigned throughout this function since that seems more appropriate.
However, here the calculation just underflows and produces a different spurious warning, so
I still need the ternary operator.


PS3, Line 124:  LOAD_TYPE shifted = BIT_WIDTH > 0 ?                                   
> this could use a bit more explanation
It's now the same problem as the above one, so the comment is the same now.


PS3, Line 143:      c
> Check not already defined?
I'm not quite sure that I understand the intent or how I'd do that, aside from something verbose
like:

#ifdef UNPACK_VALUE_CALL
#error "Shouldn't be defined"
#endif


-- 
To view, visit http://gerrit.cloudera.org:8080/4494
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I12db69409483d208cd4c0f41c27a78aeb6cd3622
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-Reviewer: Jim Apple <jbapple@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-HasComments: Yes

Mime
View raw message