impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Tim Armstrong (Code Review)" <>
Subject [Impala-ASF-CR] IMPALA-4123: Fast bit unpacking
Date Tue, 04 Oct 2016 21:33:09 GMT
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4123: Fast bit unpacking

Patch Set 2:

Commit Message:

PS2, Line 12: was
> "want"

PS2, Line 18: possible
> long line
File be/src/benchmarks/

PS2, Line 259: include
> I think cmath, cstdlib, cstdio are the non-deprecated versions for C++

Line 260: #include <stdlib.h>
> Can you arrange these in three blocks:

Line 276: #define NUM_VALUES 1024 * 1024
> constexpr int

PS2, Line 281: BenchmarkParams
> You can leave the constructor out and use aggregate initialization:
Done. I didn't realise you could use this with new

Line 290:   BenchmarkParams* p = reinterpret_cast<BenchmarkParams*>(data);
> maybe const auto p = reinterpret_cast<const BenchmarkParams *>
I added the const. It doesn't seem verbose enough to require auto.

PS2, Line 294: int64_t
> const

PS2, Line 294: i * 32 % NUM_VALUES + j;
> Shouldn't this while thing be modulo NUM_VALUES, not just the non-+j part?

Line 296:         reader.Reset(p->data, p->data_len);
> Why Reset?
We need to do this to start the reader reading from the start of the buffer again.

PS2, Line 333:     uint8_t* data = new uint8_t[data_len];
> unique_ptr to clean up after
Changed to a vector

PS2, Line 334: int
> int64_t

PS2, Line 337: BenchmarkParams
> This can be a local var, right?
File be/src/util/bit-packing.h:

Line 18: #include <stdint.h>
> cstdint

PS2, Line 29: Unpack bit-packed values
> What does this phrase mean?
I added an explanation to the class comment, probably needed to make the rest of the code

PS2, Line 30: outputted
> unpacked

PS2, Line 30: is
> are

PS2, Line 30: 'out' must have
            :   /// enough space for 'num_values'.
> And in must point to at least in_bytes of addressable memory?

PS2, Line 31: 0
> What does a zero bit_width mean in this context?
Added an explanation to the class comment. It means every value is zero.

PS2, Line 33: plus
> "And also"

PS2, Line 32: utType.
            :   /// Retu
> extra line break here, please

PS2, Line 33: the byte 
> a pointer to the byte

PS2, Line 33: the number of
            :   /// values that were read.
> I can infer that by just subtracting, right? I think the pair might be over
It's the number of logical values that were read. The calculation is a function of the input
arguments: min(num_values, in_bytes / bit_width) but it seems like the kind of thing that
is easy to get wrong - that was the reason for making in_bytes a parameter rather than letting
the caller do the calculation.

Also division by bit_width is slower here than inside the function where we're dispatching
to bit_width-specialized code anyway.

PS2, Line 36: const
> This const has no effect, I believe.

Line 36:   static const std::pair<const uint8_t*, int64_t> UnpackValues(int bit_width,
> std::uint8_t, etc.
We don't do this anywhere else, seems unnecessarily verbose.

PS2, Line 57: Unpack32Values
> Why not up to 64?
Added some explanation to the class comment for the magic number. It works out because for
32-bit packed values, the end of the packed batch always falls on a byte boundary, so there's
no need to deal with partial bytes between batches.

If we went to larger batches I think the additional benefit from amortising loop overheads
is fairly minimal and the code will get a lot bigger.
File be/src/util/bit-packing.inline.h:

Line 35:     case 0: return UnpackValues<OutType, 0>(in, in_bytes, num_values, out);
> BOOST_PP_REPEAT_FROM_TO would be worth it here, I think
Done here and elsewhere. I'm a little concerned that it makes it hard to debug any compile
errors, but hopefully the code shouldn't change much.

PS2, Line 70: <const uint8_t*, int64_t>
> can be omitted
In this case it can't be inferred from the arguments.

PS2, Line 70: NULL
> nullptr
Done. Have we actually standardised on this?

PS2, Line 78: DCHECK_GE
> static_assert
Done here and elsewhere. This seems to have made the debug build a lot faster, which is great.

PS2, Line 79: 8
> CHAR_BIT, here and below

PS2, Line 80: in_bytes * 8 / BIT_WIDTH
> I think this line would be clearer with more parens

PS2, Line 99:  unsigned integer type
> static_check with

PS2, Line 108: inlined as
             : // soon as possible and subject to all optimizations
> Can you elaborate on this? Are there benchmarks on that?
Reworded this. I did some experiments early on looking at the assembly and wasn't able to
convince gcc to generate the optimal code without resorting to the macro approach.

To view, visit
To unsubscribe, visit

Gerrit-MessageType: comment
Gerrit-Change-Id: I12db69409483d208cd4c0f41c27a78aeb6cd3622
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <>
Gerrit-Reviewer: Jim Apple <>
Gerrit-Reviewer: Tim Armstrong <>
Gerrit-HasComments: Yes

View raw message