kudu-user mailing list archives

From Todd Lipcon <t...@cloudera.com>
Subject Re: RLE encoding limitation
Date Tue, 09 Aug 2016 21:34:02 GMT
Hi James,

Yes, that's correct. Unfortunately, right now the RLE encoding code doesn't
support int64.

I'd suggest trying the "BIT_SHUFFLE" encoding, which can get you a lot of
the same benefits as RLE and does work on int64.
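
As a rough illustration of why bit-shuffling can stand in for RLE on int64
columns: the encoding stores the most significant bit of every value first,
then the next bit of every value, and so on, so a block of similar values
turns into long runs of identical bytes that a general-purpose compressor
(LZ4 in Kudu's case) handles well. The sketch below is not Kudu's or the
bitshuffle library's code; the helper name BitTranspose64 and the exact
output layout are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of Kudu): emit bit 63 of every value,
// then bit 62 of every value, and so on down to bit 0.
// Output bits are packed MSB-first into bytes.
std::vector<uint8_t> BitTranspose64(const std::vector<uint64_t>& values) {
  const size_t n = values.size();
  std::vector<uint8_t> out((n * 64 + 7) / 8, 0);
  size_t out_bit = 0;
  for (int bit = 63; bit >= 0; --bit) {
    for (size_t i = 0; i < n; ++i, ++out_bit) {
      if ((values[i] >> bit) & 1) {
        out[out_bit / 8] |= static_cast<uint8_t>(1u << (7 - out_bit % 8));
      }
    }
  }
  return out;
}
```

With eight copies of the value 5, every bit plane is a single byte that is
either all zeros or all ones, which is the kind of input a byte-level
compressor shrinks easily.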

Since it seems you're good at checking out the code, I'd also be happy to
review a patch to fix this if you want to give it a try :)
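
For anyone attempting such a patch, here is a minimal sketch of the kind of
edge cases the TODO in the quoted PutValue refers to. It is not Kudu's
BitWriter: the class name SketchBitWriter, the std::vector buffer, and the
LSB-first, little-endian byte layout are all assumptions for illustration.
The main point is that in C++ shifting a 64-bit integer by 64 is undefined
behavior, so both the truncation mask and the "leftover bits" shift need a
special case once num_bits may reach 64.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-alone bit writer that accepts widths up to 64 bits.
class SketchBitWriter {
 public:
  void PutValue(uint64_t v, int num_bits) {
    assert(num_bits >= 0 && num_bits <= 64);
    if (num_bits == 0) return;
    // Edge case 1: (1 << 64) is undefined, so only mask when num_bits < 64.
    if (num_bits < 64) v &= (uint64_t{1} << num_bits) - 1;

    const int old_offset = bit_offset_;  // always in [0, 63]
    buffered_ |= v << old_offset;
    bit_offset_ += num_bits;

    if (bit_offset_ >= 64) {
      FlushWord();
      bit_offset_ -= 64;
      // Edge case 2: when the word was empty and num_bits == 64, all of v
      // was consumed and (v >> 64) would be undefined.
      const int consumed = 64 - old_offset;  // in [1, 64]
      buffered_ = (consumed == 64) ? 0 : (v >> consumed);
    }
  }

  // Write out any partially filled word and return the encoded bytes.
  std::vector<uint8_t> Finish() {
    const int num_bytes = (bit_offset_ + 7) / 8;
    for (int i = 0; i < num_bytes; ++i) {
      bytes_.push_back(static_cast<uint8_t>(buffered_ >> (8 * i)));
    }
    buffered_ = 0;
    bit_offset_ = 0;
    return bytes_;
  }

 private:
  // Append the full 64-bit word in little-endian byte order.
  void FlushWord() {
    for (int i = 0; i < 8; ++i) {
      bytes_.push_back(static_cast<uint8_t>(buffered_ >> (8 * i)));
    }
    buffered_ = 0;
  }

  std::vector<uint8_t> bytes_;
  uint64_t buffered_ = 0;
  int bit_offset_ = 0;
};
```

Writing a 1-bit value followed by a full 64-bit value exercises the
unaligned path, and writing a 64-bit value into an empty writer exercises
the aligned one; a real patch would also need matching changes on the
reader side.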


On Tue, Aug 9, 2016 at 2:19 PM, James Pirz <james.pirz@gmail.com> wrote:

> Hi,
> I am trying to use RLE encoding in Kudu with fixed bit-width values. I am
> specifically dealing with uint64 values, which are 64 bits wide. I
> realized that the code in BitWriter (which is used when flushing RLE
> runs) only handles values up to 32 bits wide:
> inline void BitWriter::PutValue(uint64_t v, int num_bits) {
>   // TODO: revisit this limit if necessary (can be raised to 64 by
>   // fixing some edge cases)
>   DCHECK_LE(num_bits, 32);
> ...
> Can you please verify whether such a limitation indeed exists in Kudu for
> RLE encoding (i.e. it cannot be applied to fixed-size values longer than
> 32 bits) and, if so, whether there is a workaround?
> (I also checked the RLE code in Impala and parquet-cpp, which share the
> RLE code, and they seem to have the same limitation.)
> Thanks

Todd Lipcon
Software Engineer, Cloudera
