incubator-cassandra-user mailing list archives

From aaron morton <>
Subject Re: Choosing a Partitioner Type for Random java.util.UUID Row Keys
Date Tue, 20 Dec 2011 19:08:16 GMT
	Have you considered using CompositeColumns and a standard CF? The row key is the UUID and
the column name is (timestamp : dir_entry); you can then slice all columns with a particular timestamp.
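
The layout suggested above can be modelled in a few lines. This is only an in-memory sketch of the idea, not Cassandra client code: the `Row` class, its method names, and the integer timestamps are assumptions made for illustration. Column names are (timestamp, dir_entry) tuples kept in sorted order, so every column sharing a timestamp is contiguous and can be read as one slice.

```python
import bisect
import uuid


class Row:
    """Hypothetical in-memory stand-in for one row of a standard CF."""

    def __init__(self):
        self._names = []   # sorted composite column names: (timestamp, dir_entry)
        self._values = {}  # column name -> column value

    def insert(self, timestamp, dir_entry, value):
        name = (timestamp, dir_entry)
        if name not in self._values:
            bisect.insort(self._names, name)  # keep names in comparator order
        self._values[name] = value

    def slice_by_timestamp(self, timestamp):
        # (timestamp,) sorts before every (timestamp, entry) tuple, so these
        # two bounds bracket exactly the columns for that integer timestamp.
        lo = bisect.bisect_left(self._names, (timestamp,))
        hi = bisect.bisect_left(self._names, (timestamp + 1,))
        return [(n, self._values[n]) for n in self._names[lo:hi]]


rows = {}                      # row key (UUID) -> Row
dir_id = uuid.uuid4()          # random row key, as in the thread
row = rows.setdefault(dir_id, Row())
row.insert(1000, "a.txt", b"meta-a")
row.insert(1000, "b.txt", b"meta-b")
row.insert(2000, "a.txt", b"meta-a2")
snapshot = row.slice_by_timestamp(1000)  # both entries of the first snapshot
```

Because the row key stays a plain UUID, this layout does not depend on key ordering across rows, so it works with any partitioner.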

	Even if you have a random key, I would use the RP unless you have an extreme use case. 


Aaron Morton
Freelance Developer

On 21/12/2011, at 3:06 AM, Bryce Allen wrote:

> I think it comes down to how much you benefit from row range scans, and
> how confident you are that going forward all data will continue to use
> random row keys.
> I'm considering using BOP as a way of working around the non-indexed
> super column limitation. In my current schema, row keys are random
> UUIDs, super column names are timestamps, and columns contain a
> snapshot in time of directory contents, and could be quite large. If
> instead I use row keys that are (uuid)-(timestamp), and use a standard
> column family, I can do a row range query and select only specific
> columns. I'm still evaluating if I can do this with BOP - ideally the
> token would just use the first 128 bits of the key, and I haven't found
> any documentation on how it compares keys of different length.
> Another trick with BOP is to use MD5(rowkey)-rowkey for data that has
> non-uniform row keys. I think it's reasonable to use if most data is
> uniform and benefits from range scans, but a few things are added that
> aren't uniform or don't benefit. This trick does make the keys larger,
> which increases storage cost and IO load, so it's probably a bad idea if
> a significant subset of the data requires it.
> Disclaimer - I wrote that wiki article to fill in a documentation gap,
> since there were no examples of BOP and I wasted a lot of time before I
> noticed the hex byte array vs decimal distinction for specifying the
> initial tokens (which to be fair is documented, just easy to miss on a
> skim). I'm also new to Cassandra, so I'm just describing what makes sense
> to me "on paper". FWIW I confirmed that random (type 4) UUID row keys
> really do distribute evenly when using BOP.
> -Bryce
> On Mon, 19 Dec 2011 19:01:00 -0800
> Drew Kutcharian <> wrote:
>> Hey Guys,
>> I just came across
>> and it got me
>> thinking. If the row keys are java.util.UUID which are generated
>> randomly (and securely), then what type of partitioner would be the
>> best? Since the key values are already random, would it make a
>> difference to use RandomPartitioner, or can one use
>> ByteOrderedPartitioner or OrderPreservingPartitioner as well and get
>> the same result?
>> -- Drew
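
The two key layouts discussed in the quoted message, (uuid)-(timestamp) and MD5(rowkey)-rowkey, can be sketched as follows. This is an illustration under stated assumptions, not Cassandra code: it assumes keys are ordered by plain unsigned lexicographic byte comparison (which is how Python orders `bytes`, with a shorter prefix sorting first), and the function names and the 8-byte big-endian timestamp encoding are hypothetical choices.

```python
import hashlib
import struct
import uuid


def snapshot_key(dir_id, timestamp):
    # 16 UUID bytes followed by an 8-byte big-endian timestamp, so keys for
    # one UUID sort together and in timestamp order (assumed encoding).
    return dir_id.bytes + struct.pack(">Q", timestamp)


def md5_prefixed_key(raw_key):
    # MD5(rowkey)-rowkey: a 16-byte hash prefix spreads non-uniform keys
    # evenly, at the cost of 16 extra bytes per key.
    return hashlib.md5(raw_key).digest() + raw_key


def range_scan(sorted_keys, start, end):
    # Stand-in for a row range query over byte-ordered keys (inclusive).
    return [k for k in sorted_keys if start <= k <= end]


dir_a, dir_b = uuid.uuid4(), uuid.uuid4()
keys = sorted(snapshot_key(d, ts)
              for d in (dir_a, dir_b)
              for ts in (3000, 1000, 2000))

# Every key for dir_a lies between its bare 16-byte prefix and that prefix
# followed by the largest possible 8-byte timestamp.
a_keys = range_scan(keys, dir_a.bytes, dir_a.bytes + b"\xff" * 8)
timestamps = [struct.unpack(">Q", k[16:])[0] for k in a_keys]

hashed = md5_prefixed_key(b"user:alice")  # hypothetical non-uniform key
```

Under that ordering assumption, the first 16 bytes dominate the comparison, so the random UUID prefix still determines where a key falls in the ring while the timestamp suffix only orders rows within one UUID.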
