cassandra-user mailing list archives

From Mark Robson <mar...@gmail.com>
Subject Re: OrderPreservingPartitioner limits and workarounds
Date Thu, 08 Apr 2010 14:08:35 GMT
On 7 April 2010 19:13, Jonathan Ellis <jbellis@gmail.com> wrote:

> One thing you can do is manually "randomize" keys for any CFs that
> don't need the OP by pre-pending their md5 to the key you send
> Cassandra.  (This is all RP is doing under the hood anyway.)
>
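
For reference, a minimal sketch of that manual randomization in Python. The function name `randomized_key` and the choice to keep the original key after the digest are illustrative assumptions, not anything Cassandra provides:

```python
import hashlib


def randomized_key(key: str) -> str:
    """Prepend the hex md5 of the key to the key itself (hypothetical helper).

    Under an order-preserving partitioner the 32 hex digits at the front
    decide placement, so rows spread out much as they would under RP.
    The original key is kept at the end so it can still be read back.
    """
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return digest + key
```

Range scans over such keys are no longer meaningful, which is why this only suits CFs that don't need the OP.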

Another possibility is to prefix each key with a hash of some attribute that
you don't need to range scan across.

For example, if you have thousands of customers, and each one only needs to
range scan over its own data, you can hash the customer ID and put that at
the beginning of the key. (I use a 16-bit hash written as four hex digits; it
gives enough distribution with a sane number of nodes.)

Then your keys will start with a value from 0000 to ffff, followed by
whatever your increasing key is (a timestamp, etc.). Workloads should tend to
balance out, but will get a bit patchy if you have, for example, a small
number of disproportionately huge customers.
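
A rough sketch of the idea in Python. The names `hash_prefix`, `prefixed_key` and `scan_bounds` are made up for illustration, and so are the use of md5 as the hash and the `:` separator. Including the customer ID after the prefix is also my assumption here: a 16-bit prefix can collide between customers, so the ID keeps their rows apart.

```python
import hashlib


def hash_prefix(customer_id: str) -> str:
    """First 16 bits of the md5 of the customer ID, as four hex digits."""
    return hashlib.md5(customer_id.encode("utf-8")).digest()[:2].hex()


def prefixed_key(customer_id: str, ordered_part: str) -> str:
    """Key layout (assumed): <4 hex digits><customer_id>:<ordered_part>."""
    return f"{hash_prefix(customer_id)}{customer_id}:{ordered_part}"


def scan_bounds(customer_id: str) -> tuple[str, str]:
    """Start (inclusive) and end (exclusive) keys covering one customer.

    ';' is the character right after ':' in ASCII, so every key built by
    prefixed_key for this customer sorts between the two bounds.
    """
    base = f"{hash_prefix(customer_id)}{customer_id}"
    return base + ":", base + ";"
```

All of one customer's keys share the same prefix, so they stay contiguous and keep their timestamp order, while different customers land at different points in the 0000 to ffff range.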

Mark
