incubator-cassandra-user mailing list archives

From Vivek Mishra <mishra.v...@gmail.com>
Subject Re: Exactly one wide row per node for a given CF?
Date Wed, 04 Dec 2013 07:32:18 GMT
So basically you want to create a cluster of multiple unique keys, but the
data that belongs to one unique key should be colocated. Correct?

-Vivek


On Tue, Dec 3, 2013 at 10:39 AM, onlinespending <onlinespending@gmail.com> wrote:

> Subject says it all. I want to be able to randomly distribute a large set
> of records but keep them clustered in one wide row per node.
>
> As an example, let’s say I’ve got a collection of about 1 million records
> each with a unique id. If I just go ahead and set the primary key (and
> therefore the partition key) as the unique id, I’ll get very good random
> distribution across my server cluster. However, each record will be its own
> row. I’d like to have each record belong to one large wide row (per server
> node) so I can have them sorted or clustered on some other column.
>
> If I have, say, 5 nodes in my cluster, I could randomly assign a value of
> 1 to 5 at the time of creation and have the partition key set to this value.
> But this becomes troublesome if I add or remove nodes. What I effectively
> want is to partition on the unique id of the record modulo N (id % N,
> where N is the number of nodes).
>
> I have to imagine there’s a mechanism in Cassandra to simply randomize the
> partitioning without even using a key (and then clustering on some column).
>
> Thanks for any help.
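
To make the "id % N" idea in the quoted message concrete, here is a minimal Python sketch of deriving a bucket value from a record id, along with a rough measure of why changing N is troublesome. The function names (bucket_for, fraction_remapped) are hypothetical and exist only for this illustration; nothing here is a Cassandra API, and the numeric ids are an assumption, since the thread does not say what the unique ids look like.

```python
def bucket_for(record_id: int, num_buckets: int) -> int:
    """Return the bucket (candidate partition-key value) for a record: id % N."""
    return record_id % num_buckets


def fraction_remapped(record_ids, old_n: int, new_n: int) -> float:
    """Fraction of records whose bucket changes when N goes from old_n to new_n."""
    ids = list(record_ids)
    moved = sum(1 for i in ids if bucket_for(i, old_n) != bucket_for(i, new_n))
    return moved / len(ids)


# Assumption: ids are the integers 0 .. 999,999 (about 1 million records).
ids = range(1_000_000)

print(bucket_for(42, 5))             # bucket for record 42 with N = 5
print(fraction_remapped(ids, 5, 6))  # share of records that move when N: 5 -> 6
```

With sequential ids, going from 5 to 6 buckets moves roughly five out of every six records to a different bucket, which is the trouble with tying N to the node count. Also note that with a hashing partitioner the bucket value itself is hashed to pick a node, so bucket k is not pinned to node k, and two buckets may well land on the same node.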
