cassandra-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Jack Krupansky" <>
Subject Re: Data partitioning and composite partition key
Date Sat, 30 Aug 2014 00:46:28 GMT
But you already said that your have “very wide rows”, so pulling massive amounts of data
off a single node is very likely to completely dwarf the connect time. Again, doing the gets
in parallel from multiple nodes, with parallel requests, would be so much more performant.
How many nodes are we talking about?

One of the secrets of Cassandra is to use more, smaller requests in parallel, rather than
massive requests to a single coordinator node.

-- Jack Krupansky

From: Drew Kutcharian 
Sent: Friday, August 29, 2014 8:28 PM
Subject: Re: Data partitioning and composite partition key

Mainly lower latency and (network overhead) in multi-get requests (WHERE IN (….)). The coordinator
needs to connect only to one node vs potentially all the nodes in the cluster. 

On Aug 29, 2014, at 5:23 PM, Jack Krupansky <> wrote:

  Okay, but what benefit do you think you get from having the partitions on the same node
– since they would be separate partitions anyway? I mean, what exactly do you think you’re
going to do with them, that wouldn’t be a whole lot more performant by being able to process
data in parallel from separate nodes? I mean, the whole point of Cassandra is scalability
and distributed processing, right?

  -- Jack Krupansky

  From: Drew Kutcharian 
  Sent: Friday, August 29, 2014 7:31 PM
  Subject: Re: Data partitioning and composite partition key

  Hi Jack, 

  I think you missed the point of my email which was trying to avoid the problem of having
very wide rows :)  In the notation of sensorId-datatime, the datatime is a datetime bucket,
say a day. The CQL rows would still be keyed by the actual time of the event. So you’d end
up having SesonId->Datetime Bucket (day/week/month)->actual event. What I wanted to
be able to do was to colocate all the events related to a sensor id on a single node (token).

  See "High Throughput Timelines” at

  - Drew

  On Aug 29, 2014, at 3:58 PM, Jack Krupansky <> wrote:

    With CQL3, you, the developer, get to decide whether to place a primary key column in
the partition key or as a clustering column. So, make sensorID the partition key and datetime
as a clustering column.

    -- Jack Krupansky

    From: Drew Kutcharian 
    Sent: Friday, August 29, 2014 6:48 PM
    Subject: Data partitioning and composite partition key

    Hey Guys, 

    AFAIK, currently Cassandra partitions (thrift) rows using the row key, basically uses
the hash(row_key) to decide what node that row needs to be stored on. Now there are times
when there is a need to shard a wide row, say storing events per sensor, so you’d have sensorId-datetime
row key so you don’t end up with very large rows. Is there a way to have the partitioner
use only the “sensorId” part of the row key for the hash? This way we would be able to
store all the data relating to a sensor in one node.

    Another use case of this would be multi-tenancy:

    Say we have accounts and accounts have users. So we would have the following tables:

    CREATE TABLE account (
      id                     timeuuid PRIMARY KEY,
      company         text      //timezone

    CREATE TABLE user (
      id              timeuuid PRIMARY KEY, 
      accountId timeuuid,
      email        text,
      password text

    // Get users by account
    CREATE TABLE user_account_index (
      accountId  timeuuid,
      userId        timeuuid,
      PRIMARY KEY(acid, id)

    Say I want to get all the users that belong to an account. I would first have to get the
results from user_account_index and then use a multi-get (WHERE IN) to get the records from
user table. Now this multi-get part could potentially query a lot of different nodes in the
cluster. It’d be great if there was a way to limit storage of users of an account to a single
node so that way multi-get would only need to query a single node. 

    Note that the problem cannot be simply fixed by using (accountId, id) as the primary key
for the user table since that would create a problem of having a very large number of (thrift)
rows in the users table.

    I did look thru the code and JIRA and I couldn’t really find a solution. The closest
I got was to have a custom partitioner, but then you can’t have a partitioner per keyspace
and that’s not even something that’d be implemented in future based on the following JIRA:

    Any ideas are much appreciated.



View raw message