Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@cassandra.apache.org
Received-SPF: pass (nike.apache.org: domain of
 SRS0=FFWQ5C=5Y=basetechnology.com=jack@yourhostingaccount.com designates
 65.254.254.75 as permitted sender)
Message-ID: <787B79DEF3BB4984873824622933161F@JackKrupansky14>
From: "Jack Krupansky" <jack@basetechnology.com>
To: <user@cassandra.apache.org>
References: <94971CB5-9CE3-4119-A788-8CF1D4797EB5@venarc.com>
 <F5AF7155B132490A83A5FFFC4CB60BA9@JackKrupansky14>
 <F4A19611-56EF-4737-86CF-492259AB74C1@venarc.com>
In-Reply-To: <F4A19611-56EF-4737-86CF-492259AB74C1@venarc.com>
Subject: Re: Data partitioning and composite partition key
Date: Fri, 29 Aug 2014 20:23:02 -0400
MIME-Version: 1.0
Content-Type: multipart/alternative;
	boundary="----=_NextPart_000_0619_01CFC3C7.083430A0"
Importance: Normal
Sender: "Jack Krupansky" <jack@basetechnology.com>

This is a multi-part message in MIME format.

------=_NextPart_000_0619_01CFC3C7.083430A0
Content-Type: text/plain;
	charset="Windows-1252"
Content-Transfer-Encoding: quoted-printable

Okay, but what benefit do you think you get from having the partitions
on the same node, since they would be separate partitions anyway? I
mean, what exactly do you think you're going to do with them that
wouldn't be a whole lot more performant if you could process the data
in parallel from separate nodes? After all, the whole point of Cassandra
is scalability and distributed processing, right?

-- Jack Krupansky

From: Drew Kutcharian
Sent: Friday, August 29, 2014 7:31 PM
To: user@cassandra.apache.org
Subject: Re: Data partitioning and composite partition key

Hi Jack,

I think you missed the point of my email, which was about avoiding the
problem of very wide rows :)  In the sensorId-datetime notation, the
datetime is a datetime bucket, say a day. The CQL rows would still be
keyed by the actual time of the event, so you'd end up with
sensorId -> datetime bucket (day/week/month) -> actual event. What I
wanted to be able to do was colocate all the events related to a
sensor ID on a single node (token).

See "High Throughput Timelines" at
http://www.datastax.com/dev/blog/advanced-time-series-with-cassandra
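
A minimal sketch of the bucketed layout described above, in CQL3. The
table name sensor_events, the column names, and the text encoding of the
day bucket are assumptions made for illustration, not taken from the
linked post:

```cql
// Hypothetical table: one partition per (sensor, day bucket).
CREATE TABLE sensor_events (
  sensor_id  text,
  day        text,       // datetime bucket, e.g. '2014-08-29' (assumed format)
  event_time timestamp,  // actual time of the event
  payload    text,
  PRIMARY KEY ((sensor_id, day), event_time)
);

// Read one sensor's events for one day bucket.
SELECT event_time, payload
  FROM sensor_events
 WHERE sensor_id = 'sensor-1' AND day = '2014-08-29';
```

Note that the partitioner hashes the whole partition key, so each
(sensor_id, day) pair gets its own token, and the buckets for one sensor
are not guaranteed to land on the same node.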

- Drew


On Aug 29, 2014, at 3:58 PM, Jack Krupansky <jack@basetechnology.com>
wrote:


  With CQL3, you, the developer, get to decide whether a primary key
  column goes in the partition key or serves as a clustering column. So,
  make sensorID the partition key and datetime a clustering column.
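
  As a rough sketch of that suggestion (the table and column names here
  are hypothetical, chosen only for illustration):

  ```cql
  // sensor_id alone is the partition key; event_time is a clustering column.
  CREATE TABLE sensor_events_by_sensor (
    sensor_id  text,
    event_time timestamp,
    payload    text,
    PRIMARY KEY (sensor_id, event_time)
  );

  // Time-range slice within a single sensor's partition.
  SELECT event_time, payload
    FROM sensor_events_by_sensor
   WHERE sensor_id = 'sensor-1'
     AND event_time >= '2014-08-29 00:00:00'
     AND event_time <  '2014-08-30 00:00:00';
  ```

  With this layout every event for a sensor lives in one partition, so it
  is stored on one set of replicas, but that partition keeps growing for
  as long as the sensor reports.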

  -- Jack Krupansky

  From: Drew Kutcharian
  Sent: Friday, August 29, 2014 6:48 PM
  To: user@cassandra.apache.org
  Subject: Data partitioning and composite partition key

  Hey Guys,

  AFAIK, Cassandra currently partitions (thrift) rows by row key:
  basically, it uses hash(row_key) to decide which node a row is stored
  on. Now, there are times when you need to shard a wide row. Say you're
  storing events per sensor: you'd use a sensorId-datetime row key so
  you don't end up with very large rows. Is there a way to have the
  partitioner use only the "sensorId" part of the row key for the hash?
  That way we would be able to store all the data relating to a sensor
  on one node.

  Another use case of this would be multi-tenancy:

  Say we have accounts, and accounts have users. We would then have the
  following tables:

  CREATE TABLE account (
    id      timeuuid PRIMARY KEY,
    company text      //timezone
  );

  CREATE TABLE user (
    id        timeuuid PRIMARY KEY,
    accountId timeuuid,
    email     text,
    password  text
  );

  // Get users by account
  CREATE TABLE user_account_index (
    accountId timeuuid,
    userId    timeuuid,
    PRIMARY KEY (accountId, userId)
  );

  Say I want to get all the users that belong to an account. I would
  first have to get the results from user_account_index and then use a
  multi-get (WHERE IN) to fetch the records from the user table. This
  multi-get could potentially query a lot of different nodes in the
  cluster. It'd be great if there were a way to limit storage of an
  account's users to a single node, so that the multi-get would only
  need to query a single node.
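
  The two-step lookup described above might look roughly like this,
  assuming the index table is keyed by (accountId, userId); the ? bind
  markers stand for values supplied through a prepared statement:

  ```cql
  // Step 1: list the user ids for one account (a single partition read).
  SELECT userId FROM user_account_index WHERE accountId = ?;

  // Step 2: multi-get the user rows; each id is its own partition key,
  // so the coordinator may have to contact many different nodes.
  SELECT id, email FROM user WHERE id IN (?, ?, ?);
  ```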

  Note that the problem cannot simply be fixed by using (accountId, id)
  as the primary key for the user table, since that would create a
  problem of having a very large number of (thrift) rows in the users
  table.

  I did look through the code and JIRA and couldn't really find a
  solution. The closest I got was a custom partitioner, but you can't
  have a partitioner per keyspace, and that's not even something that
  would be implemented in the future, based on the following JIRA:
  https://issues.apache.org/jira/browse/CASSANDRA-295

  Any ideas are much appreciated.

  Best,

  Drew

------=_NextPart_000_0619_01CFC3C7.083430A0--