Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@cassandra.apache.org
Received-SPF: pass (athena.apache.org: local policy)
MIME-Version: 1.0
Content-Type: multipart/alternative;
	boundary="----_=_NextPart_001_01CAF03A.CA2C1458"
Subject: RE: trying to make my ideas clear about partionning...
Date: Mon, 10 May 2010 14:17:42 +0200
In-Reply-To: <AANLkTil1Vha_OH63ba5etCGB2TAP7dwAhmcAeSXVvFKX@mail.gmail.com>
Thread-Topic: trying to make my ideas clear about partionning...
Thread-Index: AcrwMmng7+zotUKqTPS9s7qwnrjDEgACArqw
References: <AANLkTil1Vha_OH63ba5etCGB2TAP7dwAhmcAeSXVvFKX@mail.gmail.com>
From: =?iso-8859-1?Q?Dr=2E_Martin_Grabm=FCller?=
 <Martin.Grabmueller@eleven.de>
To: <user@cassandra.apache.org>
Message-ID: <1OBRva-0002wW-Mf@mail.eleven.de>

This is a multi-part message in MIME format.

------_=_NextPart_001_01CAF03A.CA2C1458
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

Partitioning is done only for row keys; the part of your message about
keys and partitioning is correct.
There is no partitioning for columns: all columns for a particular key
are stored on the same node (plus replicas, of course, which are stored
on different nodes). The CompareWith option for column families only
affects the ordering of columns/supercolumns, not the partitioning.
=20
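To make the distinction concrete, here is a minimal Python sketch. It is
not Cassandra code: the four-node ring, the node names, the sample row,
and the helpers token_for, node_for and sorted_columns are all made up
for illustration. It only assumes an MD5-based hash in the spirit of the
RandomPartitioner and a plain string comparison for column names.

```python
import bisect
import hashlib

# Hypothetical four-node ring: each node owns the token range ending at its token.
RING = [
    (2**125, "node-a"),
    (2**126, "node-b"),
    (3 * 2**125, "node-c"),
    (2**127, "node-d"),
]
TOKENS = [token for token, _ in RING]


def token_for(row_key: str) -> int:
    """Hash the row key to a token (MD5-based, in the spirit of RandomPartitioner)."""
    digest = hashlib.md5(row_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2**127)


def node_for(row_key: str) -> str:
    """Pick the first node whose token is >= the key's token, wrapping around."""
    idx = bisect.bisect_left(TOKENS, token_for(row_key))
    return RING[idx % len(RING)][1]


def sorted_columns(row: dict) -> list:
    """Order columns by name, roughly what a string comparator would do."""
    return sorted(row.items())


# Illustrative row: the placement depends on the row key alone.
row_key = "user:42"
row = {"name": "Oliv", "email": "o@example.org", "city": "Paris"}

owner = node_for(row_key)             # one node for the whole row
columns_on_owner = sorted_columns(row)  # ordering only, no second partitioning
```

The column names never enter node_for, so every column of the row lands
on the same node; the comparator only decides the order in which they
are kept there.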
Cheers,
  Martin


________________________________

	From: Olivier Mallassi [mailto:omallassi@octo.com]
	Sent: Monday, May 10, 2010 1:17 PM
	To: user@cassandra.apache.org
	Subject: trying to make my ideas clear about partionning...

	Hi all,

	I am trying to get a clear picture of how partitioning works in
	Cassandra.

	Here is what I understood; please correct me if I am wrong.

	- Row keys are partitioned according to the partitioning strategy you
	choose (random, order-preserving, or custom if you implement the
	IPartitioner interface). One partitioning strategy is defined per
	cluster (in fact for each node of the cluster, but the configuration
	should be the same on all of them).
	The order-preserving partitioner is better for range queries because
	keys are stored sequentially, so selecting a range of keys hits fewer
	nodes than with the RandomPartitioner.

	- Once this first partitioning is done, a second one is done based on
	the Column (or SuperColumn) name and the CompareWith you defined for
	the ColumnFamily.

	Am I right?
	Am I wrong to say that the different columns of the same ColumnFamily
	are potentially stored on different nodes, so that reading a complete
	row hits several nodes?
	Is there a way to control how columns are stored?

	Thanks for your help.

	Oliv/


------_=_NextPart_001_01CAF03A.CA2C1458--