Subject: RE: trying to make my ideas clear about partionning...
Date: Mon, 10 May 2010 14:17:42 +0200
From: Dr. Martin Grabmüller
Reply-To: user@cassandra.apache.org
Message-ID: <1OBRva-0002wW-Mf@mail.eleven.de>

Partitioning is only done for row keys; the part of your message about keys and partitioning is correct.
There is no partitioning for columns: all columns for a particular key are stored on the same node (plus replicas, of course, which are stored on different nodes).
The CompareWith option for column families only affects the ordering of columns/supercolumns, not the partitioning.
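
To make the distinction concrete, here is a minimal Python sketch, not Cassandra code. The four-node ring, its tokens, the MD5-based `token_for` helper, and the `write`/`read_row` functions are all hypothetical stand-ins, and replicas are left out. It only illustrates that placement depends on the row key alone, while a comparator (the role CompareWith plays) changes nothing but the column order inside the row.

```python
import hashlib
from bisect import bisect_left

# Hypothetical 4-node ring; tokens are invented, evenly spaced over [0, 2**127].
RING = [(i * 2**125, name) for i, name in
        enumerate(["node-A", "node-B", "node-C", "node-D"], start=1)]


def token_for(row_key: str) -> int:
    # Simplified stand-in for a hash-based partitioner: MD5 of the row key.
    digest = hashlib.md5(row_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2**127)


def node_for(row_key: str) -> str:
    # The first node whose token is >= the key's token owns the row.
    tokens = [t for t, _ in RING]
    idx = bisect_left(tokens, token_for(row_key)) % len(RING)
    return RING[idx][1]


def write(cluster: dict, row_key: str, column: str, value: str) -> None:
    # Placement uses the row key only; the column name never enters into it.
    cluster.setdefault(node_for(row_key), {}).setdefault(row_key, {})[column] = value


def read_row(cluster: dict, row_key: str, sort_key=str) -> list:
    # The whole row lives on one node; sort_key only orders columns within it.
    row = cluster.get(node_for(row_key), {}).get(row_key, {})
    return sorted(row.items(), key=lambda kv: sort_key(kv[0]))


cluster = {}
for col in ["10", "9", "2"]:
    write(cluster, "user42", col, "v" + col)

nodes_holding_row = [n for n, rows in cluster.items() if "user42" in rows]
print(nodes_holding_row)                   # a single node, whatever the columns
print(read_row(cluster, "user42"))         # text order: "10", "2", "9"
print(read_row(cluster, "user42", int))    # numeric order: "2", "9", "10"
```

Switching the comparator from `str` to `int` reorders the columns but never moves them to another node.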
 
Cheers,
  Martin


From: Olivier Mallassi [mailto:omallassi@octo.com]
Sent: Monday, May 10, 2010 1:17 PM
To: user@cassandra.apache.org
Subject: trying to make my ideas clear about partionning...

Hi all,

I am trying to make my ideas clear about how partitioning works in Cassandra.

Here is what I understood, please correct me if I am wrong.

- Row keys are partitioned based on the partitioning strategy you choose (random, order-preserving, or custom if you implement the IPartitioner interface). One partitioning strategy is defined per cluster (in fact for each node of the cluster, but the configuration should be the same everywhere, so...)
Order-preserving partitioning is better for range queries because the keys are stored sequentially, so when selecting a range of keys you hit fewer nodes than with the RandomPartitioner.

- Once this first partitioning is done, a second one is done based on the Column (or SuperColumn) name and the CompareWith you defined for the ColumnFamily.
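
As an illustration of the first point only (row-key partitioning and range queries), here is a rough Python sketch. The node names, the tokens, and the `opp_node`/`rp_node` helpers are assumptions made up for the example, and the MD5-based hashing is a simplified stand-in for a random partitioner, not Cassandra's implementation.

```python
import hashlib
from bisect import bisect_left

NODES = ["node-A", "node-B", "node-C", "node-D"]

# Order-preserving (hypothetical tokens): the token is the key itself,
# so each node owns a contiguous slice of the key space.
OPP_TOKENS = ["key025", "key050", "key075", "key100"]

# Random (hypothetical tokens): evenly spaced over the hash space [0, 2**127].
RP_TOKENS = [i * 2**125 for i in range(1, 5)]


def opp_node(row_key: str) -> str:
    # Keys are compared as strings, so neighbouring keys share a node.
    return NODES[bisect_left(OPP_TOKENS, row_key) % len(NODES)]


def rp_node(row_key: str) -> str:
    # Keys are hashed first, so neighbouring keys are scattered over the ring.
    digest = hashlib.md5(row_key.encode("utf-8")).digest()
    token = int.from_bytes(digest, "big") % (2**127)
    return NODES[bisect_left(RP_TOKENS, token) % len(NODES)]


wanted = [f"key{i:03d}" for i in range(10, 20)]   # key010 .. key019
opp_nodes = {opp_node(k) for k in wanted}
rp_nodes = {rp_node(k) for k in wanted}

print(sorted(opp_nodes))   # ['node-A']: the whole range sits on one node
print(sorted(rp_nodes))    # depends on the hashes; typically several nodes
```

With the order-preserving tokens, all ten keys fall below `key025` and land on the same node, while hashing them gives no such locality.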

Am I right?
Am I wrong if I say that the different columns of the same ColumnFamily are potentially stored on different nodes? So if I wanna read a complete row, I hit several nodes.
Is there a way of controlling the way Columns are stored?

Thanks for your help.

Oliv/
