Subject: Re: Trying To Understand get_range_slices Results When Using RandomPartitioner
From: Schubert Zhang
To: user@cassandra.apache.org
Date: Mon, 26 Apr 2010 23:55:36 +0800

RandomPartitioner applies to row keys only.

#1 no
#2 yes
#3 yes

On Sat, Apr 24, 2010 at 4:33 AM, Larry Root wrote:

> I am trying to better understand how using the RandomPartitioner will affect
> my ability to select ranges of keys. Consider my simple example where we
> have many online games across different game genres (GameType). These games
> need to store data for each one of their users. With that in mind, consider
> the following data model:
>
> enum GameType {'RPG', 'FPS', 'ARCADE'}
>
> {
>     "GameData": {                       // Super Column Family
>
>         GameType+"1234": {              // Row (concat gametype with a game id, for example)
>             "user-data:5678": {         // Super column (user data)
>                 "user_prop_name": "value",     // Subcolumn (arbitrary user properties and values)
>                 "another_prop_name": "value",
>                 ...
>             },
>             "user-data:9012": {
>                 "user_prop_name": "value",
>                 ...
>             }
>         },
>
>         GameType+"3456": {...},
>         GameType+"7890": {...},
>         ...
>     }
> }
>
> Assume we have a multi-node cluster running Cassandra 0.6.1. In that
> scenario, could someone help me understand what the result would be in the
> following cases:
>
>    1. We use a range slice to grab keys for all 'RPG' games (range slice
>    at the ROW level). Would we be able to get all games back in a single
>    query, or would that not be guaranteed?
>
>    2. For a given game, we use a range slice to grab all user-data keys in
>    which the ID starts with '5' (range slice at the COLUMN level). Again,
>    would we be able to get all keys in one call (assuming the number of keys
>    in the result was not an issue)?
>
>    3. Finally, for a given game and a given user, we do a range slice to
>    grab all user properties that start with 'a' (range slice at the
>    SUBCOLUMN level of a SUPERCOLUMN). Is that possible in one call?
>
> I'm trying to understand at what level the RandomPartitioner affects my
> example data model. Is it at a fixed level, like just ROWS (the sub data is
> fixed to the same node), or is all data at every level *randomized* across
> all nodes?
>
> Are there any tricks to doing these sorts of range slices using RP? For
> example, if I set my consistency level to 'ALL' when doing a range slice,
> would that effectively compile a complete result set for me?
>
> Thanks for the help!
>
> larry

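
To illustrate why the answer to #1 is no while #2 and #3 are yes, here is a minimal Python sketch. It is not Cassandra code: the token() function only approximates the RandomPartitioner idea (an integer derived from the MD5 hash of the row key), the sample keys are made up, and slice_columns() is a hypothetical helper that uses a half-open range for simplicity. The point it tries to show is that rows are walked in hash order, so keys sharing a prefix are not adjacent, whereas (super)column names inside one row are kept sorted, so a name-prefix slice is a contiguous run.

```python
import hashlib
from bisect import bisect_left


def token(row_key: str) -> int:
    """Approximate a RandomPartitioner token: an integer from the MD5 of the row key.

    Assumption: this is a simplification for illustration, not Cassandra's exact token math.
    """
    return int.from_bytes(hashlib.md5(row_key.encode("utf-8")).digest(), "big")


# Hypothetical row keys built as GameType + game id, as in the data model above.
row_keys = ["RPG1234", "RPG3456", "FPS7890", "ARCADE1111", "RPG9999", "FPS2222"]

# Under hash partitioning, a row range scan follows token order, not key order,
# so the 'RPG...' keys are generally scattered among the others.
by_token = sorted(row_keys, key=token)


def slice_columns(sorted_names, start, finish):
    """Return the names in [start, finish) from an already-sorted list (hypothetical helper)."""
    lo = bisect_left(sorted_names, start)
    hi = bisect_left(sorted_names, finish)
    return sorted_names[lo:hi]


# Super column names within a single row are stored in sorted order,
# so everything starting with 'user-data:5' sits in one contiguous range.
super_columns = sorted(
    ["user-data:5678", "user-data:9012", "user-data:5001", "user-data:1234"]
)
starts_with_5 = slice_columns(super_columns, "user-data:5", "user-data:6")

if __name__ == "__main__":
    print("row scan order (by token):", by_token)
    print("super columns starting with '5':", starts_with_5)
```

The same reasoning applies to subcolumns inside a super column (#3): they are sorted within their parent, so a slice from 'a' up to 'b' would cover the properties starting with 'a'.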