From: Aaron Morton
To: user@cassandra.apache.org
Subject: Re: key types and grouping related rows together
Date: Thu, 15 Jul 2010 21:37:26 -0700 (PDT)
Message-id: <2f38782b-04a3-4fdb-f48f-8837a659b6b4@me.com>

Yes, you need to maintain the secondary index yourself. Send a batch_mutation and write the article and website article columns at the same time.

I think you're safe up to a large number of cols, say 1M. Not sure; I may try to track the info down one day.
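
For illustration, a minimal sketch of that dual write. It uses plain Python dicts as stand-ins for the two column families rather than a real Cassandra client, and the function name save_article is hypothetical. The point is only that the article row and its index column are built as one batch and applied together.

```python
import uuid

# In-memory stand-ins for the two column families (not a Cassandra client).
articles = {}          # article_id -> {column name: value}
website_articles = {}  # website_id -> {time_uuid: article_id}


def save_article(website_id, article_id, title, body):
    """Write the article row and its index column as one logical batch."""
    time_uuid = uuid.uuid1()  # version 1 UUIDs carry a timestamp
    # Build both mutations first, then apply them together. Against a real
    # cluster a single batch call would play this role.
    batch = [
        (articles, article_id, {"title": title, "body": body}),
        (website_articles, website_id, {time_uuid: article_id}),
    ]
    for column_family, row_key, columns in batch:
        column_family.setdefault(row_key, {}).update(columns)
    return time_uuid


column_name = save_article(
    "website_id1", "article_id1", "some title", "this is my article body..."
)
```

If the index write were sent separately and failed, the article would exist but never show up in the per-website listing, which is the reason to send both in the same batch.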

A
On 16 Jul, 2010, at 03:39 PM, S Ahmed <sahmed1020@gmail.com> wrote:


So am I to keep track of the number of columns for a given key in CF WebsiteArticle? i.e. if I want to do a get_slice for the first 10 OR last 10 (I would need to know the count to get the last 10).

>> Am assuming RP. There are some recommendations on the number of cols per key, in the millions I think. I can never find it when I want it.
So I would have to potentially split the columns across another key then, correct? e.g. website_id1, website_id1-2

On Thu, Jul 15, 2010 at 8:17 PM, Aaron Morton <aaron@thelastpickle.com> wrote:

You could build a secondary index, e.g.

CF
Articles : {
    article_id1 : {}
    article_id2 : {}
}

CF
WebsiteArticle : {
    website_id1 : { time_uuid : article_id1, time_uuid2 : article_id2 }
}

When you want to get the last 10 for a website, get_slice from the WebsiteArticle CF, then multiget from Articles.
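
A rough sketch of that two-step read, again with in-memory dicts instead of a real client. The helpers get_slice and multiget here are simplified, hypothetical imitations of the Thrift calls, and integer timestamps stand in for the TimeUUID column names. As far as I recall, the Thrift SliceRange has a reversed flag, so a slice can start from the newest column without knowing how many columns the row holds.

```python
# In-memory stand-ins for the two column families (not a Cassandra client).
# Integer timestamps stand in for the TimeUUID column names.
articles = {f"article_id{n}": {"title": f"title {n}"} for n in range(1, 26)}
website_articles = {
    "website_id1": {1000 + n: f"article_id{n}" for n in range(1, 26)}
}


def get_slice(row, count, reversed_order=False):
    """Return up to `count` (name, value) columns in comparator order."""
    names = sorted(row, reverse=reversed_order)
    return [(name, row[name]) for name in names[:count]]


def multiget(column_family, keys):
    """Fetch several rows by key in one call."""
    return {key: column_family[key] for key in keys if key in column_family}


def latest_articles(website_id, count=10):
    """Read the newest index columns, then fetch the articles they name."""
    index_row = website_articles.get(website_id, {})
    # A reversed slice starts at the newest column, so the total number of
    # columns in the row never has to be known.
    newest = get_slice(index_row, count, reversed_order=True)
    article_ids = [article_id for _, article_id in newest]
    rows = multiget(articles, article_ids)
    return [(article_id, rows[article_id]) for article_id in article_ids]


latest = latest_articles("website_id1")
```

The order of the result comes from the index row, not from multiget, so the list is rebuilt from article_ids after the second call.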

I'm assuming the RandomPartitioner (RP). There are some recommendations on the number of cols per key, in the millions I think. I can never find it when I want it.

You could try a key of "website_id.timestamp" and try a get range using the Random Partitioner. The results will be unordered, but that's OK so long as you get the ones you want.
=0A
Aaron


On 16 Jul, 2010, at 09:08 AM, S Ahmed <sahmed1020@gmail.com> wrote:

Given a CF like:

Articles : {
    key1 : { title: "some title", body: "this is my article body...", .... },
    key2 : { title: "some title", body: "this is my article body...", .... }
}

Now these articles could be for different websites e.g. www.website1.com, www.website2.com

If I want to get the latest 10 articles for a given website, how would I formulate my key to achieve this?

I basically need to understand how to handle multi-tenancy, b/c I will need to do this for almost all my CFs.

I'm a little stuck here so guidance would be great!


On Thu, Jul 15, 2010 at 4:01 PM, S Ahmed <sahmed1020@gmail.com> wrote:

Benjamin,

Ah, thanks for clarifying that.

Key sorting is changing in 0.7, I believe, to support a binary array?


On Thu, Jul 15, 2010 at 3:26 PM, Benjamin Black <b@b3k.us> wrote:

Keys are always sorted (in 0.6) as UTF8 strings. The CompareWith
applies to _columns_ within rows, _not_ to row keys.
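
To make the string ordering concrete, a small sketch of how prefixed keys group together when compared as strings. The ":" separator, the zero-padding width, and the helper names make_key and keys_with_prefix are assumptions for illustration, not anything from Cassandra. This kind of contiguity only helps range queries under an order-preserving partitioner; with the RandomPartitioner rows are placed by a hash of the key.

```python
import bisect


def make_key(website_id, kind, item_id, width=10):
    """Build a row key such as 'website1123:content0000000009'."""
    # Zero-padding keeps numeric ids in numeric order under string comparison.
    return f"website{website_id}:{kind}{item_id:0{width}d}"


def keys_with_prefix(sorted_keys, prefix):
    """Return the contiguous run of keys that start with `prefix`."""
    start = bisect.bisect_left(sorted_keys, prefix)
    # '\uffff' sorts after any character used in these keys.
    end = bisect.bisect_left(sorted_keys, prefix + "\uffff")
    return sorted_keys[start:end]


keys = sorted(
    [
        make_key(3454, "user", 7),
        make_key(1123, "content", 10),
        make_key(1123, "content", 9),
        make_key(3454, "user", 12),
        make_key(1123, "content", 100),
    ]
)

tenant_keys = keys_with_prefix(keys, "website1123:")
```

Without the padding, "content10" would sort before "content9", and without the separator a prefix for website112 would also match website1123.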

On Wed, Jul 14, 2010 at 1:44 PM, S Ahmed <sahmed1020@gmail.com> wrote:
> Where is the link that describes the various key types and their impact on
> sorting? (I believe I read it before, can't seem to find it now).
> So my application supports multi-tenants, so I need the keys to represent
> things like:
> website1123 + contentID
> or
> website3454 + userID
> And for range queries, these keys have to be grouped together obviously.
> What key type would be best suited for this?
>
>
> I might have to create a CF that maps the website and its key prefix?