From: S Ahmed
To: user@cassandra.apache.org
Date: Fri, 2 Jul 2010 09:26:48 -0500
Subject: Re: facebook search index super column, do I have this correct?

Actually, I think in the video they said they store each message ID as a
separate column, so that they can do range queries. Is that correct?

So it would be:

aloha: { message1: "2343", message2: "9590002", .... }

On Thu, Jul 1, 2010 at 6:25 PM, S Ahmed wrote:

> I'm trying to map out how Facebook implemented a CF of type Super to index
> message terms.
>
> Is this JSON representation correct?
>
> MessageIndex = {
>
>    userid1 : {
>
>     aloha : { messageIdList: "234,2343234,23423434,234255,345345,2342,532432"},
>     clown : { messageIdList: "632, 2342, 23452, 234234, 234234"},
>     ..
>     ..
>     ..
>    },
>
>    userid2 : {
>
>     eating : { messageIdList: "234,2343234,23423434,234255,345345,2342,532432"},
>     studying : { messageIdList: "632, 2342, 23452, 234234, 234234"},
>     ..
>     ..
>     ..
>
>    }
>
> }
>
> So if a user searches for the term "clown", then you perform a lookup in
> the CF named "MessageIndex", do a lookup for the row of the currently
> logged-in user by UserID (which is the key), and then look for a CF with
> the term "clown" and return the value.
>
> Is this a proper representation and am I using the correct terminology?
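
For illustration only, here is a rough sketch of the one-column-per-message layout and the lookup described above, modeled with plain nested Python dictionaries rather than a real Cassandra client. It assumes the message ID itself is used as the column name (the example above uses names like message1 with the ID as the value); that is an assumption made here so the columns sort by ID and a range can be sliced. The names lookup_term and slice_term are hypothetical helpers, not part of any Cassandra API.

```python
from bisect import bisect_left, bisect_right

# Hypothetical in-memory model, not a Cassandra client:
# row key (user id) -> super column (term) -> columns (message id -> value).
# Assumption: the message id is the column name, so columns sort by id.
message_index = {
    "userid1": {
        "aloha": {2343: "", 9590002: ""},
        "clown": {632: "", 2342: "", 23452: "", 234234: ""},
    },
}


def lookup_term(index, user_id, term):
    """Return the sorted message ids stored under one user's term."""
    columns = index.get(user_id, {}).get(term, {})
    return sorted(columns)


def slice_term(index, user_id, term, start, finish):
    """Return message ids with start <= id <= finish, like a column slice."""
    names = lookup_term(index, user_id, term)
    lo = bisect_left(names, start)
    hi = bisect_right(names, finish)
    return names[lo:hi]


print(lookup_term(message_index, "userid1", "clown"))
print(slice_term(message_index, "userid1", "clown", 1000, 30000))
```

With a single comma-separated messageIdList value, the whole list would have to be read and parsed to answer the second call, which seems to be the motivation for storing each ID as its own column.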