Message-ID: <4C7FD12B.50004@softwareprojects.com>
Date: Thu, 02 Sep 2010 12:30:35 -0400
From: Mike Peters
To: user@cassandra.apache.org
Subject: Re: Looking for something like "like" of mysql.

Cassandra doesn't support ad hoc queries like the one you're describing.

I recommend looking at Lucandra.
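
As a rough illustration of why a search index helps with this kind of lookup, here is a generic inverted index in plain Python. It is only a sketch of the general idea, not Lucandra's API or implementation; the record fields, sample subjects, and file paths are made up. It matches whole words, so matching arbitrary fragments of a subject would need something extra, such as indexing n-grams.

```python
import re
from collections import defaultdict


def tokenize(subject):
    """Lower-case a subject and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", subject.lower())


def build_index(mails):
    """Map each subject word to the set of mail locations that contain it."""
    index = defaultdict(set)
    for mail in mails:
        for word in tokenize(mail["subject"]):
            index[word].add(mail["location"])
    return index


def search(index, query):
    """Return the locations whose subject contains every word of the query."""
    words = tokenize(query)
    if not words:
        return set()
    # Copy the first posting set so the index itself is never mutated.
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical sample records; the paths are placeholders.
mails = [
    {"subject": "Quarterly report draft", "location": "/mail/a1"},
    {"subject": "Re: quarterly numbers", "location": "/mail/b2"},
    {"subject": "Lunch on Friday?", "location": "/mail/c3"},
]
index = build_index(mails)
print(sorted(search(index, "quarterly")))  # ['/mail/a1', '/mail/b2']
```

The index is built once as mail arrives, so a query only touches the entries for the words it contains instead of every stored subject.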

On 9/2/2010 12:27 PM, Anuj Kabra wrote:
> I am working with cassandra-0.6.4 on a mail retrieval problem. We have mail metadata (sender, recipient, timestamp, subject, and the location of the mail file) stored in a Cassandra DB. About 25,000 records will be entered into this DB every day. We have not finalised the data model yet, but we are starting with a simple one that has only one column family:
> <ColumnFamily name="MailMetadata" CompareWith="UTF8Type">
> It has the recipient's user_id as the key, and columns for sender_id, timestamp of the mail, subject, and location of the mail file.
> Our use case is to get the locations of all mail files sent by a user that match a given subject (which can be a part of the mail's original subject). As far as I know, we can get all the rows of a user by using user_id as the key. After that I need to iterate over all the rows I get and check which mails fit the given condition (matching a subject in this case), which is computationally very heavy as we would get thousands of rows.
> So we are looking for something like MySQL's "like", provided by Thrift. I also need to know if I am going the right way.
> Help is much appreciated.
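
The client-side filtering described in the question can be sketched in plain Python as follows. This is a minimal illustration, assuming the rows for one recipient have already been fetched into a list of dictionaries; the field names mirror the columns named in the question, while the function name, sample values, and paths are hypothetical. The Cassandra read itself is deliberately left out.

```python
def find_locations(mails, sender_id, subject_part):
    """Return locations of mails from sender_id whose subject contains
    subject_part, compared case-insensitively."""
    needle = subject_part.lower()
    return [
        mail["location"]
        for mail in mails
        if mail["sender_id"] == sender_id and needle in mail["subject"].lower()
    ]


# Hypothetical rows already fetched for one recipient's user_id.
fetched = [
    {"sender_id": "u17", "timestamp": 1283443200,
     "subject": "Project kickoff notes", "location": "/mail/0001"},
    {"sender_id": "u17", "timestamp": 1283446800,
     "subject": "Re: project budget", "location": "/mail/0002"},
    {"sender_id": "u42", "timestamp": 1283450400,
     "subject": "Project timeline", "location": "/mail/0003"},
]

print(find_locations(fetched, "u17", "project"))  # ['/mail/0001', '/mail/0002']
```

Every query scans all fetched rows, so the cost grows with the size of the mailbox, which is the overhead the question is concerned about.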

