Date: Thu, 22 Jul 2010 16:37:22 +0800
Subject: Re: goods search with cassandra
From: Chen Xinli
To: user@cassandra.apache.org
Reply-To: user@cassandra.apache.org

Thanks for your suggestion.

Does it work if I insert through a Thrift client and read through Cassandra directly, as in ClientOnlyExample?

2010/7/21 Santal Li <santal.li@gmail.com>

> I don't think building a ColumnValueFilter is a good idea. What you really
> need is a self-defined index; otherwise the filter will cause too many scans
> and too much disk I/O.
>
> We met almost the same problem in our own webapp: store data under one
> field, then retrieve it by searching on another field. Our solution was to
> create a new keyspace for the index and maintain the index in the
> application according to the query conditions. I suggest you read this
> document to get the basic idea:
> http://code.google.com/intl/zh-CN/appengine/articles/index_building.html
>
> If you use this solution, you may need to consider the following issues:
> 1. Concurrent access from multiple clients.
> 2. The index and the object data may become inconsistent when an error
>    occurs.
>
> Some kind of lock service, such as ZooKeeper, may help.
>
> Regards
> -Santal
>
> 2010/7/19 Chen Xinli <chen.daqi@gmail.com>
>
>> Hi,
>>
>> I want to implement goods search with Cassandra, and there are a few
>> things I am confused about. Can someone help me out?
>>
>> The case is this:
>> There are about 1 million shops, each shop has about 10,000 goods, and
>> each goods item has properties such as "title", "price", etc.
>> A search looks like "give me 10 goods in a specific shop whose price is
>> less than $10".
>>
>> For the data model, I use the shop name as the key and the goods id as the
>> column name; "title" and "price" are specially encoded into the column
>> value.
>> There are too many goods in one shop, so filtering the data in the Thrift
>> client is not feasible because of the network transfer cost.
>> I want to implement a special ColumnValueFilter that extends QueryFilter
>> to get the result locally.
>> Is this the best way?
>>
>> Insertion of goods runs at about 100/second for the whole cluster, so a
>> Thrift client for insertion is OK.
>> For reads, latency and QPS are important, and I must provide an HTTP
>> service for user searches.
>> Embedding a Thrift client in such a service would add another network
>> hop, so I want to build the service directly on top of Cassandra.
>> I reviewed the code of ClientOnlyExample.java.
>> What confuses me is this: if I insert through a Thrift client and read by
>> using Cassandra directly, is data consistency guaranteed, and if so, how?
>>
>> Any help is appreciated. Thanks!
>>
>> --
>> Best Regards,
>> Chen Xinli

--
Best Regards,
Chen Xinli
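
Editorial sketch of the application-maintained index suggested in the thread. This is only an illustration under stated assumptions, not code from either poster and not the Cassandra or Thrift API: it models one index row per shop, with entries kept sorted by (price, goods id), as an in-memory Python structure. The names PriceIndex, put and cheaper_than are hypothetical. It shows why a sorted index turns "10 goods in a shop with price below a bound" into a short slice instead of a scan over every column in the shop's row. It does not address the concurrency and consistency issues raised above.

```python
import bisect
from collections import defaultdict


class PriceIndex:
    """In-memory stand-in for a per-shop index row sorted by (price, goods_id).

    Hypothetical illustration only; a real deployment would keep the index
    in a separate keyspace and update it from the application.
    """

    def __init__(self):
        # shop name -> sorted list of (price, goods_id) index entries
        self._rows = defaultdict(list)
        # (shop name, goods_id) -> (title, price), the "object data"
        self._goods = {}

    def put(self, shop, goods_id, title, price):
        """Insert or update a goods item and keep the index row in sync."""
        row = self._rows[shop]
        old = self._goods.get((shop, goods_id))
        if old is not None:
            # Remove the stale index entry left by the previous price.
            stale = (old[1], goods_id)
            i = bisect.bisect_left(row, stale)
            if i < len(row) and row[i] == stale:
                del row[i]
        bisect.insort(row, (price, goods_id))
        self._goods[(shop, goods_id)] = (title, price)

    def cheaper_than(self, shop, max_price, limit=10):
        """Return up to `limit` items priced strictly below `max_price`,
        cheapest first, as (goods_id, title, price) tuples."""
        row = self._rows.get(shop, [])
        # (max_price,) sorts before every (max_price, goods_id) entry, so
        # this is the position of the first entry with price >= max_price.
        end = bisect.bisect_left(row, (max_price,))
        return [
            (goods_id, self._goods[(shop, goods_id)][0], price)
            for price, goods_id in row[:min(end, limit)]
        ]


if __name__ == "__main__":
    index = PriceIndex()
    index.put("shopA", "g1", "pen", 2.5)
    index.put("shopA", "g2", "bag", 25.0)
    index.put("shopA", "g3", "cup", 9.99)
    print(index.cheaper_than("shopA", 10))
```

The update path in put is where the two issues from the thread show up: the index entry and the object data are written in separate steps, so a failure or a concurrent writer between them can leave the two out of sync.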