cassandra-user mailing list archives

From "Jack Krupansky" <j...@basetechnology.com>
Subject Re: select many rows one time or select many times?
Date Thu, 31 Jul 2014 23:58:50 GMT
This doesn’t seem like a reasonable use case for Cassandra. I mean, it’s not a typical
“database” use case.

-- Jack Krupansky

From: Philo Yang 
Sent: Thursday, July 31, 2014 1:44 PM
To: user@cassandra.apache.org 
Subject: select many rows one time or select many times?

Hi all, 

I have a cluster running Cassandra 2.0.6, and one of my tables looks like this:
CREATE TABLE word (
  user text,
  word text,
  flag double,
  PRIMARY KEY (user, word)
)

Each "user" has about 10,000 "word" rows per node. I need to select all rows where
user='someuser' and word is in a large set of about 1,000 values.

The C* documentation recommends against using "select ... in", as in:

select * from word where user='someuser' and word in ('a','b','aa','ab',...)

So for now I select all rows where user='someuser' and filter them on the client rather than in
C*. I use the DataStax Java Driver to page the result set with setFetchSize(1000). Is
this the best way? I found that the system load is high because of the large range query. Should I
switch to selecting one row at a time and issue 1,000 selects?

Like this:
select * from word where user='someuser' and word = 'a';
select * from word where user='someuser' and word = 'b';
select * from word where user='someuser' and word = 'c';

.....
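
The one-row-at-a-time alternative can be sketched without a driver as a single parameterized statement plus one set of bind values per word. This is only an illustration: `SELECT_ONE` and `point_lookups` are hypothetical names, and how the pairs get executed (for example, prepared once and bound per word) is left to whatever driver is in use.

```python
from typing import Iterable, Iterator, Tuple

# One statement text, reused for every word; only the bind values change.
SELECT_ONE = "SELECT * FROM word WHERE user = ? AND word = ?"


def point_lookups(user: str, words: Iterable[str]) -> Iterator[Tuple[str, Tuple[str, str]]]:
    """Yield (statement, bind values) for each distinct word, in input order."""
    seen = set()
    for word in words:
        if word in seen:
            continue  # skip duplicates so no row is requested twice
        seen.add(word)
        yield SELECT_ONE, (user, word)


if __name__ == "__main__":
    for statement, params in point_lookups("someuser", ["a", "b", "c", "a"]):
        print(statement, params)
```

Each pair addresses exactly one row by its full primary key (user, word), so 1,000 words mean 1,000 small requests rather than one large read.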

Which method puts less pressure on the Cassandra cluster?
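
For comparison, the approach currently in use (read every row for the user, then keep only the wanted words on the client) amounts to a set-membership filter over the paged rows. The sketch below is a minimal stand-in, assuming rows arrive as simple tuples; `WordRow` and `filter_rows` are hypothetical names, and the list in the demo replaces the driver's paged result set.

```python
from typing import Iterable, Iterator, NamedTuple


class WordRow(NamedTuple):
    """Mirrors the columns of the word table."""
    user: str
    word: str
    flag: float


def filter_rows(rows: Iterable[WordRow], wanted: Iterable[str]) -> Iterator[WordRow]:
    """Keep only rows whose word is in the wanted set, preserving row order."""
    wanted_set = frozenset(wanted)  # constant-time membership checks
    for row in rows:
        if row.word in wanted_set:
            yield row


if __name__ == "__main__":
    # Stand-in for the rows returned by: select * from word where user='someuser'
    partition = [WordRow("someuser", w, 1.0) for w in ("a", "aa", "ab", "b", "c")]
    kept = list(filter_rows(partition, ["a", "b", "zz"]))
    print([row.word for row in kept])
```

With about 10,000 rows per user and about 1,000 wanted words, this reads roughly ten times as many rows as it keeps, which is the cost being weighed against issuing 1,000 separate selects.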

Thanks, 
Philo Yang
