The secondary index is being scanned because you put the indexed column in your query. The behavior you are looking for is an upcoming feature called Global Indexes, slated for 3.0: https://issues.apache.org/jira/browse/CASSANDRA-6477

In the meantime, you could build your own lookup table, even with cardinality this low. If the point is to find everyone of a certain gender in a company, give this a try:

create table company_gender (
   company_id uuid,
   gender text,
   person_id uuid,
   -- person_id is a clustering column so each company/gender pair can hold many people
   PRIMARY KEY (company_id, gender, person_id)
);

Each company would be a partition, and you could find all males or all females with a single query. As a bonus, you would get paging, which will be much more efficient.
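
As a rough sketch of how that lookup table might be used: this assumes person_id is part of the primary key as a clustering column (otherwise each company could hold only one person per gender), that the application writes to both tables itself, and the uuids are made-up placeholders.

```cql
-- Written by the application alongside the insert into the company table.
INSERT INTO company_gender (company_id, gender, person_id)
VALUES (123e4567-e89b-12d3-a456-426655440000, 'male',
        123e4567-e89b-12d3-a456-426655440001);

-- Everyone of one gender in one company: a slice of a single partition,
-- with no secondary index involved.
SELECT person_id FROM company_gender
WHERE company_id = 123e4567-e89b-12d3-a456-426655440000
  AND gender = 'male';
```

The person_id values returned can then be used to fetch full rows from the company table by primary key.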


On Fri, Mar 6, 2015 at 2:56 PM, Jimmy Lin <y2klyf+work@gmail.com> wrote:
I ran into an RPC timeout exception when executing a query that involves a secondary index on a Boolean column when, for example, the company has more than 1k persons.

select * from company where company_id=xxxx and isMale = true;

Such an extremely low-cardinality secondary index, as the docs state, basically results in 2 large rows, one per value. However, I thought that since I also bound the query with my primary partition key, wouldn't that be consulted first and then narrow down the result, making it efficient?

Also, if I simply do
select * from company where company_id=xxxx ;
(without the AND clause on the secondary index, it returns right away)

Or maybe the Cassandra server internally always processes the secondary index result first?


I have a simple table

create table company (
company_id uuid,
person_id uuid,
isMale Boolean,
PRIMARY KEY (company_id, person_id)
);