Hi,

This is true for CQL2, but it doesn't work for CQL3:

cqlsh:c4> SELECT id from some_table WHERE indexed_column='test';
...
cqlsh:c4> SELECT KEY from some_table WHERE indexed_column='test';
Bad Request: Undefined name key in selection clause
Perhaps you meant to use CQL 2? Try using the -2 option when starting cqlsh.

regards,
Ondřej Černoš

On Thu, Apr 25, 2013 at 10:32 AM, <moshe.kranc@barclays.com> wrote:

IMHO: user_name is not a column, it is the row key. Therefore, according to http://thelastpickle.com/2011/07/04/Cassandra-Query-Plans/, the row does not contain a relevant column index, which causes the iterator to read each column (including value) of each row.

 

I believe that instead of referring to user_name as if it were a column, you need to refer to it via the reserved word “KEY”, e.g.:

 

Select KEY from users where status = 2; 

 

Always glad to share a theory with a friend….

 

 

From: Tamar Rosen [mailto:tamar@correlor.com]
Sent: Thursday, April 25, 2013 11:04 AM
To: user@cassandra.apache.org
Subject: Secondary Index on table with a lot of data crashes Cassandra

 

Hi,

 

We have a case of a reproducible crash, probably due to out of memory, but I don't understand why. 

 

The installation is currently single node. 

 

We have a column family with approx 50000 rows. 

 

In CQL, the CF definition is:

 

 
CREATE TABLE users (
  user_name text PRIMARY KEY,
  big_json text,
  status int
);
 
Each big_json can have 500K or more of data.
 
There is also a secondary index on the status column. 
Status can have various values, but over 90% of all rows have status = 2.
 
 
Calling:
 
Select user_name from users limit 80000;
 
is pretty fast.
 
 
 
Calling:
 
Select user_name from users where status = 1; 
is slower, even though much less data is returned.
 
Calling:
 
Select user_name from users where status = 2; 
 
always crashes.
 
 
What are we doing wrong? Can it be that Cassandra is actually trying to read all the CF data rather than just the keys? (Actually, it doesn't need to go to the users CF at all: all the data it needs is in the index CF.)
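
As a rough back-of-envelope check of that suspicion, the figures quoted above can be multiplied out. This is only a sketch: it assumes "500K" means about 500 KB per big_json value, takes exactly 90% of rows as status = 2, and assumes every matching row would be read in full, none of which is confirmed here. The function name is made up for illustration.

```python
# Back-of-envelope estimate using the figures quoted in this message.
# Assumptions (not confirmed): "500K" ~= 500 KB per big_json value,
# exactly 90% of rows have status = 2, and each matching row is read in full.
TOTAL_ROWS = 50_000          # "approx 50000 rows"
STATUS_2_PERCENT = 90        # "over 90% of all rows have status = 2"
BIG_JSON_BYTES = 500_000     # "500K or more" per row


def estimated_read_bytes(total_rows: int, percent: int, bytes_per_row: int) -> int:
    """Bytes touched if every matching row's big_json value is read in full."""
    matching_rows = total_rows * percent // 100
    return matching_rows * bytes_per_row


if __name__ == "__main__":
    total = estimated_read_bytes(TOTAL_ROWS, STATUS_2_PERCENT, BIG_JSON_BYTES)
    print(f"about {total / 1e9:.1f} GB")
```

Under those assumptions the status = 2 query would touch on the order of tens of gigabytes, which would be consistent with an out-of-memory crash on a single node if the values are read rather than just the keys.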
 
 
Also, in the code I am doing the same using an Astyanax index query with pagination, and the behavior is the same.


Please help me:
 
1. solve the immediate issue
 
2. understand if there is something in this use case which indicates that we are not using Cassandra the way it is meant to be used.
 


Thanks,
 


Tamar Rosen
 
Correlor.com
 


 
