incubator-cassandra-user mailing list archives

From Ondřej Černoš <cern...@gmail.com>
Subject Re: Secondary Index on table with a lot of data crashes Cassandra
Date Thu, 25 Apr 2013 08:27:50 GMT
Hi,

if you are able to reproduce the issue, file a ticket at
https://issues.apache.org/jira/browse/CASSANDRA. In my experience,
developers respond quickly to issues that are clearly bugs.

regards,

ondrej cernos


On Thu, Apr 25, 2013 at 10:03 AM, Tamar Rosen <tamar@correlor.com> wrote:

> Hi,
>
> We have a case of a reproducible crash, probably due to out of memory,
> but I don't understand why.
>
> The installation is currently single node.
>
> We have a column family with approx 50000 rows.
>
> In cql, the CF definition is:
>
> CREATE TABLE users (
>   user_name text PRIMARY KEY,
>   big_json text,
>   status int);
>
> Each big_json can have 500K or more of data.
>
>
>  There is also a secondary index on the status column.
>
>  Status can have various values, over 90% of all rows have status = 2.
>
>
> Calling:
>
> Select user_name from users limit 80000;
>
> is pretty fast.
>
>
> Calling:
>
> Select user_name from users where status = 1;
>
> is slower, even though much less data is returned.
>
>
>  Calling:
>
>  Select user_name from users where status = 2;
>
> Always crashes.
>
>
> What are we doing wrong? Can it be that Cassandra is actually trying to
> read all the CF data rather than just the keys? (Actually, it doesn't need
> to go to the users CF at all - all the data it needs is in the index CF.)
>
>  Also, in the code I am doing the same using Astyanax index query with
> pagination, and the behavior is the same.
>
>
> Please help me:
>
> 1. solve the immediate issue
>
> 2. understand whether something in this use case indicates that we are
> not using Cassandra the way it is meant to be used.
>
>
> Thanks,
>
>
> Tamar Rosen
>
> Correlor.com
>
>
>
>
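A rough back-of-the-envelope check of the out-of-memory suspicion in the quoted message, using only the figures given there (about 50000 rows, roughly 500K per big_json, about 90% of rows with status = 2). It assumes "500K" means 500 KiB and that every matching row is read in full; that second assumption is the poster's hypothesis, not confirmed behavior. The names below are illustrative only.

```python
# Estimate of the data touched by "where status = 2" IF each matching
# row were read in full (the poster's hypothesis, not confirmed behavior).
ROWS = 50_000                  # approx. rows in the users column family
BIG_JSON_BYTES = 500 * 1024    # "500K or more" per row; assumes K = KiB
STATUS_2_PERCENT = 90          # "over 90%" of rows have status = 2


def matching_row_count(rows: int, percent: int) -> int:
    """Number of rows matching the index value, using integer arithmetic."""
    return rows * percent // 100


def full_read_bytes(rows: int, percent: int, bytes_per_row: int) -> int:
    """Bytes read if every matching row is loaded with its big_json column."""
    return matching_row_count(rows, percent) * bytes_per_row


matching_rows = matching_row_count(ROWS, STATUS_2_PERCENT)
full_read = full_read_bytes(ROWS, STATUS_2_PERCENT, BIG_JSON_BYTES)

print(f"matching rows: {matching_rows}")
print(f"bytes if rows are read in full: {full_read}")
print(f"approx. GiB: {full_read / 1024**3:.1f}")
```

Under these assumptions the status = 2 query would touch on the order of 20 GiB, which would be consistent with a crash if rows are materialized whole, whereas the key-only data for the same rows would be tiny by comparison.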
