incubator-cassandra-user mailing list archives

From Dave Martin <moyesys...@googlemail.com>
Subject Re: OutOfMemory on count on cassandra 0.6.8 for large number of columns
Date Sun, 12 Dec 2010 08:07:27 GMT
Thanks Tyler. I was unaware of counters.

The use case for column counts is really an operational one: it lets a
sysadmin run ad hoc checks on a row's columns to see whether something
has gone wrong in software outside of Cassandra.

I think it is not ideal that running a cassandra-cli command such as
count can make Cassandra fall over,
unless we can say that for X columns Cassandra needs at least Y
memory allocated to remain stable.

Cheers

Dave


On Sun, Dec 12, 2010 at 6:39 PM, Tyler Hobbs <tyler@riptano.com> wrote:
> Cassandra has to deserialize all of the columns in the row for get_count().
> So from Cassandra's perspective, it's almost as much work as getting the
> entire row, it just doesn't have to send everything back over the network.
>
> If you're frequently counting 8 million columns (or really, anything
> significant), you need to use counters instead.  If this is a rare
> occurrence, you can do the count in multiple chunks by using a starting and
> ending column in the SlicePredicate for each chunk, but this requires some
> rough knowledge about the distribution of the column names in the row.
>
> - Tyler
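
The chunked count Tyler describes could be sketched roughly as below. This is plain Python against an in-memory sorted list, not a real Thrift client: count_slice is a hypothetical stand-in for get_count() with a SlicePredicate, and it assumes the start and end columns of a slice are both inclusive and that an empty string means "unbounded". The boundary column names are assumptions that would have to come from rough knowledge of how column names are distributed in the row.

```python
from bisect import bisect_left, bisect_right


def count_slice(sorted_columns, start, finish):
    """Hypothetical stand-in for get_count() over one slice.

    Assumes both ends are inclusive and "" means unbounded on that side.
    """
    lo = 0 if start == "" else bisect_left(sorted_columns, start)
    hi = len(sorted_columns) if finish == "" else bisect_right(sorted_columns, finish)
    return max(hi - lo, 0)


def chunked_count(sorted_columns, boundaries):
    """Count all columns in a row one chunk at a time.

    Each chunk covers the columns between two neighbouring boundaries, so
    only a fraction of the row is examined per call.
    """
    edges = [""] + sorted(set(boundaries)) + [""]
    total = 0
    for start, finish in zip(edges, edges[1:]):
        total += count_slice(sorted_columns, start, finish)
        if finish != "":
            # An interior boundary is included in this chunk and in the
            # next one, so remove one copy if that column actually exists.
            total -= count_slice(sorted_columns, finish, finish)
    return total


# Illustrative row with zero-padded column names so they sort predictably.
columns = [f"col{i:05d}" for i in range(1000)]
boundaries = ["col00250", "col00500", "col00750"]
total = chunked_count(columns, boundaries)
```

With evenly spread boundaries each call touches about a quarter of the row here. If the boundaries are badly chosen, one chunk can still contain most of the columns, which is why the approach needs some idea of the column name distribution.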
