incubator-cassandra-user mailing list archives

From aaron morton <>
Subject Re: Cassandra adding 500K + Super Column Family
Date Tue, 16 Aug 2011 23:20:03 GMT
Are you planning to create 500,000 Super Column Families, or 500,000 rows in a single Super
Column Family?

The former is somewhat crazy. Cassandra schemas typically have up to a few tens of Column
Families. Each column family involves a certain amount of memory overhead; this is now automatically
managed in Cassandra 0.8 (see

If I understand correctly, you have 500K entities with 6K columns each. A simple first approach
to modelling this would be to use a Standard CF with a row for each entity. However, the best
model is the one that serves your read requests best.
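
To make the shape of that model concrete, here is a rough sketch in Python. It is only a toy in-memory picture of one Standard CF with a row per entity, not the Cassandra client API. The names `entities_cf`, `insert`, `get_slice`, and the `entity-`/`value-` key formats are made up for illustration.

```python
from collections import defaultdict

# Toy model only: one Standard CF as
#   row key (entity id) -> {column name -> value}
# This is NOT the Cassandra client API; all names here are hypothetical.
entities_cf = defaultdict(dict)

def insert(cf, row_key, column, value):
    """Store one column value in the row for row_key."""
    cf[row_key][column] = value

def get_slice(cf, row_key, start, finish):
    """Return the row's columns with start <= name <= finish, in name order."""
    row = cf.get(row_key, {})
    return [(name, row[name]) for name in sorted(row) if start <= name <= finish]

# One hypothetical entity with a handful of its (up to 6000) values.
for i in range(5):
    insert(entities_cf, "entity-000001", f"value-{i:04d}", i * 10)

sample = get_slice(entities_cf, "entity-000001", "value-0001", "value-0003")
```

With this layout, 500K entities means 500K rows in a single column family rather than 500K column families, and a read for one entity touches one row.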

Also, for background, the sub columns in a super column are not indexed, see
. You would probably run into this problem if you had 6000 sub columns in a super column.
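
As a rough illustration of why missing sub column indexes hurt, the Python sketch below models a read of a single sub column. It assumes, as a simplification, that without an index every sub column in the super column has to be deserialized before one can be picked out. The function `read_sub_column` and the `sub-` naming are hypothetical, and this is not Cassandra code.

```python
def read_sub_column(serialized_super_column, wanted):
    """Return (value, number of sub columns deserialized) for one lookup.

    Toy model, assumption: there is no index over sub columns, so the
    whole super column is materialised before the lookup can happen.
    """
    deserialized = dict(serialized_super_column)
    return deserialized.get(wanted), len(deserialized)

# A hypothetical super column holding 6000 sub columns.
super_column = [(f"sub-{i:04d}", i) for i in range(6000)]

value, touched = read_sub_column(super_column, "sub-0042")
```

In this toy model the read of one sub column touches all 6000, which is the kind of cost that grows with the size of the super column rather than with the size of the request.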

Hope that helps. 

Aaron Morton
Freelance Cassandra Developer

On 17/08/2011, at 12:53 AM, Renato Bacelar da Silveira wrote:

> I am wondering about a certain volume situation.
> I currently load a Keyspace with a certain amount of SCFs.
> Each SCF (Super Column Family) represents an entity.
> Each Entity may have up to 6000 values.
> I am planning to have 500,000 Entities (SCF) with
> 6000 Columns (within Super Columns - number of Super Columns
> unknown), and was wondering how much resources something
> like this would require?
> I am struggling to have 10,000 SCF with 30 Columns (within SuperColumns),
> I get very large files, and reach a 4Gb heapspace limit very quickly on
> a single node. I use Garbage Collection where needed.
> Is there some secret to load 500,000 Super Column Families?
> Regards.
> -- 
> Renato da Silveira
> Senior Developer
