incubator-cassandra-user mailing list archives

From "Hiller, Dean" <>
Subject Re: 10,000s of column families/keyspaces
Date Mon, 01 Jul 2013 16:24:03 GMT
We use playorm to run 80,000 virtual column families (a playorm feature, though the pattern could
be copied). We found out later, and are working on this now, that we want to map the 80,000
virtual CFs into 10 real CFs so that leveled compaction can run more in parallel; otherwise
we get stuck with single-threaded LCS at the last tier, which can take a while. We are about
to map/reduce our dataset into our newest format.
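
For anyone wanting to copy the pattern, here is a rough sketch of one way to map many virtual CFs onto a few real ones. This is not playorm's actual implementation; the real CF names (data_0 .. data_9), the ":" separator, and the CRC32 bucketing are all assumptions made for illustration. Each virtual CF is hashed to one of the real CFs, and the virtual CF name is prefixed onto the row key so rows from different virtual CFs cannot collide.

```python
import zlib

REAL_CF_COUNT = 10   # from the post: 80,000 virtual CFs mapped into 10 real CFs
SEPARATOR = ":"      # assumption: virtual CF names never contain this character


def real_cf_for(virtual_cf: str, real_cf_count: int = REAL_CF_COUNT) -> str:
    """Pick the real CF that stores a virtual CF, using a stable hash.

    The "data_<n>" naming is hypothetical.
    """
    bucket = zlib.crc32(virtual_cf.encode("utf-8")) % real_cf_count
    return f"data_{bucket}"


def storage_key(virtual_cf: str, row_key: str) -> str:
    """Prefix the row key with the virtual CF name to keep virtual CFs apart."""
    if SEPARATOR in virtual_cf:
        raise ValueError("virtual CF name must not contain the separator")
    return f"{virtual_cf}{SEPARATOR}{row_key}"


def locate(virtual_cf: str, row_key: str) -> tuple[str, str]:
    """Return (real CF name, stored row key) for a virtual CF row."""
    return real_cf_for(virtual_cf), storage_key(virtual_cf, row_key)


if __name__ == "__main__":
    # Hypothetical tenant/table names.
    print(locate("tenant42_orders", "row-1"))
```

Because every virtual CF always hashes to the same real CF, reads and writes agree on the location without a lookup table. The catch is that changing the number of real CFs moves most virtual CFs to a different bucket, so the data has to be rewritten when that number changes.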


From: Kirk True
Date: Monday, July 1, 2013 10:19 AM
Subject: 10,000s of column families/keyspaces

Hi all,

I know it's an old topic, but I want to see if anything's changed on the number of column
families that C* supports, either in 1.2.x or 2.x.

For a number of reasons [1], we'd like to support multi-tenancy via separate column families.
The "problem" is that there are around 5,000 tenants to support, and each one needs a small
handful of column families.

The last I heard C* supports 'a couple of hundred' column families before things start to
bog down.

What will it take for C* to support 50,000 column families?

I'm about to dive into the code and run some tests, but I was curious about how to quantify
the overhead of a column family. Is the reason performance? Memory? Does the off-heap work
help here?


[1] The main three reasons:

 1.  ability to wholesale drop data for a given tenant via drop keyspace/drop CFs
 2.  ability to have divergent schema for each tenant (partially affected by DSE Solr integration)
 3.  secondary indexes per tenant (given requirement #2)
