incubator-cassandra-user mailing list archives

From Nick Bailey <>
Subject Re: Creating column families per client
Date Wed, 21 Dec 2011 16:50:16 GMT
The overhead for column families was greatly reduced in 0.8 and 1.0.
It should now be possible to have hundreds or thousands of column
families. The 'memtable_total_space_in_mb' setting was introduced to
provide a global memtable threshold, and Cassandra will handle
flushing on its own.


Another thing you should consider is the lack of built-in access
controls. There is an authentication/authorization interface you can
plug in to, and there are examples in the examples/ directory of the source.

On Wed, Dec 21, 2011 at 10:36 AM, Ryan Lowe <> wrote:
> What we have done to avoid creating multiple column families is to sort of
> namespace the row key.  So if we have a column family of Users and accounts:
> "AccountA" and "AccountB", we do the following:
> Column Family User:
>    "AccountA/ryan" : { first: Ryan, last: Lowe }
>    "AccountB/ryan" : { first: Ryan, last: Smith }
> etc.
> For our needs, this did the same thing as having 2 "User" column families
> for "AccountA" and "AccountB"
> Ryan
> On Wed, Dec 21, 2011 at 10:34 AM, Flavio Baronti <>
> wrote:
>> Hi,
>> based on my experience with Cassandra 0.7.4, I strongly discourage you from
>> doing that: we tried dynamic creation of column families, and it was a
>> nightmare.
>> First of all, the operation cannot be done concurrently, therefore you
>> must find a way to avoid parallel creation (across the whole cluster, not
>> just on a single node).
>> The main problem however is with timestamps. The structure of your
>> keyspace is versioned with a time-dependent id, which is assigned by the
>> host where you perform the schema update based on the local machine time. If
>> you do two updates in close succession on two different nodes, and their
>> clocks are not perfectly synchronized (and they will never be), Cassandra
>> might be confused by their relative ordering, and stop working altogether.
>> Bottom line: don't.
>> Flavio
>> On 12/21/2011 14:45 PM, Rafael Almeida wrote:
>>> Hello,
>>> I am evaluating the usage of Cassandra for my system. I will have several
>>> clients who won't share data with each other. My idea is to create one
>>> column family per client. When a new client comes in and adds data to the
>>> system, I'd like to create a column family dynamically. Is that reliable?
>>> Can I create a column family on a node and immediately add new data to that
>>> column family and be confident that the data added will eventually become
>>> visible to a read?
>>> []'s
>>> Rafael
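
The row-key namespacing that Ryan describes in the quoted thread can be sketched roughly as follows. This is only an illustration, not code from the thread: the helper names make_row_key and split_row_key are hypothetical, the "/" separator is taken from his "AccountA/ryan" example, and a plain Python dict stands in for the single "User" column family so that the sketch runs without a Cassandra client.

```python
from typing import Tuple

# Separator taken from the "AccountA/ryan" example in the thread.
SEPARATOR = "/"


def make_row_key(account: str, user: str) -> str:
    """Build a namespaced row key such as "AccountA/ryan" (hypothetical helper)."""
    # Reject separators in the account name so the key can be split unambiguously.
    if SEPARATOR in account:
        raise ValueError("account name must not contain %r" % SEPARATOR)
    return account + SEPARATOR + user


def split_row_key(row_key: str) -> Tuple[str, str]:
    """Recover (account, user) from a namespaced row key (hypothetical helper)."""
    # Split on the first separator only; the user part may contain more of them.
    account, user = row_key.split(SEPARATOR, 1)
    return account, user


# A plain dict stands in for the single shared "User" column family.
users = {}
users[make_row_key("AccountA", "ryan")] = {"first": "Ryan", "last": "Lowe"}
users[make_row_key("AccountB", "ryan")] = {"first": "Ryan", "last": "Smith"}

# The same user name under two accounts maps to two distinct rows.
for key in sorted(users):
    print(key, users[key])
```

Because every key carries its account prefix, one column family can hold rows for many clients without creating schema at runtime. Note that this only separates the data by naming convention; it does not by itself restrict which client can read which rows.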
