incubator-cassandra-user mailing list archives

From "Hiller, Dean" <Dean.Hil...@nrel.gov>
Subject Re: Data Model - Additional Column Families or one CF?
Date Tue, 26 Feb 2013 15:33:20 GMT
Oh, and 50 CFs should be fine.

Dean

From: Javier Sotelo <javier.a.sotelo@gmail.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Tuesday, February 26, 2013 12:27 AM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: Data Model - Additional Column Families or one CF?

Aaron,

Would 50 CFs be pushing it? According to http://www.datastax.com/dev/blog/whats-new-in-cassandra-1-0-improved-memory-and-disk-space-management,
"This has been tested to work across hundreds or even thousands of ColumnFamilies."

What is the bottleneck, IO?

Thanks,
Javier


On Sun, Feb 24, 2013 at 5:51 PM, Adam Venturella <aventurella@gmail.com> wrote:

Thanks Aaron, this was a big help!




On Thu, Feb 21, 2013 at 9:27 AM, aaron morton <aaron@thelastpickle.com> wrote:

If you have a limited, known number of types (say < 30), I would create a CF for each
of them.

If the number of types is unknown or very large, I would have one CF with the row key you described.

Generally I avoid data models that require new CFs as the data grows. Additionally, having
different CFs allows you to use different cache settings, compaction settings, and even storage
media.
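
As a rough illustration of the rule of thumb above, here is a minimal Python sketch that chooses
between the two layouts. Everything in it is an assumption made for illustration: the function name
locate, the shared CF name "messages", and the max_cfs threshold (taken from the "say < 30" figure)
are hypothetical, and nothing here talks to Cassandra.

```python
def locate(record_type, record_id, known_types=None, max_cfs=30):
    """Return (column_family_name, row_key) for a record.

    Hypothetical helper: with a small, known set of types each type gets
    its own CF; otherwise everything shares one CF and the type becomes a
    row key prefix.
    """
    if known_types is not None and len(known_types) < max_cfs:
        if record_type not in known_types:
            raise ValueError(f"unknown type: {record_type}")
        # One CF per type: the CF name carries the type, the key stays plain.
        return record_type, record_id
    # Unknown or very large number of types: single CF, type-prefixed key.
    return "messages", f"{record_type}:{record_id}"
```

With known_types={"comments", "messages"} this routes a comment to its own "comments" CF; with no
known set it falls back to the shared CF and a prefixed key such as comments:42.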

Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 21/02/2013, at 7:43 AM, Adam Venturella <aventurella@gmail.com> wrote:

I only need to store JSON, and I can handle this in 1 column family by prefixing
row keys with a type, for example:

comments:{message_id}

Here comments: is the prefix and {message_id} is the row key of a message
object in the same column family.

In this case comments:{message_id} would be a wide row using comment creation time and descending
clustering order to sort the messages as they are added.
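
To make the layout concrete, here is a minimal in-memory Python sketch of the prefixed wide row.
It is only a model of the idea, not Cassandra code: the column_family dict, row_key, add_comment,
and comments_newest_first are hypothetical names, and creation times are assumed to be plain
numbers.

```python
from bisect import insort

# Hypothetical stand-in for one column family:
# row key -> list of (negated creation time, JSON string), kept sorted.
column_family = {}

def row_key(prefix, message_id):
    """Build a type-prefixed row key such as 'comments:42'."""
    return f"{prefix}:{message_id}"

def add_comment(message_id, created_at, body_json):
    """Insert one comment column into the wide row for message_id."""
    row = column_family.setdefault(row_key("comments", message_id), [])
    # Storing the negated timestamp keeps the row in descending time order.
    insort(row, (-created_at, body_json))

def comments_newest_first(message_id):
    """Return (created_at, JSON string) pairs, newest first."""
    row = column_family.get(row_key("comments", message_id), [])
    return [(-neg_ts, body) for neg_ts, body in row]
```

Each comment lands in the row comments:{message_id} and is read back newest first, regardless of
the order in which the comments were inserted.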

My question is: would I be better off splitting comments into their own Column Family, or is
storing them in with the Messages Column Family sufficient? They are all messages, after all.

Or do Column Families really just provide a nice organizational front for data? I'm just storing
JSON.





