incubator-cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: Two column families or One super column family?
Date Tue, 29 Mar 2011 22:14:53 GMT
I would go with the solution that means you only have to make one request to serve your reads,
so consider the super CF approach. 

There are some downsides to super columns (see http://wiki.apache.org/cassandra/CassandraLimitations),
and they tend to have a love-them-hate-them reputation.

One thing to consider is that you do not need to model every attribute of your entity as a
column in Cassandra, especially if you are always going to pull back all the attributes. So
you could get the effect of your super CF approach with a standard CF: just pack each entry's
attributes into some sort of structure such as JSON and store it as a blob in a single column.
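
A minimal sketch of the blob idea, assuming a plain Python dict as a stand-in for a standard CF (row key -> column name -> value). The helper names add_entry and read_sector are hypothetical and no real Cassandra client is involved; the point is only that one row read brings back every entry of a sector.

```python
import json

# In-memory stand-in for a standard column family (an assumption for
# illustration): row key -> {column name -> column value (a string blob)}.
sectors = {}

def add_entry(cf, sector, entry_id, text, user):
    """Pack all attributes of one entry into a single JSON column value."""
    row = cf.setdefault(sector, {})
    row[entry_id] = json.dumps({"text": text, "user": user})

def read_sector(cf, sector):
    """Read one row and unpack each JSON blob on the client side."""
    row = cf.get(sector, {})
    return {entry_id: json.loads(blob) for entry_id, blob in row.items()}

add_entry(sectors, "sector1", "uid1", "a text", "joop")
add_entry(sectors, "sector1", "uid2", "more text", "piet")
entries = read_sector(sectors, "sector1")
```

The trade-off is that changing a single attribute means rewriting the whole blob for that entry.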

Or you can use a naming scheme in the column names with a standard CF, e.g. uuid1.text and
uuid2.text, so each attribute stays in its own column.
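
A rough sketch of the naming-scheme variant, again using a plain dict as a stand-in for one row of a standard CF. The helpers to_columns and from_columns are hypothetical, and the sketch assumes attribute names never contain a dot.

```python
from collections import defaultdict

def to_columns(entry_id, attributes):
    """Flatten one entry into columns named '<entry_id>.<attribute>'."""
    return {f"{entry_id}.{name}": value for name, value in attributes.items()}

def from_columns(row):
    """Group the flat columns of one row back into entries by id prefix."""
    entries = defaultdict(dict)
    for column_name, value in row.items():
        # Split on the last dot; assumes attribute names contain no dot.
        entry_id, _, attribute = column_name.rpartition(".")
        entries[entry_id][attribute] = value
    return dict(entries)

# One row (e.g. the row for sector1) holding the columns of two entries.
row = {}
row.update(to_columns("uid1", {"text": "a text", "user": "joop"}))
row.update(to_columns("uid2", {"text": "more text", "user": "piet"}))
grouped = from_columns(row)
```

Compared with the JSON blob, each attribute can be written on its own, at the cost of more columns per row.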

Hope that helps. 
Aaron

On 30 Mar 2011, at 01:05, T Akhayo wrote:

> Good afternoon,
> 
> I'm making my data model from scratch for Cassandra, which means I can tune and fine-tune
> it for performance.
> 
> At this time I'm having trouble choosing between 2 column families and 1 super column
> family. I will illustrate with an example.
> 
> Sector: this defines a place and has one or two properties.
> Entry: an entry that is bound to a sector; this is simply some text and a few properties.
> 
> I can model this with a super column family:
> 
> sectors{ // super column family
>     sector1{
>         uid1{
>             text: a text
>             user: joop
>         }
>         uid2{
>             text: more text
>             user: piet
>         }
>     }
>     sector2{
>         uid10{
>             text: even more text
>             user: marie
>         }
>     }
> }
> 
> But I can also model this with 2 column families:
> 
> sectors{ // column family
>     sector1{
>         textid1: null
>         textid2: null
>     }
>     sector2{
>         textid4: null
>     }
> }
> 
> texts{ // column family
>     textid1{
>         text: a text
>         user: joop
>     }
>     textid2{
>         text: more text
>         user: piet
>     }
> }
> 
> With the super column family I can retrieve a list of texts for a specific sector with
> only 1 request to Cassandra.
> 
> With the 2 column families I need to send 2 requests to Cassandra:
> 1. give me all textids from sector x. (returns x, y, z)
> 2. give me all texts that have id x, y, z.
> 
> In my final application it is likely that there will be slightly more writes than
> reads.
> 
> I was wondering what the best approach is when it comes to performance. I suspect that
> using super column families is slower than using column families, but is it still slower
> than using 2 column families with 2 requests to Cassandra instead of 1 (with a super column
> family)?
> 
> Kind regards,
> T. Akhayo

