incubator-cassandra-user mailing list archives

From Peter Hsu <pe...@motivecast.com>
Subject Column or SuperColumn
Date Wed, 02 Jun 2010 02:33:22 GMT
I have a pretty simple data modeling question.  In one instance, I don't know whether to use
a ColumnFamily (CF) or a SuperColumnFamily (SCF).

Here's my example.  I have a Store entry and locations for each store.  So I have something
like:

Using CF:
Store { //CF
   storeId { //row key
      storeName:str,
      storeLogo:image
   }
   storeId:locationId1 {
      locationName:str,
      latLong:coordinate
   }
   storeId:locationId2 {
      locationName:str,
      latLong:coordinate
   }
}

Using SCF:
Store { //SCF
   storeId { //row key
      store {
          storeName:str,
          storeLogo:image
      }
      locationId1 {
          locationName:str,
          latLong:coordinate
      }
      locationId2 {
          locationName:str,
          latLong:coordinate
      }
   }
}
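
To make the comparison concrete, the two layouts above can be sketched as plain Python
dicts. This is only an in-memory stand-in, not a Cassandra client; the ids ("store42",
"loc1") and values are hypothetical, and the helper cf_to_scf is an illustrative name.

```python
# In-memory stand-ins for the two layouts; plain dicts, not a Cassandra client.
# All ids and values below are hypothetical.

# CF layout: one row per store plus one row per location, keyed "storeId:locationId".
store_cf = {
    "store42": {"storeName": "Acme", "storeLogo": b"<png bytes>"},
    "store42:loc1": {"locationName": "Downtown", "latLong": (37.77, -122.42)},
    "store42:loc2": {"locationName": "Airport", "latLong": (37.62, -122.38)},
}

# SCF layout: one row per store; each super column holds its own subcolumns.
store_scf = {
    "store42": {
        "store": {"storeName": "Acme", "storeLogo": b"<png bytes>"},
        "loc1": {"locationName": "Downtown", "latLong": (37.77, -122.42)},
        "loc2": {"locationName": "Airport", "latLong": (37.62, -122.38)},
    },
}


def cf_to_scf(cf_rows):
    """Regroup concatenated-key rows into one row per store.

    Assumes store ids contain no ":" and no location is literally named "store".
    """
    scf_rows = {}
    for key, columns in cf_rows.items():
        store_id, sep, location_id = key.partition(":")
        super_column = location_id if sep else "store"
        scf_rows.setdefault(store_id, {})[super_column] = dict(columns)
    return scf_rows
```

Both layouts carry the same information; the difference is whether a location is
addressed by its own row key or by a super column name inside the store's row.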

Queries:

Reads:
 1. Read store and all locations (could be done efficiently with a range query when using CF,
since I'm using OPP, the OrderPreservingPartitioner)
 2. Read only a particular location of a store (don't need the store meta data here)
 3. Read only store name info (don't need any location info here)

Writes:
 1. Update store meta data (without touching location info)
 2. Update location data for a store (without touching rest of store data)
 3. Add a new location to an existing store (the location would have a unique identifier,
so there is no need to do a read first)
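
As a rough illustration of how these reads and writes map onto the concatenated-key CF
layout, here is a minimal Python sketch. It simulates order-preserving key order with a
sorted key list; the function names and sample rows are hypothetical, and nothing here
is a real Cassandra API. One assumption worth flagging: if store ids can share prefixes
(e.g. "store42" and "store420"), the store row and its location rows are not adjacent
in key order, so the sketch fetches the store row by key and scans the location range
separately.

```python
from bisect import bisect_left

# Hypothetical rows keyed as in the CF layout; "store420" shows the prefix problem.
rows = {
    "store42": {"storeName": "Acme"},
    "store420": {"storeName": "Other"},
    "store42:loc1": {"locationName": "Downtown"},
    "store42:loc2": {"locationName": "Airport"},
}


def read_store_with_locations(rows, store_id):
    """Read 1: store row by key, then a range scan over 'storeId:' keys."""
    keys = sorted(rows)  # stands in for the partitioner's key order
    start = bisect_left(keys, store_id + ":")
    end = bisect_left(keys, store_id + ";")  # ';' is the character right after ':'
    result = {store_id: rows[store_id]}
    for key in keys[start:end]:
        result[key] = rows[key]
    return result


def read_location(rows, store_id, location_id):
    """Read 2: a single location is a direct lookup on the concatenated key."""
    return rows[store_id + ":" + location_id]


def read_store_name(rows, store_id):
    """Read 3: store metadata only, no location rows touched."""
    return rows[store_id]["storeName"]


def write_columns(rows, row_key, columns):
    """Writes 1-3: update or create one row without reading any other row."""
    rows.setdefault(row_key, {}).update(columns)


# Write 3: add a new location with a known unique id, no prior read required.
write_columns(rows, "store42:loc3", {"locationName": "Harbor"})
```

In the SCF layout the same operations would target one row key (storeId) and select
super columns by name instead of selecting rows by key range.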

I read that SuperColumns are not as fast as Columns, and obviously you can't have indexed
subcolumns of SuperColumns, but in this case I don't need the subcolumn indices.  It seems
cleaner to model it as a SuperColumn, but why would I want to pay a performance penalty instead
of just concatenating my keys?

This seems like a fairly common pattern.  What's the rule for deciding between CF and SCF?

Thanks,
Peter