incubator-cassandra-user mailing list archives

From Blake Starkenburg <bstarkenb...@gmail.com>
Subject Is this the correct data model thinking?
Date Tue, 28 Feb 2012 06:40:09 GMT
Using a user/member record as an example, I am curious which of the data
models below would be the best fit for performance and longevity of data in
Cassandra.

Consider the simple staples of user/member details like username, email,
address, state, preferences, etc. Storing this is fairly simple: each detail
becomes a column under a row key, e.g. users->username['email'].

Now, as time goes on, more data compounds onto the users row key, such as
snapshot changes like users->username['change:123456'] = 'changed email'.
Perhaps more preferences or login information are added onto the row key as
well. I wouldn't expect the number of columns to grow hugely, but I've also
learned to plan for the unexpected...
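
To make the layout concrete, here is a rough sketch in Python that models the
users row as plain nested dicts. It is only an in-memory stand-in, not a
Cassandra client API; the `insert` helper, the username "bstark", and the
sample values are made up for illustration.

```python
# In-memory stand-in for a column family: row key -> {column name: value}.
# Plain dicts only; this is NOT a Cassandra client API.
users = {}


def insert(cf, row_key, column, value):
    """Set one column on a row, creating the row if it does not exist yet."""
    cf.setdefault(row_key, {})[column] = value


# Static profile columns (username and values are hypothetical).
insert(users, "bstark", "email", "old@example.com")
insert(users, "bstark", "state", "CA")

# Later activity compounds onto the same row key.
insert(users, "bstark", "change:123456", "changed email")
insert(users, "bstark", "fav_color", "blue")

# Every new preference or change adds one more column to the same row.
column_count = len(users["bstark"])
print(column_count)
```

The point of the sketch is only that the row for a given user keeps widening
as change and preference columns are added to it.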

Simplicity would tell me to:

A.) Store ALL the data associated with the user under a single users row
key. Some user rows may be small, others may get larger over time depending
upon activity.

But would B be a better model for performance?

B.) Split out user data into separate row keys such as
users->changes_username['change123456'] = 'changed email' AND
users->preferences_username['fav_color'] = 'blue'. This would add a level of
complexity and, in some cases, tiny rows, along with multiple fetches to
retrieve all user/member data.
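
As a rough way to compare A and B, the sketch below models both layouts with
plain Python dicts and counts how many row reads it takes to assemble all of
one user's data. It says nothing about real Cassandra latency. The
"profile_" prefix, the username "bstark", and both fetch helpers are
assumptions added for illustration; only the "changes_" and "preferences_"
prefixes come from the description above.

```python
# Option A: one row per user holds everything.
model_a = {
    "bstark": {
        "email": "user@example.com",
        "fav_color": "blue",
        "change:123456": "changed email",
    }
}

# Option B: one row per (category, user).
# The "profile_" prefix is an assumed convention, not from the original post.
model_b = {
    "profile_bstark": {"email": "user@example.com"},
    "preferences_bstark": {"fav_color": "blue"},
    "changes_bstark": {"change:123456": "changed email"},
}


def fetch_all_a(username):
    """Return (columns, number of row reads) under option A."""
    return dict(model_a.get(username, {})), 1


def fetch_all_b(username, categories=("profile", "preferences", "changes")):
    """Return (columns, number of row reads) under option B."""
    merged = {}
    for category in categories:
        # One row read per category row key.
        merged.update(model_b.get(f"{category}_{username}", {}))
    return merged, len(categories)


data_a, reads_a = fetch_all_a("bstark")
data_b, reads_b = fetch_all_b("bstark")
print(reads_a, reads_b)
```

Both layouts return the same columns here; the difference is that B needs one
read per category row, while A needs one read but lets a single row grow.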

Curious what your opinions are?

Thanks!
