Yes. Think in queries.
Break your normalization habit.
Use roughly one column family (CF) per query.
Denormalize!
Use in-column entity caching
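
To make "one CF per query" concrete, here is a minimal in-memory sketch, not real Cassandra client code. It assumes two read queries (look up a user by username, look up a user by email) and models each column family as a plain Python dict. The names users_by_username, users_by_email, save_user, find_by_username and find_by_email are hypothetical, chosen only for illustration.

```python
from collections import defaultdict

# Assumption: each "column family" is modeled as {row_key: {column_name: value}}.
# A real application would issue these writes through a Cassandra client instead.
users_by_username = defaultdict(dict)  # serves "look up a user by username"
users_by_email = defaultdict(dict)     # serves "look up a user by email"


def save_user(username, email, state):
    """Denormalize: write the same entity into every CF that a read query needs."""
    profile = {"username": username, "email": email, "state": state}
    users_by_username[username].update(profile)
    users_by_email[email].update(profile)


def find_by_username(username):
    """Answer the first query with a single row lookup."""
    return dict(users_by_username.get(username, {}))


def find_by_email(email):
    """Answer the second query with a single row lookup, no join needed."""
    return dict(users_by_email.get(email, {}))


save_user("blake", "blake@example.com", "CA")
print(find_by_email("blake@example.com"))
```

The cost is a duplicate write per query pattern; the benefit is that each read touches exactly one row in one CF.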


On Tue, Feb 28, 2012 at 12:12 AM, aaron morton <aaron@thelastpickle.com> wrote:
A.) store ALL the data associated with the user onto a single users row-key. Some user keys may be small, others may get larger over time depending upon activity.
I would go with this.
The important thing is supporting the read queries. 

Cheers
Aaron

-----------------
Aaron Morton
Freelance Developer
@aaronmorton

On 28/02/2012, at 7:40 PM, Blake Starkenburg wrote:

Using a user/member as an example, I am curious which of the following data models would be the best fit for performance and longevity of data in Cassandra.

Consider the simple staples of user/member details like username, email, address, state, preferences, etc. This is fairly simple: the data is stored under a row key, as in users->username[email], etc.

Now, as time goes on, more data accumulates: snapshot changes such as users->username['change:123456'] = 'changed email', etc. compound as columns onto the user's row key. Perhaps more preferences or login information are added onto the row key as well. I wouldn't expect the number of columns to grow hugely, but I've also learned to plan for the unexpected...

Simplicity would tell me to:

A.) store ALL the data associated with the user onto a single users row-key. Some user keys may be small, others may get larger over time depending upon activity.

But would B be a better-performing model?

B.) Split out user data into separate row keys such as users->changes_username['change123456'] = 'changed email' AND users->preferences_username['fav_color'] = 'blue'. This would add a level of complexity, and in some cases tiny rows, along with multiple fetches to retrieve all user/member data.
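
The read-side difference between A and B can be sketched with a toy in-memory stand-in for a column family that counts row fetches. This is only an illustration under stated assumptions: FakeColumnFamily is a hypothetical helper, not a Cassandra API, and the "pref:" column prefix used for option A is an assumed naming convention that does not appear above.

```python
class FakeColumnFamily:
    """Hypothetical in-memory stand-in for a column family; counts row reads."""

    def __init__(self):
        self.rows = {}   # {row_key: {column_name: value}}
        self.reads = 0   # number of row fetches issued

    def insert(self, row_key, columns):
        self.rows.setdefault(row_key, {}).update(columns)

    def get(self, row_key):
        self.reads += 1
        return dict(self.rows.get(row_key, {}))


# Option A: everything about the user lives on one row key.
option_a = FakeColumnFamily()
option_a.insert("blake", {"email": "blake@example.com"})
option_a.insert("blake", {"change:123456": "changed email"})
option_a.insert("blake", {"pref:fav_color": "blue"})  # "pref:" prefix is assumed
everything_a = option_a.get("blake")

# Option B: the same data split across three row keys.
option_b = FakeColumnFamily()
option_b.insert("blake", {"email": "blake@example.com"})
option_b.insert("changes_blake", {"change123456": "changed email"})
option_b.insert("preferences_blake", {"fav_color": "blue"})
everything_b = {}
for key in ("blake", "changes_blake", "preferences_blake"):
    everything_b.update(option_b.get(key))

print("A fetches:", option_a.reads, "B fetches:", option_b.reads)
```

In this sketch, loading the full user costs one row fetch under A and three under B; B only pays off if most reads need just one of the pieces.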

I'm curious what your opinions are.

Thanks!


-- 
Thanks,

Charlie (Yi) Zhu (一个 木匠)
=======
Data Solution Architect Developer
http://mujiang.blogspot.com