cassandra-user mailing list archives

From Raj <rajkumar....@gmail.com>
Subject Is this a good schema design to implement a social application..
Date Fri, 07 Jan 2011 16:28:56 GMT
My question is in the context of a social network schema design.

I am thinking of the following schema for storing the data a user
needs when he logs in and is taken to his homepage.
(I aimed for a schema design in which a single row read retrieves
all the data required to render that user's homepage.)

UserSuperColumnFamily: {    // Column Family

UserIDKey:
{columns:            MyName, MyEmail, MyCity,...etc
 supercolumns:    MyFollowersList, MyFollowiesList, MyPostsIdKeysList,
MyInterestsList, MyAlbumsIdKeysList, MyVideoIdKeysList,
RecentNotificationsForUserList,  MessagesReceivedList,
MessagesSentList, AccountSettingsList, RecentSelfActivityList,
UpdatesFromFollowiesList
}
}

Thus the user's news feed would be generated from the supercolumn
UpdatesFromFollowiesList. But UpdatesFromFollowiesList would
obviously contain only the IDs of the posts, not the entire post data.
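
To make the intended layout concrete, here is a rough in-memory sketch in Python. Plain dicts stand in for the row, its columns and its supercolumns; this is not Cassandra client code. The helper names (`make_user_row`, `add_feed_entry`, `newest_feed_post_ids`) and the use of a timestamp as the column name inside UpdatesFromFollowiesList are assumptions made only for illustration.

```python
# Illustrative model of one row of UserSuperColumnFamily.
# Only a few of the supercolumns from the schema are shown.

def make_user_row(name, email, city):
    """Build an empty user row: plain columns plus some supercolumns."""
    return {
        "columns": {"MyName": name, "MyEmail": email, "MyCity": city},
        "supercolumns": {
            "MyFollowersList": {},
            "MyFollowiesList": {},
            "MyPostsIdKeysList": {},
            # Assumed layout: timestamp -> PostIdKey (IDs only, no post body).
            "UpdatesFromFollowiesList": {},
        },
    }


def add_feed_entry(row, timestamp, post_id_key):
    """Record that a followed user published post_id_key at timestamp."""
    row["supercolumns"]["UpdatesFromFollowiesList"][timestamp] = post_id_key


def newest_feed_post_ids(row, limit):
    """Return up to `limit` PostIdKeys from the feed, newest first."""
    feed = row["supercolumns"]["UpdatesFromFollowiesList"]
    return [feed[ts] for ts in sorted(feed, reverse=True)[:limit]]
```

Reading the whole row gives the profile columns and the feed IDs in one step; the post bodies still need a second lookup by PostIdKey.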

Questions:

1.) What could be the problems with this design? Any improvements?

2.) Would frequent and heavy overwrites / row mutations (for
example, when propagating post updates for the news feed from a
user to all of his followers), which lead to a row ultimately being
spread across several SSTables, degrade read performance? Is it
suitable to use the row cache here (the row is big, but all of its
data is required for as long as the user is logged in)? If I do not
use the cache, it may be very expensive to pull the row each time
some data is required for the given user, since the row would be in
several SSTables. How can I improve the read performance here?
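
The concern in question 2 can be illustrated with a toy model in Python. It is only a sketch under strong simplifying assumptions: each flush of an in-memory table produces one immutable "SSTable" (a dict), and there is no compaction and no caching. `ToyStore` and `fan_out_post` are hypothetical names, not part of any Cassandra API.

```python
class ToyStore:
    """Toy write path: writes go to a memtable, flush() freezes it."""

    def __init__(self):
        self.memtable = {}   # row_key -> {column_name: value}
        self.sstables = []   # flushed tables, oldest first

    def write(self, row_key, column, value):
        self.memtable.setdefault(row_key, {})[column] = value

    def flush(self):
        """Freeze the current memtable as a new SSTable, if non-empty."""
        if self.memtable:
            self.sstables.append(self.memtable)
            self.memtable = {}

    def read_row(self, row_key):
        """Merge row fragments oldest to newest.

        Returns (merged_row, number_of_sstables_holding_a_fragment).
        """
        merged, touched = {}, 0
        for table in self.sstables:
            if row_key in table:
                merged.update(table[row_key])
                touched += 1
        merged.update(self.memtable.get(row_key, {}))
        return merged, touched


def fan_out_post(store, follower_ids, post_id_key, timestamp):
    """Write one feed entry into the row of every follower."""
    for follower in follower_ids:
        column = ("UpdatesFromFollowiesList", timestamp)
        store.write(follower, column, post_id_key)


store = ToyStore()
for ts, post_id in enumerate(["p1", "p2", "p3"]):
    fan_out_post(store, ["alice", "bob"], post_id, ts)
    store.flush()  # assume a flush happens between posts

row, touched = store.read_row("alice")
print(len(row), "feed entries read from", touched, "SSTables")
```

In this model every flush between fan-out writes adds another fragment of the follower's row, so a single row read has to consult more tables as the feed keeps being updated.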

The actual post data would be retrieved by PostIdKey through
subsequent read queries against the column family
PostsSuperColumnFamily, which would look as follows:

PostsSuperColumnFamily:{

PostIdKey:
{
columns:            PostOwnerId, PostBody
supercolumns:   TagsForPost {list of columns of all tags for the
post}, PeopleWhoLikedThisPost {list of columns of UserIdKey of all the
likers}
}
}
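
The second-step read described above (feed IDs first, then one lookup per PostIdKey) could be sketched in Python as follows. Dicts again stand in for the column family, `fetch_feed_posts` is a hypothetical helper, and the sample rows are made-up placeholder data, not real results.

```python
def fetch_feed_posts(feed_post_ids, posts_cf):
    """Look up each PostIdKey in posts_cf; skip IDs that have no row."""
    posts = []
    for post_id in feed_post_ids:
        row = posts_cf.get(post_id)
        if row is None:
            continue  # e.g. the post was deleted after the feed entry was written
        posts.append({
            "PostIdKey": post_id,
            "PostOwnerId": row["columns"]["PostOwnerId"],
            "PostBody": row["columns"]["PostBody"],
            "tags": sorted(row["supercolumns"]["TagsForPost"]),
            "like_count": len(row["supercolumns"]["PeopleWhoLikedThisPost"]),
        })
    return posts


# Placeholder rows shaped like PostsSuperColumnFamily.
posts_cf = {
    "post-1": {
        "columns": {"PostOwnerId": "user-7", "PostBody": "hello"},
        "supercolumns": {
            "TagsForPost": {"intro": ""},
            "PeopleWhoLikedThisPost": {"user-3": "", "user-9": ""},
        },
    },
}

feed = fetch_feed_posts(["post-1", "post-missing"], posts_cf)
```

Each homepage render therefore costs one read of the user row plus one read per post shown, which is worth keeping in mind when sizing the feed.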

Is this the best design to go with, or are there any issues to consider
here? Thanks in anticipation of your valuable comments!
