cassandra-user mailing list archives

From Dave Viner
Subject cassandra as user-profile data store
Date Wed, 23 Feb 2011 23:21:07 GMT
Hi all,

I'm wondering if anyone has used Cassandra as a datastore for a user-profile
service.  I'm thinking of applications like behavioral targeting, where
there are lots & lots of users (10s to 100s of millions), and lots & lots of
data about them intermixed in, say, weblogs (probably TBs worth).  The idea
would be to use Cassandra as a datastore for distributed parallel processing
of the TBs of files (say on Hadoop).  The resulting user profiles would
then be quickly queryable.
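
A rough sketch of the shape of that pipeline, in plain Python rather than
actual Hadoop or Cassandra code.  The log format (tab-separated user id,
site, unix timestamp), the "visit:<timestamp>" column naming, and every
function name here are assumptions made up for illustration:

```python
from collections import defaultdict

# Hypothetical weblog lines: "user_id<TAB>site<TAB>unix_timestamp"
SAMPLE_LOG = [
    "u1\tcars.example\t1298400000",
    "u2\tnews.example\t1298400060",
    "u1\tcars.example\t1298486400",
]


def map_line(line):
    """Map step: parse one log line into (user_id, (timestamp, site))."""
    user_id, site, ts = line.split("\t")
    return user_id, (int(ts), site)


def reduce_user(events):
    """Reduce step: turn one user's events into a sparse column -> value dict."""
    return {f"visit:{ts}": site for ts, site in sorted(events)}


def build_profiles(lines):
    """Group events by user, then reduce each group into one profile row."""
    grouped = defaultdict(list)
    for line in lines:
        user_id, event = map_line(line)
        grouped[user_id].append(event)
    return {user_id: reduce_user(events) for user_id, events in grouped.items()}


profiles = build_profiles(SAMPLE_LOG)
```

Each entry in `profiles` stands in for one row keyed by user id; in a real
setup the reduce output would be written to the datastore instead of being
kept in a dict.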

Anyone know of that sort of application of Cassandra?  I'm trying to puzzle
out just what the column family might look like.  Seems like a mix of
time-oriented information (user x visits site y at time z), location
information (user x appeared from IP x.y.z.a which is geo-location 31.20309,
120.10923), and derived information (because user x visited site y 15 times
within a 10-day window, user x must be interested in buying a car).
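
To make those three kinds of information concrete, here is one possible row
layout, sketched as a plain Python dict rather than a real column family.
The column prefixes ("visit:", "geo:", "derived:"), the site name, and the
helper names are all hypothetical; only the 15-visits-in-10-days rule comes
from the example above:

```python
WINDOW_SECONDS = 10 * 24 * 60 * 60  # the 10-day window
MIN_VISITS = 15                     # the 15-visit threshold


def visited_often(timestamps, min_visits=MIN_VISITS, window=WINDOW_SECONDS):
    """True if some span of `window` seconds holds at least `min_visits` visits."""
    ts = sorted(timestamps)
    start = 0
    for end in range(len(ts)):
        # Shrink the window from the left until it spans at most `window` seconds.
        while ts[end] - ts[start] > window:
            start += 1
        if end - start + 1 >= min_visits:
            return True
    return False


def derive_interest(row, site, label):
    """Add a derived column to the row if `site` was visited often enough."""
    stamps = [
        int(name.split(":")[1])
        for name, value in row.items()
        if name.startswith("visit:") and value == site
    ]
    if visited_often(stamps):
        row[f"derived:interest:{label}"] = "1"
    return row


# One user's row: 15 visits, 12 hours apart, plus one location column.
T0 = 1298400000
row = {f"visit:{T0 + i * 43200}": "cars.example" for i in range(15)}
row[f"geo:{T0}"] = "31.20309,120.10923"
derive_interest(row, "cars.example", "car")

# A second user with the same number of visits spread one day apart.
sparse_row = {f"visit:{T0 + i * 86400}": "cars.example" for i in range(15)}
derive_interest(sparse_row, "cars.example", "car")
```

Here `row` ends up with a "derived:interest:car" column and `sparse_row`
does not, and the two rows carry different sets of columns, which is the
per-user variation a wide-row store is meant to handle.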

I don't have specifics as yet... just some general thoughts.  But this feels
like a Cassandra-type problem.  (A user profile can have lots of columns per
user, but the exact columns might differ from user to user... very scalable,

Dave Viner
