cassandra-user mailing list archives

From gabriele renzi <rff....@gmail.com>
Subject Re: If a user has millions of followers, is there millions of iterate? (ref Twissandra)
Date Thu, 15 Apr 2010 08:11:02 GMT
On Thu, Apr 15, 2010 at 9:56 AM, Allen He <allenhooo@gmail.com> wrote:
> Hello folks,
>
> When Twissandra (the Twitter clone example for Cassandra) posts a tweet, it
> iterates over all of the followers to insert a tweet_id into their timelines (see


>     for follower_id in follower_ids:
>         TIMELINE.insert(str(follower_id), {ts: str(tweet_id)})
>
>
>
> My question is: if a user has millions of followers, are there millions of
> iterations?

I have never looked at the Twissandra code, but it looks that way. It is
probably a trade-off: either you store each tweet only in its author's
timeline and, when a user wants to read, you fetch from every timeline
they follow (putting the burden on read time), or you do it like this
and put the burden on the write. Since writes are cheap in Cassandra,
and reads are more frequent, this seems to make sense.
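
To make the trade-off concrete, here is a rough in-memory sketch of the
two options. It is not Twissandra's code and does not talk to Cassandra;
the class TimelineStore and all of its method names are made up for
illustration.

```python
from collections import defaultdict


class TimelineStore:
    """Toy in-memory model (hypothetical, no Cassandra involved)."""

    def __init__(self):
        self.followers = defaultdict(set)     # author -> ids of followers
        self.following = defaultdict(set)     # user -> ids of followed authors
        self.user_tweets = defaultdict(list)  # author -> [(ts, tweet_id)]
        self.timelines = defaultdict(list)    # user -> [(ts, tweet_id)]

    def follow(self, follower, author):
        self.followers[author].add(follower)
        self.following[follower].add(author)

    def post(self, author, ts, tweet_id):
        """Fan-out on write: one insert per follower, plus the author's copy.

        Returns the number of writes performed.
        """
        self.user_tweets[author].append((ts, tweet_id))
        writes = 1
        for follower in self.followers[author]:
            self.timelines[follower].append((ts, tweet_id))
            writes += 1
        return writes

    def read_precomputed(self, user):
        """Cheap read: the timeline was already built at write time."""
        return sorted(self.timelines[user], reverse=True)

    def read_merged(self, user):
        """Fan-out on read: gather tweets from every followed author."""
        merged = []
        for author in self.following[user]:
            merged.extend(self.user_tweets[author])
        return sorted(merged, reverse=True)
```

With the first option, post() costs one write per follower but a read
touches a single timeline. With the second, a post is a single write but
each read has to visit every followed author.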


PS
  I think it should use batch_mutate anyway, so that only one operation
is sent over the network.
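
A small sketch of what I mean by batching. CountingClient is a made-up
stand-in that only counts round trips; its insert and batch_insert
methods are assumptions for illustration and are not the real Thrift
batch_mutate signature. The batch_size chunking is also my own addition,
on the assumption that one request carrying millions of mutations would
be too large.

```python
class CountingClient:
    """Hypothetical stand-in for a Cassandra client; counts network calls."""

    def __init__(self):
        self.round_trips = 0
        self.rows = {}

    def insert(self, key, columns):
        # One request per row, as in the quoted loop.
        self.round_trips += 1
        self.rows.setdefault(key, {}).update(columns)

    def batch_insert(self, mutations):
        # One request carrying many rows: {row_key: {column: value}}.
        self.round_trips += 1
        for key, columns in mutations.items():
            self.rows.setdefault(key, {}).update(columns)


def post_one_by_one(client, follower_ids, ts, tweet_id):
    for follower_id in follower_ids:
        client.insert(str(follower_id), {ts: str(tweet_id)})


def post_batched(client, follower_ids, ts, tweet_id, batch_size=1000):
    ids = list(follower_ids)
    # Send the mutations in chunks so that no single request grows unbounded.
    for start in range(0, len(ids), batch_size):
        chunk = ids[start:start + batch_size]
        client.batch_insert({str(f): {ts: str(tweet_id)} for f in chunk})
```

Note that the server still applies one mutation per follower either
way; batching only reduces the number of requests on the wire.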
