incubator-cassandra-user mailing list archives

From Lenin Gali <>
Subject Re: Digg's data model
Date Sat, 20 Mar 2010 08:53:47 GMT
I have several questions, and I hope some of you can share your experience
with some or all of the following. I am curious about Twitter's and Digg's
experience as they might be processing

1. Eventual consistency: Given a volume of 5K writes/sec, of which roughly
1,500 are updates and the rest are inserts, what kind of latency can be
expected with eventual consistency?

2. Performance: Are there any benchmarks on how many writes/sec and
reads/sec Cassandra supports on an "n node" cluster? Nodes can be of
variable size, so I would also like to know the hardware/software details
of the cluster.

3. EC2: Has anyone deployed Cassandra on EC2? What kind of transaction
volume are they using it for, and how is their experience with Cassandra on

4. Overhead and issues: What are typical nightmare scenarios one could face
when using Cassandra for write-heavy or read-intensive systems?

5. Backups: For a 4 or 5 TB Cassandra cluster, what backup approach would
you recommend?

Also, does Cassandra support counters? Digg's article said they are going to
contribute their work to open source. Any idea when that would be?

Thanks in advance for sharing your experience


On Fri, Mar 19, 2010 at 1:03 PM, Jonathan Ellis <> wrote:

> Jeff Hodsdon edited the new link in:
> On Fri, Mar 19, 2010 at 2:49 PM, Nathan McCall <>
> wrote:
> > Gary,
> > Did you see this article linked from the Cassandra wiki?
> >
> >
> > See for more
> > examples like the above. In general, you structure your data according
> > to how it will be queried. This can lead to duplication, but that is
> > one of the trade-offs for performance and scale.
> >
> > Digg folks - the "Looking to the Future with Cassandra" linked on the
> > wiki is no longer available. I found that article quite helpful
> > originally. Is there a chance this could be re-posted?
> >
> > Cheers,
> > -Nate
> >
> > On Fri, Mar 19, 2010 at 12:16 PM, Gary <> wrote:
> >> I am a newbie to the Bigtable-like model and have a question as follows. Take
> >> Digg as an example: I want to find a list of users who dug a URL and also want
> >> to find a list of URLs a user dug. What should the data model look like for
> >> the queries to be efficient? If I use the username and the URL for two rows,
> >> when a user digs a URL, I will have to update two rows, so I need a
> >> transaction to keep data consistent.
> >> Any thoughts?
> >> Thanks,
> >> Gary
> >
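
For readers following the thread, here is a rough sketch of the "structure
your data according to how it will be queried" idea applied to Gary's example.
It is not Digg's actual model and it does not use a Cassandra client: two plain
Python dicts stand in for two column families, and the names diggs_by_user,
diggs_by_url, and record_digg are made up for illustration. Each digg is
written twice, once per query path. Because both writes simply set the same
value under the same key, repeating a digg after a partial failure is harmless,
which is one way to live without a transaction.

```python
from collections import defaultdict
import time

# In-memory stand-ins for two column families (hypothetical names).
diggs_by_user = defaultdict(dict)  # row key: username -> {url: timestamp}
diggs_by_url = defaultdict(dict)   # row key: url -> {username: timestamp}


def record_digg(username, url, timestamp=None):
    """Write the same fact to both rows, one per query path.

    Both writes are idempotent: calling this again with the same
    arguments leaves both structures unchanged, so a caller can
    simply retry if the second write did not go through.
    """
    ts = timestamp if timestamp is not None else time.time()
    diggs_by_user[username][url] = ts
    diggs_by_url[url][username] = ts
    return ts


def urls_dug_by(username):
    """Answer 'which URLs did this user dig?' from a single row."""
    return sorted(diggs_by_user.get(username, {}))


def users_who_dug(url):
    """Answer 'which users dug this URL?' from a single row."""
    return sorted(diggs_by_url.get(url, {}))


if __name__ == "__main__":
    record_digg("gary", "http://example.com/a", timestamp=1)
    record_digg("nate", "http://example.com/a", timestamp=2)
    record_digg("gary", "http://example.com/b", timestamp=3)
    print(urls_dug_by("gary"))
    print(users_who_dug("http://example.com/a"))
```

The cost is the duplication Nate mentions: every digg is stored twice, in
exchange for each query reading exactly one row.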

twitter: leningali
skype: galilenin
