incubator-cassandra-user mailing list archives

From Daniel Chia <danc...@coursera.org>
Subject Re: Multiget performance
Date Wed, 09 Apr 2014 07:51:16 GMT
Are you making the 100 calls in serial, or in parallel?
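
To illustrate why the answer matters, here is a minimal sketch comparing the two approaches. It assumes nothing about your actual client: fetch_row is a hypothetical stand-in that only sleeps to simulate one single-key read, and the 5 ms latency is an invented figure, not a measurement from any cluster.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Assumed per-read round trip, chosen for illustration only.
SIMULATED_LATENCY_S = 0.005


def fetch_row(key):
    """Hypothetical stand-in for one single-key read against the cluster."""
    time.sleep(SIMULATED_LATENCY_S)
    return key, "value-for-" + key


def fetch_serial(keys):
    # One request at a time: total time is roughly the sum of all round trips.
    return dict(fetch_row(k) for k in keys)


def fetch_parallel(keys, max_workers=32):
    # Up to max_workers requests in flight at once, so the round trips overlap.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(fetch_row, keys))


def timed(fn, keys):
    # Return the fetched rows together with the elapsed wall-clock seconds.
    start = time.perf_counter()
    rows = fn(keys)
    return rows, time.perf_counter() - start


if __name__ == "__main__":
    keys = ["key%d" % i for i in range(100)]
    _, serial_s = timed(fetch_serial, keys)
    _, parallel_s = timed(fetch_parallel, keys)
    print("serial:   %.3fs" % serial_s)
    print("parallel: %.3fs" % parallel_s)
```

With a real driver the same idea would likely be expressed through its asynchronous API rather than a thread pool, but the shape of the comparison should be similar.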

Thanks,
Daniel


On Tue, Apr 8, 2014 at 11:22 PM, Allan C <allanca@gmail.com> wrote:

> Hi all,
>
> I've always been told that multigets are a Cassandra anti-pattern for
> performance reasons. I ran a quick test tonight to prove it to myself, and,
> sure enough, slowness ensued. It takes about 150ms to get 100 keys for my
> use case. Not terrible, but at least an order of magnitude slower than I
> need it to be.
>
> So far, I've been able to denormalize and not have any problems. Today, I
> ran into a use case where denormalization introduces a huge amount of
> complexity to the code.
>
> It's very tempting to cache a subset in Redis and call it a day -- probably
> will. But, that's not a very satisfying answer. It's only about 5GB of data
> and it feels like I should be able to tune a Cassandra CF to be within 2x.
>
> The workload is around 70% reads. Most of the writes are updates to
> existing data. Currently, it's in an LCS CF with ~30M rows. The cluster is
> 300GB total with 3-way replication, running across 12 fairly large boxes
> with 16G RAM. All on SSDs. Striped across 3 AZs in AWS (hi1.4xlarges, fwiw).
>
>
> Has anyone had success getting good results for this kind of workload? Or,
> is Cassandra just not suited for it at all and I should just use an
> in-memory store?
>
> -Allan
>
