incubator-cassandra-user mailing list archives

From Tyler Hobbs <ty...@datastax.com>
Subject Re: Multiget performance
Date Wed, 09 Apr 2014 22:52:34 GMT

Can you trace the query and paste the results?
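
For reference, the parallel alternative to the single IN statement (the
option Daniel asks about in the quoted thread) could be sketched roughly as
follows. This is only a minimal sketch and not something from the thread:
multiget_parallel and fetch_one are hypothetical names, and fetch_one stands
in for whatever single-key read the client driver provides. A plain dict
lookup is used here so the example runs without a cluster.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, Optional

Row = Optional[dict]


def multiget_parallel(fetch_one: Callable[[str], Row],
                      keys: Iterable[str],
                      max_workers: int = 16) -> Dict[str, Row]:
    """Issue one single-key read per key concurrently and collect the rows.

    fetch_one is an assumed callable that performs a single-key read,
    e.g. something equivalent to: SELECT * FROM Event WHERE key = ?
    """
    keys = list(keys)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so zip pairs each key with its row.
        rows = list(pool.map(fetch_one, keys))
    return dict(zip(keys, rows))


if __name__ == "__main__":
    # Hypothetical in-memory stand-in for the Event table.
    fake_table = {"k1": {"key": "k1", "v": 1}, "k2": {"key": "k2", "v": 2}}
    result = multiget_parallel(fake_table.get, ["k1", "k2", "missing"])
    print(result)
```

Whether this ends up faster than one IN query depends on the cluster and
the driver, so tracing both variants is the way to see where the time goes.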


On Wed, Apr 9, 2014 at 11:17 AM, Allan C <allanca@gmail.com> wrote:

> As one CQL statement:
>
>  SELECT * from Event WHERE key IN ([100 keys]);
>
> -Allan
>
> On April 9, 2014 at 12:52:13 AM, Daniel Chia (danchia@coursera.org) wrote:
>
> Are you making the 100 calls in serial, or in parallel?
>
> Thanks,
> Daniel
>
>
> On Tue, Apr 8, 2014 at 11:22 PM, Allan C <allanca@gmail.com> wrote:
>
>>  Hi all,
>>
>>  I've always been told that multigets are a Cassandra anti-pattern for
>> performance reasons. I ran a quick test tonight to prove it to myself, and,
>> sure enough, slowness ensued. It takes about 150ms to get 100 keys for my
>> use case. Not terrible, but at least an order of magnitude slower than what
>> I need it to be.
>>
>>  So far, I've been able to denormalize and not have any problems. Today,
>> I ran into a use case where denormalization introduces a huge amount of
>> complexity to the code.
>>
>>  It's very tempting to cache a subset in Redis and call it a day --
>> probably will. But, that's not a very satisfying answer. It's only about
>> 5GB of data and it feels like I should be able to tune a Cassandra CF to be
>> within 2x.
>>
>>  The workload is around 70% reads. Most of the writes are updates to
>> existing data. Currently, it's in an LCS CF with ~30M rows. The cluster is
>> 300GB total with 3-way replication, running across 12 fairly large boxes
>> with 16G RAM. All on SSDs. Striped across 3 AZs in AWS (hi1.4xlarges, fwiw).
>>
>>
>> Has anyone had success getting good results for this kind of workload?
>> Or, is Cassandra just not suited for it at all and I should just use an
>> in-memory store?
>>
>>  -Allan
>>
>
>


-- 
Tyler Hobbs
DataStax <http://datastax.com/>
