incubator-cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: Write latency of counter updates across multiple rows
Date Sun, 05 Feb 2012 20:56:47 GMT
I'm not thinking about counters specifically here, and assuming you are sending batch mutations
of the same size…

The mutations (inserts, counter increments) for a row are turned into a single task server
side, and are then processed in a serial fashion. If you send a mutation for 2 rows it will
be turned into two tasks, which can then be processed in parallel. 

There is a point of diminishing returns here. Each row you write to or read from will become
a task, so if you write to 1,000 rows at once you will put 1,000 tasks in the thread pool, which
typically has 32 concurrent threads. This may block / add latency to other requests. It's
more of an issue with reads than writes.
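
To make the task model above a bit more concrete, here is a toy Python sketch. It is not Cassandra code: the names `group_by_row`, `apply_row_task` and `apply_batch` are made up for illustration, and the only number taken from the description above is the pool size of 32. It just shows a batch being split into one task per row, each task applying its own mutations serially while different rows run in parallel on a bounded pool.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

POOL_SIZE = 32  # the "typical" concurrent thread count mentioned above


def group_by_row(mutations):
    """Group (row_key, column, delta) mutations into one task per row."""
    tasks = defaultdict(list)
    for row_key, column, delta in mutations:
        tasks[row_key].append((column, delta))
    return tasks


def apply_row_task(row_key, row_mutations):
    """Apply one row's mutations serially and return that row's counters."""
    counters = defaultdict(int)
    for column, delta in row_mutations:
        counters[column] += delta
    return row_key, dict(counters)


def apply_batch(mutations, pool_size=POOL_SIZE):
    """Run one task per row on a bounded pool (hypothetical model only)."""
    tasks = group_by_row(mutations)
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        futures = [pool.submit(apply_row_task, key, muts)
                   for key, muts in tasks.items()]
        results = dict(f.result() for f in futures)
    return results, len(tasks)
```

In this model a batch touching 2 rows yields 2 tasks that can run side by side, while a batch touching 1,000 rows queues 1,000 tasks behind 32 threads, so the last of them waits for at least ceil(1000 / 32) = 32 rounds of work, along with anything else that needs the same pool.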

Does that apply to your situation?

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 4/02/2012, at 1:19 AM, Amit Chavan wrote:

> 
> Hi,
> 
> In our use case, we maintain minute-wise roll ups for different metrics. These are stored
> in a counter column family where the row key is a composite containing the timestamp rounded
> to the last minute and an integer between 0-9 (This integer is calculated as the MD5 hash
> of the metric mod 10). The column names are the metrics we wish to track. Typically, each
> row has about 100,000 counters.
> 
> We tested two scenarios. The first one is as mentioned above. In this case we got a per
> write latency of about 80 micro-seconds to 100 micro-seconds.
> 
> In the other scenario, we calculated the integer in the row key as mod 100. In this case
> we observed a per write latency of 50 micro-seconds to 70 micro-seconds.
> 
> I wish to understand why updates to counters were faster as they got spread across multiple
> rows?
> 
> Cluster summary : 4 nodes running Cassandra 1.0.5. Each with 8 cores, 32G RAM, 10G Cassandra
> heap. We are using replication factor of 2.
> 
> 
> -- 
> Thanks!
> Amit Chavan
> 

