incubator-cassandra-user mailing list archives

From Philippe <watche...@gmail.com>
Subject batch mutates & throughput
Date Sun, 07 Aug 2011 21:06:29 GMT
A question regarding batch mutates and how others might be throttling the
system to prevent timeouts.

My 3-node, RF=3 cluster has been performing OK while bulk loading data
(applying counter updates). I've been able to run 16 threads in parallel,
each performing about 400 mutates/s on a loaded cluster.
Then I thought, hey, let's get rid of the network round trip and batch this
thing...

So I converted my code to use a mutator and addCounter instead of
insertCounter (on Hector). However, when I do, the results are always bad.
When I call execute()
 - every 5000 lines, I get wonderful performance but I constantly get
timeouts
 - every 500, same thing
 - every 10, the timeouts take longer to appear but they're still there
 - every 1, it works just like before batching
And this happens even with a single thread running.
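
To make the flush-every-N pattern above concrete, here is a minimal sketch in Python rather than the Hector code itself. The `send_batch` callable is a hypothetical stand-in for the mutator's execute() call, and the `(row key, column, increment)` tuple shape is only an assumption for illustration.

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical shape of one counter update: (row key, counter column, increment).
Mutation = Tuple[str, str, int]


def load_in_batches(
    mutations: Iterable[Mutation],
    batch_size: int,
    send_batch: Callable[[List[Mutation]], None],
) -> int:
    """Accumulate mutations and hand them to send_batch every batch_size items.

    send_batch is a placeholder for whatever actually ships the batch
    (for example a mutator's execute() call). Returns the number of flushes.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")

    pending: List[Mutation] = []
    flushes = 0
    for mutation in mutations:
        pending.append(mutation)
        if len(pending) >= batch_size:
            send_batch(pending)
            pending = []  # start a fresh list so the sent batch is not modified
            flushes += 1

    # Flush the final partial batch, if any.
    if pending:
        send_batch(pending)
        flushes += 1
    return flushes
```

With `batch_size=1` this degenerates to one call per mutation, which corresponds to the "every 1" case in the list.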

So my question is not about the absolute performance of my cluster but about
how I'm supposed to use batch updates: it doesn't look like the execute()
call blocks until the mutations have been applied, and tpstats has shown up
to 200,000 mutations pending.
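
One client-side way to throttle, sketched here under assumptions and not as a known fix for the timeouts above, is to cap how many mutations per second the loader is allowed to send. The `MutationThrottle` class and its `max_per_second` parameter are hypothetical names; the right rate would have to be found by watching the pending count in tpstats.

```python
import time
from typing import Callable


class MutationThrottle:
    """Caps the average number of mutations sent per second (client side only)."""

    def __init__(
        self,
        max_per_second: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        if max_per_second <= 0:
            raise ValueError("max_per_second must be positive")
        self._interval = 1.0 / max_per_second  # time budget per mutation
        self._clock = clock
        self._sleep = sleep
        self._next_free = clock()  # earliest time the next batch may start

    def acquire(self, count: int = 1) -> float:
        """Wait until `count` mutations may be sent; return the seconds waited."""
        now = self._clock()
        wait = max(0.0, self._next_free - now)
        if wait > 0:
            self._sleep(wait)
        # Reserve the time budget this batch consumes.
        self._next_free = max(now, self._next_free) + count * self._interval
        return wait
```

Calling `throttle.acquire(len(batch))` just before each execute() would hold a single loader thread to roughly `max_per_second` mutations per second regardless of batch size. The class is not thread-safe as written, so each thread would need its own instance or a lock around `acquire`.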

Any ideas?
