cassandra-user mailing list archives

From Jeremy Hanna <jeremy.hanna1...@gmail.com>
Subject 4/20 nodes get disproportionate amount of mutations
Date Tue, 23 Aug 2011 06:40:20 GMT
We've recently been having issues where, as soon as we start doing heavy writes (via Hadoop),
4 of our 20 nodes get hammered.  We're using the random partitioner and we set the initial
tokens for our 20 nodes according to the general spacing formula, except for a few token offsets
where we've replaced dead nodes.

When I say hammered, I'm looking at nodetool tpstats: those 4 nodes have completed something
like 70 million mutation stage events, whereas the rest of the cluster has completed 2-20
million.  On those 4 nodes we also find evidence in the logs of the mutation stage backing
up and a lot of dropped read repair messages.  It looks like quite a bit of flushing is going
on, and consequently a lot of auto minor compactions.

We are running 0.7.8 and have about 34 column families (counting secondary indexes as column
families), so we can't set our memtable throughput in MB very high.  We would like to
upgrade to 0.8.4 (not least because of JAMM), but it seems that something else is going
on with our cluster if we are using RP with balanced initial tokens and still have 4 hot nodes.
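To illustrate why the per-CF throughput has to stay low: the worst-case memtable footprint grows roughly with (number of CFs) x (per-CF throughput).  The 64 MB per CF and 8 GB heap below are hypothetical illustration values, not our actual settings:

```python
# Rough worst-case memtable memory budget under 0.7 (no JAMM-based
# live-ratio measurement, so per-CF throughput caps are the main lever).
# NOTE: 64 MB per CF and an 8 GB heap are hypothetical numbers for
# illustration only; the CF count of 34 is ours.
NUM_COLUMN_FAMILIES = 34
MEMTABLE_THROUGHPUT_MB = 64     # hypothetical per-CF setting
HEAP_MB = 8 * 1024              # hypothetical heap size

worst_case_mb = NUM_COLUMN_FAMILIES * MEMTABLE_THROUGHPUT_MB
fraction_of_heap = worst_case_mb / HEAP_MB
# 34 CFs * 64 MB = 2176 MB, over a quarter of the heap before any
# overhead, which is why we keep the per-CF number small.
```

Keeping the per-CF number small in turn means more frequent flushes and minor compactions, which matches what we see, but that pressure should still be spread across the ring rather than concentrated on 4 nodes.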

Do these symptoms and context sound familiar to anyone?  Does anyone have any suggestions
on how to address this kind of disproportionate write load?

Thanks,

Jeremy