cassandra-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Yan Chunlu <springri...@gmail.com>
Subject how to solve one node is in heavy load in unbalanced cluster
Date Thu, 28 Jul 2011 18:24:58 GMT
I have three nodes and RF=3.here is the current ring:


Address Status State Load Owns Token

84944475733633104818662955375549269696
node1 Up Normal 15.32 GB 81.09% 52773518586096316348543097376923124102
node2 Up Normal 22.51 GB 10.48% 70597222385644499881390884416714081360
node3 Up Normal 56.1 GB 8.43% 84944475733633104818662955375549269696


it is very un-balanced and I would like to re-balance it using
"nodetool move" asap. unfortunately I haven't been run node repair for
a long time.

aaron suggested it's better to run node repair on every node then re-balance it.


problem is the node3 is in heavy-load currently, and the entire
cluster slow down if I start doing node repair. I have to
disablegossip and disablethrift to stop the repair.

only cassandra running on that server and I have no idea what it was
doing. the cpu load is about 20+ currently. compcationstats and
netstats shows it was not doing anything.

I have change client to not to connect to node3, but still, it seems
in heavy load and io utils is 100%.


the log seems normal(although not sure what about the "Dropped read
message" thing):

 INFO 13:21:38,191 GC for ParNew: 345 ms, 627003992 reclaimed leaving
2563726360 used; max is 4248829952
 WARN 13:21:38,560 Dropped 826 READ messages in the last 5000ms
 INFO 13:21:38,560 Pool Name                    Active   Pending
 INFO 13:21:38,560 ReadStage                         8      7555
 INFO 13:21:38,561 RequestResponseStage              0         0
 INFO 13:21:38,561 ReadRepairStage                   0         0



is there anyway to tell what node3 was doing? or at least is there any
way to make it not slowdown the whole cluster?

Mime
View raw message