incubator-cassandra-user mailing list archives

From Mina Naguib <mina.nag...@bloomdigital.com>
Subject Equalizing nodes storage load
Date Fri, 22 Jul 2011 14:37:46 GMT

Hi everyone

I've been struggling to get the data volume ("load") to equalize across a balanced
cluster, and I'm not sure what else to try.

Background: This was originally a 5-node cluster.  We re-balanced the 3 faster machines across
the ring, and decommissioned the 2 older ones.  We also upgraded Cassandra several times, from
0.7.4 through 0.7.5 and 0.7.6-2 to 0.7.7.  The ring currently looks like so:

Address         Status State   Load            Owns    Token
                                                       151236607520417094872610936636341427313
xx.xx.x.105     Up     Normal  41.98 GB        33.33%  37809151880104273718152734159085356828
xx.xx.x.107     Up     Normal  59.4 GB         33.33%  94522879700260684295381835397713392071
xx.xx.x.18      Up     Normal  74.65 GB        33.33%  151236607520417094872610936636341427313

What I've tried so far:
	1. Running repair on each node (sequentially, of course).
	2. Running cleanup on the largest node (.18), hoping it would shed unneeded data.

The repairs helped a bit by slightly bumping up the load of the first 2 machines, but the
cleanup on the 3rd failed to reduce its data volume.

So, at this point, I'm out of ideas.  In terms of tpstats metrics, each of the 3 nodes is
serving roughly the same volume of ReadStage and MutationStage, so they're balanced in that
respect.  However, I'm concerned about the imbalance of the data load (24% / 34% / 42%) and
being unable to equalize it.
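
For reference, the figures can be reproduced from the ring output with a few lines of Python. This is only a sketch: the function names are made up for illustration, and the 2**127 token space assumes the cluster uses RandomPartitioner, which the token values and the 33.33% ownership suggest but which is not stated anywhere above.

```python
# Figures copied from the ring output above.
RING = [
    # (address, load in GB, token)
    ("xx.xx.x.105", 41.98, 37809151880104273718152734159085356828),
    ("xx.xx.x.107", 59.4, 94522879700260684295381835397713392071),
    ("xx.xx.x.18", 74.65, 151236607520417094872610936636341427313),
]

# Assumption: RandomPartitioner, whose token space is 0 .. 2**127.
TOKEN_SPACE = 2 ** 127


def load_shares(ring):
    """Each node's share of the total on-disk load, in percent."""
    total = sum(load for _, load, _ in ring)
    return {addr: 100.0 * load / total for addr, load, _ in ring}


def token_ownership(ring):
    """Percent of the token space owned by each node.

    A node owns the range from the previous node's token (exclusive)
    up to its own token (inclusive); the first node wraps around.
    """
    ordered = sorted(ring, key=lambda entry: entry[2])
    owns = {}
    for i, (addr, _, token) in enumerate(ordered):
        previous = ordered[i - 1][2]  # i == 0 picks the last node (wrap-around)
        owns[addr] = 100.0 * ((token - previous) % TOKEN_SPACE) / TOKEN_SPACE
    return owns


shares = load_shares(RING)
owns = token_ownership(RING)
for addr, _, _ in RING:
    print(f"{addr:<12} owns {owns[addr]:.2f}%  holds {shares[addr]:.1f}% of the load")
```

The token ranges come out equal, so the imbalance is in the stored data only, not in the ring layout.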

For the record, there's only 1 keyspace of meaningful data in the cluster, with the following
schema settings:
Keyspace: ZZZZZZ:
  Replication Strategy: org.apache.cassandra.locator.NetworkTopologyStrategy
    Options: [DCMTL:2]
  Column Families:
    ColumnFamily: AAAAAAAAAA
      default_validation_class: org.apache.cassandra.db.marshal.UTF8Type
      Columns sorted by: org.apache.cassandra.db.marshal.UTF8Type
      Row cache size / save period in seconds: 256000.0/0
      Key cache size / save period in seconds: 200000.0/14400
      Memtable thresholds: 0.88125/1440/188 (millions of ops/minutes/MB)
      GC grace seconds: 864000
      Compaction min/max thresholds: 4/32
      Read repair chance: 0.1
      Built indexes: []
    ColumnFamily: BBBBB (Super)
      default_validation_class: org.apache.cassandra.db.marshal.UTF8Type
      Columns sorted by: org.apache.cassandra.db.marshal.UTF8Type/org.apache.cassandra.db.marshal.UTF8Type
      Row cache size / save period in seconds: 75000.0/0
      Key cache size / save period in seconds: 200000.0/14400
      Memtable thresholds: 0.88125/1440/188 (millions of ops/minutes/MB)
      GC grace seconds: 864000
      Compaction min/max thresholds: 4/32
      Read repair chance: 0.25
      Built indexes: []
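
As far as I understand replica placement, with a replication factor of 2 (the DCMTL:2 option above) each node stores its own token range plus a copy of its predecessor's range, so evenly spaced tokens should translate into an even load. A rough Python sketch of that expectation follows. It assumes all three nodes are in the DCMTL data center and that rack placement does not skew where replicas land; the helper name expected_data_fraction is hypothetical.

```python
from fractions import Fraction


def expected_data_fraction(owned, replication_factor):
    """Fraction of the unique data each node should hold.

    Assumption: replicas are placed by walking the ring, so node i holds
    its own range plus the ranges of the (replication_factor - 1) nodes
    that precede it.
    """
    n = len(owned)
    return [
        sum(owned[(i - k) % n] for k in range(replication_factor))
        for i in range(n)
    ]


# Three nodes, each owning one third of the ring, replication factor 2.
owned = [Fraction(1, 3)] * 3
fractions = expected_data_fraction(owned, replication_factor=2)
print([str(f) for f in fractions])
```

Under those assumptions every node should hold the same two thirds of the data, which is why the uneven load surprises me.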

Any tips or ideas to help get the nodes' load equalized would be highly appreciated.  If this
is normal behaviour and I shouldn't be trying too hard to get it equalized, I'd appreciate
any notes/links explaining why.

Thank you.