incubator-cassandra-user mailing list archives

From Bart Swedrowski <b...@timedout.org>
Subject One ColumnFamily places data on only 3 out of 4 nodes
Date Mon, 12 Dec 2011 15:25:12 GMT
Hello everyone,

I seem to have come across a rather weird (at least for me!) problem /
behaviour with Cassandra.

I am running a 4-node cluster on Cassandra 0.8.7.  For the keyspace in
question, I have RF=3 and SimpleStrategy, with multiple ColumnFamilies inside
the KeySpace.  One of the ColumnFamilies, however, seems to have its data
distributed across only 3 out of 4 nodes.

Apart from the problematic ColumnFamily, the data on the cluster seems to be
distributed more or less evenly.

# nodetool -h localhost ring
Address         DC          Rack        Status State   Load            Owns    Token
                                                                               127605887595351923798765477786913079296
192.168.81.2    datacenter1 rack1       Up     Normal  7.27 GB         25.00%  0
192.168.81.3    datacenter1 rack1       Up     Normal  7.74 GB         25.00%  42535295865117307932921825928971026432
192.168.81.4    datacenter1 rack1       Up     Normal  7.38 GB         25.00%  85070591730234615865843651857942052864
192.168.81.5    datacenter1 rack1       Up     Normal  7.32 GB         25.00%  127605887595351923798765477786913079296

The schema for the relevant bits of the keyspace is as follows:

[default@A] show schema;
create keyspace A
  with placement_strategy = 'SimpleStrategy'
  and strategy_options = [{replication_factor : 3}];
[...]
create column family UserDetails
  with column_type = 'Standard'
  and comparator = 'IntegerType'
  and default_validation_class = 'BytesType'
  and key_validation_class = 'BytesType'
  and memtable_operations = 0.571875
  and memtable_throughput = 122
  and memtable_flush_after = 1440
  and rows_cached = 0.0
  and row_cache_save_period = 0
  and keys_cached = 200000.0
  and key_cache_save_period = 14400
  and read_repair_chance = 1.0
  and gc_grace = 864000
  and min_compaction_threshold = 4
  and max_compaction_threshold = 32
  and replicate_on_write = true
  and row_cache_provider = 'ConcurrentLinkedHashCacheProvider';

And now the symptoms: the output of 'nodetool -h localhost cfstats' on each
node.  Please note the figures for node1.

*node1*
Column Family: UserDetails
SSTable count: 0
Space used (live): 0
Space used (total): 0
Number of Keys (estimate): 0
Memtable Columns Count: 0
Memtable Data Size: 0
Memtable Switch Count: 0
Read Count: 0
Read Latency: NaN ms.
Write Count: 0
Write Latency: NaN ms.
Pending Tasks: 0
Key cache capacity: 200000
Key cache size: 0
Key cache hit rate: NaN
Row cache: disabled
Compacted row minimum size: 0
Compacted row maximum size: 0
Compacted row mean size: 0

*node2*
Column Family: UserDetails
SSTable count: 3
Space used (live): 112952788
Space used (total): 164953743
Number of Keys (estimate): 384
Memtable Columns Count: 159419
Memtable Data Size: 74910890
Memtable Switch Count: 59
Read Count: 135307426
Read Latency: 25.900 ms.
Write Count: 3474673
Write Latency: 0.040 ms.
Pending Tasks: 0
Key cache capacity: 200000
Key cache size: 120
Key cache hit rate: 0.999971684189041
Row cache: disabled
Compacted row minimum size: 42511
Compacted row maximum size: 74975550
Compacted row mean size: 42364305

*node3*
Column Family: UserDetails
SSTable count: 3
Space used (live): 112953137
Space used (total): 112953137
Number of Keys (estimate): 384
Memtable Columns Count: 159421
Memtable Data Size: 74693445
Memtable Switch Count: 56
Read Count: 135304486
Read Latency: 25.552 ms.
Write Count: 3474616
Write Latency: 0.036 ms.
Pending Tasks: 0
Key cache capacity: 200000
Key cache size: 109
Key cache hit rate: 0.9999716840888175
Row cache: disabled
Compacted row minimum size: 42511
Compacted row maximum size: 74975550
Compacted row mean size: 42364305

*node4*
Column Family: UserDetails
SSTable count: 3
Space used (live): 117070926
Space used (total): 119479484
Number of Keys (estimate): 384
Memtable Columns Count: 159979
Memtable Data Size: 75029672
Memtable Switch Count: 60
Read Count: 135294878
Read Latency: 19.455 ms.
Write Count: 3474982
Write Latency: 0.028 ms.
Pending Tasks: 0
Key cache capacity: 200000
Key cache size: 119
Key cache hit rate: 0.9999752235777154
Row cache: disabled
Compacted row minimum size: 2346800
Compacted row maximum size: 62479625
Compacted row mean size: 42591803

When I go to the 'data' directory on node1, there are no files for the
UserDetails ColumnFamily at all.
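
For reference, this is roughly the check I ran (assuming the default data
file location; the actual path is whatever data_file_directories points to
in cassandra.yaml):

# ls -l /var/lib/cassandra/data/A/ | grep UserDetails

On node1 this returns nothing, while on the other three nodes the
UserDetails SSTable files are all there.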

I tried performing a manual repair in the hope that it would heal the
situation, but without any luck.

# nodetool -h localhost repair A UserDetails
 INFO 15:19:54,611 Starting repair command #8, repairing 3 ranges.
 INFO 15:19:54,647 Sending AEService tree for #<TreeRequest manual-repair-89c1acb0-184c-438f-bab8-7ceed27980ec, /192.168.81.2, (A,UserDetails), (85070591730234615865843651857942052864,127605887595351923798765477786913079296]>
 INFO 15:19:54,742 Endpoints /192.168.81.2 and /192.168.81.3 are consistent for UserDetails on (85070591730234615865843651857942052864,127605887595351923798765477786913079296]
 INFO 15:19:54,750 Endpoints /192.168.81.2 and /192.168.81.5 are consistent for UserDetails on (85070591730234615865843651857942052864,127605887595351923798765477786913079296]
 INFO 15:19:54,751 Repair session manual-repair-89c1acb0-184c-438f-bab8-7ceed27980ec (on cfs [Ljava.lang.String;@3491507b, range (85070591730234615865843651857942052864,127605887595351923798765477786913079296]) completed successfully
 INFO 15:19:54,816 Sending AEService tree for #<TreeRequest manual-repair-6d2438ca-a05c-4217-92c7-c2ad563a92dd, /192.168.81.2, (A,UserDetails), (42535295865117307932921825928971026432,85070591730234615865843651857942052864]>
 INFO 15:19:54,865 Endpoints /192.168.81.2 and /192.168.81.4 are consistent for UserDetails on (42535295865117307932921825928971026432,85070591730234615865843651857942052864]
 INFO 15:19:54,874 Endpoints /192.168.81.2 and /192.168.81.5 are consistent for UserDetails on (42535295865117307932921825928971026432,85070591730234615865843651857942052864]
 INFO 15:19:54,874 Repair session manual-repair-6d2438ca-a05c-4217-92c7-c2ad563a92dd (on cfs [Ljava.lang.String;@7e541d08, range (42535295865117307932921825928971026432,85070591730234615865843651857942052864]) completed successfully
 INFO 15:19:54,909 Sending AEService tree for #<TreeRequest manual-repair-98d1a21c-9d6e-41c8-8917-aea70f716243, /192.168.81.2, (A,UserDetails), (127605887595351923798765477786913079296,0]>
 INFO 15:19:54,967 Endpoints /192.168.81.2 and /192.168.81.3 are consistent for UserDetails on (127605887595351923798765477786913079296,0]
 INFO 15:19:54,974 Endpoints /192.168.81.2 and /192.168.81.4 are consistent for UserDetails on (127605887595351923798765477786913079296,0]
 INFO 15:19:54,975 Repair session manual-repair-98d1a21c-9d6e-41c8-8917-aea70f716243 (on cfs [Ljava.lang.String;@48c651f2, range (127605887595351923798765477786913079296,0]) completed successfully
 INFO 15:19:54,975 Repair command #8 completed successfully

As I am using SimpleStrategy, I would expect the keys to be split more or
less equally across the nodes; however, this doesn't seem to be the case.
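
To sanity-check that expectation, here is a minimal sketch of how I
understand SimpleStrategy to place replicas (my own approximation in
Python, not the actual Cassandra code; the tokens and addresses are the
ones from the ring output above):

# Sketch of SimpleStrategy placement as I understand it: the primary
# replica for a key is the first node whose token is >= the key's token
# (wrapping around the ring), and the remaining RF-1 replicas are the
# next nodes walking clockwise.
RF = 3
ring = [  # (token, node) pairs, from the nodetool ring output above
    (0, "192.168.81.2"),
    (42535295865117307932921825928971026432, "192.168.81.3"),
    (85070591730234615865843651857942052864, "192.168.81.4"),
    (127605887595351923798765477786913079296, "192.168.81.5"),
]

def replicas(key_token):
    n = len(ring)
    primary = next((i for i, (t, _) in enumerate(ring) if t >= key_token), 0)
    return [ring[(primary + i) % n][1] for i in range(RF)]

# Print the replica set for each of the four token ranges.
for i, (start, _) in enumerate(ring):
    end = ring[(i + 1) % len(ring)][0]
    print("(%d, %d] -> %s" % (start, end, replicas(end)))

If that understanding is right, every node is a replica for 3 of the 4
ranges, so each node should hold roughly 75% of the keys and none of them
should be empty.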

Has anyone come across similar behaviour before?  Does anyone have any
suggestions for what I could do to bring some data onto node1?  Obviously,
this kind of data split means node2, node3 and node4 have to do all the read
work, which is not ideal.

Any suggestions much appreciated.

Kind regards,
Bart
