incubator-cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: Cluster key distribution wrong after upgrading to 0.8.4
Date Sun, 21 Aug 2011 09:57:26 GMT
This looks like an artifact of the way ownership is calculated for the OOP (OrderPreservingPartitioner). See https://github.com/apache/cassandra/blob/cassandra-0.8.4/src/java/org/apache/cassandra/dht/OrderPreservingPartitioner.java#L177
It was changed in this ticket:
https://issues.apache.org/jira/browse/CASSANDRA-2800

The change applied in CASSANDRA-2800 was not applied to the AbstractByteOrderPartitioner. It looks like it should have been; I'll chase that up.
 
When each node calculates ownership of the token ranges (for the OOP and BOP), it is based
on the number of keys that node holds in each range, as there is no way for the OOP to know
the range of values the keys may take. If you look at the .192 node, it shows most of the
ownership on .192, .191 and .190, so I'm assuming RF=3 and that .192 also holds data from
the ranges owned by .191 and .190.
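The effect described above can be illustrated with a small sketch (my own illustration, not Cassandra's actual code; the ranges and key counts are made up). Each node reports a range's ownership as the share of the keys *it* stores that fall in that range, so with RF=3 a node's replica ranges look "owned" too, and every node reports different percentages:

```python
# Hypothetical sketch of a key-count-based ownership estimate.
# Not Cassandra's implementation; counts below are invented.

def estimate_ownership(keys_per_range):
    """Map each token range to its percentage of this node's local keys."""
    total = sum(keys_per_range.values())
    return {rng: 100.0 * n / total for rng, n in keys_per_range.items()}

# As seen from a node like .192 with RF=3: it stores keys for its own
# range plus replicas of the two preceding ranges; the rest are nearly
# empty locally, so they report ~0% ownership.
counts = {
    "(aa, ff..ff]": 10,      # not a replica range: almost no local keys
    "(ff..ff, 2a]": 38_000,  # .190's range (replica)
    "(2a, 55]": 28_000,      # .191's range (replica)
    "(55, 80]": 34_000,      # this node's own range
    "(80, aa]": 10,          # not a replica range
}
for rng, pct in estimate_ownership(counts).items():
    print(f"{rng:14} {pct:6.2f}%")
```

Because the estimate only sees local key counts, the percentages are a per-node view, not a cluster-wide truth, which is why every node prints a different ring.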

IMHO you can ignore this. 

You can use the load and the number-of-keys estimate from cfstats to get an idea of what's happening.
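As a rough illustration, a small helper (my own sketch) can pull the per-CF key estimates out of saved `nodetool -h <host> cfstats` output for comparison across nodes. The "Column Family" / "Number of Keys (estimate)" field names match 0.8-era output, but treat the format as an assumption and adjust the patterns for your version:

```python
import re

def key_estimates(cfstats_text):
    """Return {column_family: estimated_key_count} from cfstats output.

    Assumes 0.8-era field names; verify against your nodetool version.
    """
    estimates, current_cf = {}, None
    for line in cfstats_text.splitlines():
        cf = re.match(r"\s*Column Family: (\S+)", line)
        if cf:
            current_cf = cf.group(1)
        keys = re.match(r"\s*Number of Keys \(estimate\): (\d+)", line)
        if keys and current_cf:
            estimates[current_cf] = int(keys.group(1))
    return estimates

sample = """\
Keyspace: Keyspace1
    Column Family: Standard1
    Number of Keys (estimate): 1177984
"""
print(key_estimates(sample))  # -> {'Standard1': 1177984}
```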


Hope that helps. 

-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 19/08/2011, at 9:42 PM, Thibaut Britz wrote:

> Hi,
> 
> we were using apache-cassandra-2011-06-28_08-04-46.jar so far in
> production and wanted to upgrade to 0.8.4.
> 
> Our cluster was well balanced and we only saved keys with a lowercase
> MD5 prefix (OrderPreservingPartitioner).
> Each node owned 20% of the tokens, which was also displayed on each
> node by nodetool -h localhost ring.
> 
> After upgrading, our well balanced cluster shows completely wrong
> percentage on who owns which keys:
> 
> *.*.*.190:
> Address      DC          Rack   Status  State   Load       Owns     Token
>                                                                     ffffffffffffffff
> *.*.*.190    datacenter1 rack1  Up      Normal  87.95 GB   34.57%   2a
> *.*.*.191    datacenter1 rack1  Up      Normal  84.3 GB    0.02%    55
> *.*.*.192    datacenter1 rack1  Up      Normal  79.46 GB   0.02%    80
> *.*.*.194    datacenter1 rack1  Up      Normal  68.16 GB   0.02%    aa
> *.*.*.196    datacenter1 rack1  Up      Normal  79.9 GB    65.36%   ffffffffffffffff
> 
> *.*.*.191:
> Address      DC          Rack   Status  State   Load       Owns     Token
>                                                                     ffffffffffffffff
> *.*.*.190    datacenter1 rack1  Up      Normal  87.95 GB   36.46%   2a
> *.*.*.191    datacenter1 rack1  Up      Normal  84.3 GB    26.02%   55
> *.*.*.192    datacenter1 rack1  Up      Normal  79.46 GB   0.02%    80
> *.*.*.194    datacenter1 rack1  Up      Normal  68.16 GB   0.02%    aa
> *.*.*.196    datacenter1 rack1  Up      Normal  79.9 GB    37.48%   ffffffffffffffff
> 
> *.*.*.192:
> Address      DC          Rack   Status  State   Load       Owns     Token
>                                                                     ffffffffffffffff
> *.*.*.190    datacenter1 rack1  Up      Normal  87.95 GB   38.16%   2a
> *.*.*.191    datacenter1 rack1  Up      Normal  84.3 GB    27.61%   55
> *.*.*.192    datacenter1 rack1  Up      Normal  79.46 GB   34.17%   80
> *.*.*.194    datacenter1 rack1  Up      Normal  68.16 GB   0.02%    aa
> *.*.*.196    datacenter1 rack1  Up      Normal  79.9 GB    0.02%    ffffffffffffffff
> 
> *.*.*.194:
> Address      DC          Rack   Status  State   Load       Owns     Token
>                                                                     ffffffffffffffff
> *.*.*.190    datacenter1 rack1  Up      Normal  87.95 GB   0.03%    2a
> *.*.*.191    datacenter1 rack1  Up      Normal  84.3 GB    31.43%   55
> *.*.*.192    datacenter1 rack1  Up      Normal  79.46 GB   39.69%   80
> *.*.*.194    datacenter1 rack1  Up      Normal  68.16 GB   28.82%   aa
> *.*.*.196    datacenter1 rack1  Up      Normal  79.9 GB    0.03%    ffffffffffffffff
> 
> *.*.*.196:
> Address      DC          Rack   Status  State   Load       Owns     Token
>                                                                     ffffffffffffffff
> *.*.*.190    datacenter1 rack1  Up      Normal  87.95 GB   0.02%    2a
> *.*.*.191    datacenter1 rack1  Up      Normal  84.3 GB    0.02%    55
> *.*.*.192    datacenter1 rack1  Up      Normal  79.46 GB   0.02%    80
> *.*.*.194    datacenter1 rack1  Up      Normal  68.16 GB   27.52%   aa
> *.*.*.196    datacenter1 rack1  Up      Normal  79.9 GB    72.42%   ffffffffffffffff
> 
> 
> Interestingly, each server shows something completely different.
> 
> Removing the locationInfo files didn't help.
> -Dcassandra.load_ring_state=false didn't help either.
> 
> Our cassandra.yaml is at http://pastebin.com/pCVCt3RM
> 
> Any idea what might cause this? Is it fair to suspect that
> operating under this distribution will cause severe data loss? Or can
> I safely ignore this?
> 
> Thanks,
> Thibaut

