cassandra-commits mailing list archives

From "Richard Low (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-5525) Adding nodes to 1.2 cluster w/ vnodes streamed more data than average node load
Date Tue, 30 Apr 2013 10:50:15 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-5525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13645445#comment-13645445 ]

Richard Low commented on CASSANDRA-5525:
----------------------------------------

Could you attach the output of 'nodetool ring' to list all the tokens? Also, what is your
replication factor?

There is a balancing problem when adding new nodes without running shuffle (or decommissioning
and bootstrapping each node).  When Cassandra increases the number of tokens from 1 to N (256
in your case), it splits the original ranges into N consecutive ranges.  This doesn't change
where the data lives but does increase the number of tokens.
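The 1-to-N split described above can be sketched roughly like this (illustrative Python under my reading of the mechanism, not Cassandra's actual code):

```python
# Hypothetical sketch of the 1 -> N token split: each node's single range
# is cut into N consecutive subranges. Data placement is unchanged; the
# node simply owns N adjacent tokens instead of one.

def split_range(start, end, n):
    """Split the token range (start, end] into n consecutive subranges,
    returning the n new token boundaries."""
    width = (end - start) // n
    return [start + width * i for i in range(1, n)] + [end]

# A node that owned the single range (0, 256] now owns 4 adjacent tokens:
print(split_range(0, 256, 4))  # [64, 128, 192, 256]
```

Because every node's new tokens are consecutive, the ring after the upgrade is a sequence of per-node runs rather than an interleaved mix.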

Cassandra knows that the adjacent tokens are on the same node, so it doesn't try to store
replicas on that node. It looks for the next range on another node, just like how multi-DC
replication ensures replicas are in different data centers.
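A minimal sketch of that replica walk, with made-up names and structure (this is not Cassandra's replication strategy code, just the idea: walk clockwise and skip tokens belonging to a node that already holds a replica):

```python
# Hedged illustration: pick rf distinct nodes by walking the ring
# clockwise from a range's position, skipping adjacent tokens that
# belong to a node already chosen as a replica.

def pick_replicas(ring, start_index, rf):
    """ring: list of (token, node) pairs sorted by token.
    Returns rf distinct replica nodes for the range at start_index."""
    replicas = []
    i = start_index
    while len(replicas) < rf:
        node = ring[i % len(ring)][1]
        if node not in replicas:  # skip same-node adjacent tokens
            replicas.append(node)
        i += 1
    return replicas

# After the 1 -> N split, node A's tokens are consecutive, so the ring is
# runs like [A, A, A, B, B, B]; walking past A's whole run finds B.
ring = [(10, "A"), (20, "A"), (30, "A"), (40, "B"), (50, "B"), (60, "B")]
print(pick_replicas(ring, 0, 2))  # ['A', 'B']
```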

Now when a new node is added, it doesn't choose adjacent tokens; its tokens are spread randomly
around the ring. Just one of these small ranges could hold replicas for a lot of data, because
the new node becomes the next node in the ring. For a high enough replication factor and certain
(quite likely) choices of tokens, a new node could end up storing 100% of the data. This could
explain what you are seeing; we'll need to see the token list and RF to confirm.
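The failure mode is easy to demonstrate in a toy simulation (hypothetical numbers and node names; the replica walk is the same sketch as above, not Cassandra's real placement code). With the old nodes holding consecutive token runs and the new node's tokens landing between those runs, the new node becomes the "next distinct node" for every range:

```python
# Toy demonstration: 4 old nodes with consecutive token runs (the result
# of the 1 -> N split), plus a new node whose tokens fall between the
# runs. With RF=3, the new node ends up a replica for ALL ranges.

OLD_NODES, TOKENS_PER_NODE, RF = 4, 8, 3

ring = []
for n in range(OLD_NODES):  # consecutive run of tokens per old node
    base = n * 1000
    ring += [(base + i, f"old{n}") for i in range(TOKENS_PER_NODE)]
# New node's tokens interleave between the runs (one per gap, for clarity):
ring += [(n * 1000 + 500, "new") for n in range(OLD_NODES)]
ring.sort()

def replicas(i, rf=RF):
    out, j = [], i
    while len(out) < rf:
        node = ring[j % len(ring)][1]
        if node not in out:  # skip same-node adjacent tokens
            out.append(node)
        j += 1
    return out

hit = sum("new" in replicas(i) for i in range(len(ring)))
print(f"new node is a replica for {hit}/{len(ring)} ranges")  # 36/36
```

Walking clockwise from any range, a "new" token always appears before three distinct old nodes do, so the new node replicates 100% of the data, matching the worst case described above.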
                
> Adding nodes to 1.2 cluster w/ vnodes streamed more data than average node load
> -------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-5525
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5525
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: John Watson
>         Attachments: Screen Shot 2013-04-25 at 12.35.24 PM.png
>
>
> 12 node cluster upgraded from 1.1.9 to 1.2.3, enabled 'num_tokens: 256', restarted, and
> ran upgradesstables and cleanup.
> Tried to join 2 additional nodes into the ring.
> However, 1 of the new nodes ran out of disk space. This started causing 'no host id'
> alerts in the live cluster when attempting to store hints for that node.
> {noformat}
> ERROR 10:12:02,408 Exception in thread Thread[MutationStage:190,5,main]
> java.lang.AssertionError: Missing host ID 
> {noformat}
> I killed the other node to stop it from continuing to join. The live cluster was now in
> a broken state, dropping mutation messages on 3 nodes. This was fixed by restarting them;
> however, 1 node never stopped, so I had to decommission it (leaving the original cluster
> at 11 nodes.)
> Ring pre-join:
> {noformat}
> Load       Tokens  Owns (effective)  Host ID                             
> 147.55 GB  256     16.7%             754f9f4c-4ba7-4495-97e7-1f5b6755cb27
> 124.99 GB  256     16.7%             93f4400a-09d9-4ca0-b6a6-9bcca2427450
> 136.63 GB  256     16.7%             ff821e8e-b2ca-48a9-ac3f-8234b16329ce
> 141.78 GB  253     100.0%            339c474f-cf19-4ada-9a47-8b10912d5eb3
> 137.74 GB  256     16.7%             6d726cbf-147d-426e-a735-e14928c95e45
> 135.9 GB   256     16.7%             e59a02b3-8b91-4abd-990e-b3cb2a494950
> 165.96 GB  256     16.7%             83ca527c-60c5-4ea0-89a8-de53b92b99c8
> 135.41 GB  256     16.7%             c3ea4026-551b-4a14-a346-480e8c1fe283
> 143.38 GB  256     16.7%             df7ba879-74ad-400b-b371-91b45dcbed37
> 178.05 GB  256     25.0%             78192d73-be0b-4d49-a129-9bec0770efed
> 194.92 GB  256     25.0%             361d7e31-b155-4ce1-8890-451b3ddf46cf
> 150.5 GB   256     16.7%             9889280a-1433-439e-bb84-6b7e7f44d761
> {noformat}
> Ring after decomm bad node:
> {noformat}
> Load       Tokens  Owns (effective)  Host ID
> 80.95 GB   256     16.7%             754f9f4c-4ba7-4495-97e7-1f5b6755cb27
> 87.15 GB   256     16.7%             93f4400a-09d9-4ca0-b6a6-9bcca2427450
> 98.16 GB   256     16.7%             ff821e8e-b2ca-48a9-ac3f-8234b16329ce
> 142.6 GB   253     100.0%            339c474f-cf19-4ada-9a47-8b10912d5eb3
> 77.64 GB   256     16.7%             e59a02b3-8b91-4abd-990e-b3cb2a494950
> 194.31 GB  256     25.0%             6d726cbf-147d-426e-a735-e14928c95e45
> 221.94 GB  256     33.3%             83ca527c-60c5-4ea0-89a8-de53b92b99c8
> 87.61 GB   256     16.7%             c3ea4026-551b-4a14-a346-480e8c1fe283
> 101.02 GB  256     16.7%             df7ba879-74ad-400b-b371-91b45dcbed37
> 172.44 GB  256     25.0%             78192d73-be0b-4d49-a129-9bec0770efed
> 108.5 GB   256     16.7%             9889280a-1433-439e-bb84-6b7e7f44d761
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
