cassandra-commits mailing list archives

From "Kenneth Brotman (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (CASSANDRA-14265) Add explanation of vNodes to online documentation
Date Thu, 01 Mar 2018 13:09:00 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-14265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16381968#comment-16381968 ]

Kenneth Brotman edited comment on CASSANDRA-14265 at 3/1/18 1:08 PM:
---------------------------------------------------------------------

From the Dynamo paper:

The basic consistent hashing algorithm presents some challenges. First, the random position
assignment of each node on the ring leads to non-uniform data and load distribution. Second,
the basic algorithm is oblivious to the heterogeneity in the performance of nodes. To address
these issues, Dynamo uses a variant of consistent hashing (similar to the one used in [10,
20]): instead of mapping a node to a single point in the circle, each node gets assigned to
multiple points in the ring. To this end, Dynamo uses the concept of “virtual nodes”.
A virtual node looks like a single node in the system, but each node can be responsible for
more than one virtual node. Effectively, when a new node is added to the system, it is assigned
multiple positions (henceforth, “tokens”) in the ring. 
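
To make the token idea concrete, here is a minimal sketch of a ring in which each
physical node claims several tokens. It is an illustration under stated assumptions, not
Dynamo's or Cassandra's implementation: the Ring class, the token_for helper, the
"node#i" labelling scheme and the use of MD5 as the hash are all hypothetical choices made
for this example. The num_tokens parameter is named after the cassandra.yaml setting of
the same name, but the code does not read any configuration.

```python
import bisect
import hashlib


def token_for(label: str) -> int:
    """Map a string to a ring position. MD5 is only a stand-in hash for this sketch."""
    return int.from_bytes(hashlib.md5(label.encode("utf-8")).digest(), "big")


class Ring:
    """Hypothetical consistent-hashing ring with multiple tokens per physical node."""

    def __init__(self) -> None:
        self._tokens = []  # sorted ring positions
        self._owner = {}   # token -> physical node name

    def add_node(self, node: str, num_tokens: int) -> None:
        # Each physical node is placed at num_tokens pseudo-random positions.
        for i in range(num_tokens):
            token = token_for(f"{node}#{i}")
            self._owner[token] = node
            bisect.insort(self._tokens, token)

    def lookup(self, key: str) -> str:
        # A key belongs to the first token at or after its hash, wrapping around.
        if not self._tokens:
            raise ValueError("ring is empty")
        idx = bisect.bisect_left(self._tokens, token_for(key)) % len(self._tokens)
        return self._owner[self._tokens[idx]]


ring = Ring()
for name in ("node-a", "node-b", "node-c"):
    ring.add_node(name, num_tokens=8)

print(ring.lookup("user:42"))  # prints one of node-a, node-b, node-c
```

With a single token per node the ring has only three ranges, whose sizes depend entirely
on where three hashes happen to land. With eight tokens per node there are 24 smaller
ranges, so each node's total share tends to be closer to one third.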

Using virtual nodes has the following advantages:

• If a node becomes unavailable (due to failures or routine maintenance), the load handled
by this node is evenly dispersed across the remaining available nodes.

• When a node becomes available again, or a new node is added to the system, the newly available
node accepts a roughly equivalent amount of load from each of the other available nodes. 

• The number of virtual nodes that a node is responsible for can be decided based on its
capacity, accounting for heterogeneity in the physical infrastructure.
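
The advantages listed above can be checked with a small simulation. This is a sketch
under assumptions, not a benchmark: build_ring, assign, the node names, the token counts
and the MD5-based token_for helper are all hypothetical and exist only for this example.
It removes one node from a four-node ring to see where its keys go, then builds a
two-node ring in which one node is given three times as many tokens as the other.

```python
import bisect
import hashlib
from collections import Counter


def token_for(label: str) -> int:
    """Map a string to a ring position. MD5 is only a stand-in hash for this sketch."""
    return int.from_bytes(hashlib.md5(label.encode("utf-8")).digest(), "big")


def build_ring(tokens_per_node: dict) -> list:
    """Return a sorted list of (token, node) pairs, one pair per virtual node."""
    return sorted(
        (token_for(f"{node}#{i}"), node)
        for node, count in tokens_per_node.items()
        for i in range(count)
    )


def assign(ring: list, keys: list) -> dict:
    """Map each key to the node owning the first token at or after the key's hash."""
    tokens = [token for token, _ in ring]
    owners = {}
    for key in keys:
        idx = bisect.bisect_left(tokens, token_for(key)) % len(ring)
        owners[key] = ring[idx][1]
    return owners


keys = [f"key-{i}" for i in range(10000)]

# Node "d" leaves: only the keys it owned should change hands.
before = assign(build_ring({"a": 64, "b": 64, "c": 64, "d": 64}), keys)
after = assign(build_ring({"a": 64, "b": 64, "c": 64}), keys)
moved = [k for k in keys if before[k] != after[k]]
inherited = Counter(after[k] for k in moved)
print("keys moved:", len(moved), "picked up by:", dict(inherited))

# Heterogeneous capacity: "large" is given three times the tokens of "small".
load = Counter(assign(build_ring({"small": 32, "large": 96}), keys).values())
print("load by node:", dict(load))
```

Because every token of the departed node hands its range to whichever token follows it on
the ring, the moved keys are spread over the remaining nodes instead of landing on a
single neighbour, and keys owned by the other nodes stay where they were.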


was (Author: kenbrotman):
From the Dynamo paper:

The basic consistent hashing algorithm presents some challenges. First, the random position
assignment of each node on the ring leads to non-uniform data and load distribution. Second,
the basic algorithm is oblivious to the heterogeneity in the performance of nodes. To address
these issues, Dynamo uses a variant of consistent hashing (similar to the one used in [10,
20]): instead of mapping a node to a single point in the circle, each node gets assigned to
multiple points in the ring. To this end, Dynamo uses the concept of “virtual nodes”.
A virtual node looks like a single node in the system, but each node can be responsible for
more than one virtual node. Effectively, when a new node is added to the system, it is assigned
multiple positions (henceforth, “tokens”) in the ring. The process of fine-tuning Dynamo’s
partitioning scheme is discussed in Section 6. Using virtual nodes has the following advantages:
• If a node becomes unavailable (due to failures or routine maintenance), the load handled
by this node is evenly dispersed across the remaining available nodes. • When a node becomes
available again, or a new node is added to the system, the newly available node accepts a
roughly equivalent amount of load from each of the other available nodes.  • The number
of virtual nodes that a node is responsible for can be decided based on its capacity, accounting
for heterogeneity in the physical infrastructure.

> Add explanation of vNodes to online documentation
> -------------------------------------------------
>
>                 Key: CASSANDRA-14265
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14265
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Documentation and Website
>            Reporter: Kenneth Brotman
>            Priority: Major
>
> There are a lot of inquiries on the mailing list about how vNodes work and how to set the
> configuration properly.  We should add an explanation to the documentation.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@cassandra.apache.org
For additional commands, e-mail: commits-help@cassandra.apache.org
