cassandra-user mailing list archives

From Manu Zhang <owenzhang1...@gmail.com>
Subject Re: how RandomPartitioner calculate tokens
Date Wed, 30 Jan 2013 10:10:15 GMT
On Wed 30 Jan 2013 05:47:59 PM CST, Sylvain Lebresne wrote:
> I'll admit that this part of the DataStax documentation is a bit
> confusing (and I'll reach out to the doc writers to make sure this is
> improved).
>
> The partitioner (be it RandomPartitioner, Murmur3Partitioner or
> OrderPreservingPartitioner) is pretty much only a hash function that
> defines how to compute the token (its hash) of a key. In particular, the
> partitioner has no notion whatsoever of data centers and, more generally,
> does not depend in any way on how many nodes you have.
>
> However, to actually distribute data, each node is assigned a token (or
> multiple ones with "vnodes"). Getting an even distribution of data
> depends on the exact tokens picked for your nodes.
>
> Now, the sentences of the doc you cite actually refer to how to
> calculate the tokens you assign to nodes. In particular, what they
> describe is pretty much what the small token-generator tool that comes
> with Cassandra (http://goo.gl/rwea9) does, but it is not something
> Cassandra itself actually does.
>
> Also, that procedure to compute tokens is pretty much the same for
> RandomPartitioner and Murmur3Partitioner, except that the token range
> is not exactly the same for the two partitioners. And as a side note, if
> you use vnodes, you don't really have to bother about manually assigning
> tokens to nodes.
>
> --
> Sylvain
>
>
> On Wed, Jan 30, 2013 at 9:22 AM, Manu Zhang <owenzhang1990@gmail.com> wrote:
>
>     Hi,
>
>     As per the DataStax Cassandra Documentation 1.2,
>
>     "for single data center deployments, tokens are calculated by
>     dividing the hash range by the number of nodes in the cluster".
>     Does it mean we have to recalculate the tokens of keys when nodes
>     come and go?
>
>     "for multiple data center deployments, tokens are calculated per
>     data center so that the hash range is evenly divided for the nodes
>     in each data center." This is understandable, but when I go to
>     the getToken method of RandomPartitioner, I can't find any
>     datacenter-aware token calculation code.
>
>     By the way, the documentation doesn't mention how
>     Murmur3Partitioner calculates tokens for multiple data centers.
>     Assuming it doesn't calculate tokens per data center, what
>     difference between Murmur3Partitioner and RandomPartitioner
>     makes that unnecessary?
>
>     Thanks.
>
>     Manu Zhang

Thanks Sylvain, it's all clear now.
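
To make the distinction above concrete, here is a rough Python sketch of the two separate computations: the token of a key (what the partitioner does) and the evenly spaced tokens you assign to nodes (what the token-generator tool does). It is not Cassandra's own code. The function names are made up for illustration, and it assumes a token range of 0 to 2**127 for RandomPartitioner (an MD5 digest read as a signed integer, then made non-negative) and -2**63 to 2**63 - 1 for Murmur3Partitioner. It only covers the single data center case.

```python
import hashlib

# Assumed token ranges (lowest token, number of tokens in the ring).
RANDOM_MIN, RANDOM_SIZE = 0, 2**127
MURMUR3_MIN, MURMUR3_SIZE = -(2**63), 2**64


def random_partitioner_token(key):
    """Token of a key, RandomPartitioner style: MD5 of the key bytes,
    read as a signed big-endian integer, then its absolute value.
    Note that the number of nodes never appears here."""
    digest = hashlib.md5(key).digest()
    return abs(int.from_bytes(digest, byteorder="big", signed=True))


def node_tokens(node_count, range_min, range_size):
    """Evenly spaced tokens to assign to nodes: divide the hash range
    by the number of nodes (hypothetical helper, single data center)."""
    if node_count <= 0:
        raise ValueError("node_count must be positive")
    return [range_min + (i * range_size) // node_count
            for i in range(node_count)]


if __name__ == "__main__":
    print(random_partitioner_token(b"some-row-key"))
    print(node_tokens(4, RANDOM_MIN, RANDOM_SIZE))
    print(node_tokens(4, MURMUR3_MIN, MURMUR3_SIZE))
```

Changing node_count changes only the list returned by node_tokens; random_partitioner_token gives the same result for a given key no matter how many nodes there are, which is why tokens of keys never have to be recalculated when nodes come and go.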
