incubator-cassandra-user mailing list archives

From Sylvain Lebresne <sylv...@yakaz.com>
Subject Re: OrderPreservingPartitioner and manual token assignment
Date Tue, 22 Jun 2010 12:34:46 GMT
2010/6/22 Maxim Kramarenko <maximkr@trackstudio.com>:
> Hello!
>
> I use OrderPreservingPartitioner and assign tokens manually.
>
> Questions are:
>
> 1) Why range sorted in alphabetical order, not numeric order ?
> It was ok with RandomPartitioner

With RandomPartitioner, tokens are MD5 hashes, i.e. numbers, so two
tokens are compared numerically.

With OrderPreservingPartitioner, tokens are the keys themselves, that is
to say Strings, and the comparison is (utf8) String comparison (hence the
alphabetic sorting). Note that as such, when switching from RP to OPP,
you almost certainly don't want to keep the same tokens, as they represent
very different things (MD5 hashes vs. string keys).
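
To make the difference concrete, here is a small Python sketch (not
Cassandra code) that sorts the tokens from your ring both ways. It assumes
each token is its prefix followed by 36 zeros, as the listing appears to
show, and it uses Python's built-in string comparison, which for plain
ASCII digits gives the same order as a utf8 comparison.

```python
# Tokens from the ring output, kept as strings the way OPP treats them.
# Assumption: each token is its prefix followed by 36 zeros.
tokens = [prefix + "0" * 36 for prefix in ("28", "56", "84", "112", "142")]
tokens.append("0")

# RandomPartitioner-style ordering: compare the tokens as numbers.
numeric_order = sorted(tokens, key=int)

# OrderPreservingPartitioner-style ordering: compare the tokens as strings.
string_order = sorted(tokens)

for label, ordering in (("numeric", numeric_order), ("string", string_order)):
    # Show only the significant prefix of each token to keep the output short.
    print(label, [t.rstrip("0") or "0" for t in ordering])
```

The string ordering puts 112... and 142... before 28..., which is the
order your ring shows.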

>
> Address       Status     Load          Range                                      Ring
>                                        84000000000000000000000000000000000000
> 172.19.0.35   Up         2.47 GB       0                                          |<--|
> 172.19.0.31   Up         1.85 GB       112000000000000000000000000000000000000    |   ^
> 172.19.0.33   Up         1.46 GB       142000000000000000000000000000000000000    v   |
> 172.19.0.30   Up         1.44 GB       28000000000000000000000000000000000000     |   ^
> 172.19.0.32   Up         2.63 GB       56000000000000000000000000000000000000     v   |
> 172.19.0.34   Up         3.29 GB       84000000000000000000000000000000000000     |-->|
>
> 2) what is the token range ? For example, all our keys starts with customer
> number (a few digits), but number is only small part of ASCII table.
>
> What is the best way to assign tokens manually when using
> OrderPreservingPartitioner ?

The first thing is to find (most probably, to estimate) the domain and
distribution of the keys you will use. Note that this is really the hard
part: most of the time you can only guess what the distribution will be,
and most of the time you will be wrong anyway and get bad load balancing.
But once you know it, you just assign as tokens the particular keys that
split this distribution as evenly as possible (where "split" is with
respect to (utf8) string comparison).
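
As a rough illustration of that last step, here is a Python sketch (not
Cassandra code). It assumes you can collect a representative sample of
your keys. The function name pick_tokens and the sample keys are made up
for the example. It sorts the sample as strings and takes the key at the
upper end of each equal-sized chunk as a token.

```python
def pick_tokens(sample_keys, node_count):
    """Return node_count tokens that split the sampled keys into
    roughly equal chunks under string ordering (hypothetical helper)."""
    ordered = sorted(set(sample_keys))  # string comparison, as with OPP
    if node_count < 1 or node_count > len(ordered):
        raise ValueError("need between 1 and len(distinct keys) nodes")
    n = len(ordered)
    # Token i is the last key of the i-th chunk, so each node's range
    # (previous token, token] holds about n / node_count sampled keys.
    return [ordered[(i * n) // node_count - 1] for i in range(1, node_count + 1)]


# Made-up sample: keys that are customer numbers 1..12, stored as strings.
sample = [str(i) for i in range(1, 13)]
tokens = pick_tokens(sample, 3)
print(tokens)
```

The result is only as good as the sample: if the real keys are
distributed differently, the load will still end up unbalanced.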

--
Sylvain

>
> --
> Best regards,
>  Maxim                            mailto:maximkr@trackstudio.com
>
> LinkedIn Profile: http://www.linkedin.com/in/maximkr
> Google Talk/Jabber: maximkr@gmail.com
> ICQ number: 307863079
> Skype Chat: maxim.kramarenko
> Yahoo! Messenger: maxim_kramarenko
>
