incubator-cassandra-user mailing list archives

From Jonathan Ellis <jbel...@gmail.com>
Subject Re: Partitioning, tokens, and sequential keys
Date Tue, 16 Aug 2011 20:33:20 GMT
Yes, that looks about right.

Totally baffled how the wiki script could spit out those tokens for a
3-node cluster.
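[As a quick sanity check, here is a sketch (not official tooling) of the standard even-spacing rule for RandomPartitioner, token_i = i * 2**127 / N, applied to a 3-node cluster and compared with the tokens quoted below. The variable and function names are illustrative, not from the wiki script itself:]

```python
# Sketch: recompute the canonical evenly spaced RandomPartitioner tokens
# for a 3-node cluster and compare them with the tokens the cluster
# actually ended up with.

RING_SIZE = 2 ** 127  # RandomPartitioner token space is [0, 2**127)

def balanced_tokens(n):
    """Evenly spaced initial tokens for an n-node cluster: i * 2**127 / n."""
    return [i * RING_SIZE // n for i in range(n)]

actual = [
    56713727820156410577229101238628035242,
    61396109050359754194262152792166260437,
    113427455640312821154458202477256070485,
]

expected = balanced_tokens(3)
# Two of the three actual tokens already sit on the canonical positions;
# only the 61396...437 node is off, so moving it to token 0 balances the ring.
misplaced = set(actual) - set(expected)
print(misplaced)
```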

On Tue, Aug 16, 2011 at 2:04 PM, David McNelis
<dmcnelis@agentisenergy.com> wrote:
> Currently we have the initial_token for the seed node blank, and then the
> three tokens we ended up with are:
> 56713727820156410577229101238628035242
> 61396109050359754194262152792166260437
> 113427455640312821154458202477256070485
> I would assume that we'd want to take the node that
> is 61396109050359754194262152792166260437 and move it to 0, yes?
> In theory that should largely balance out our data... or am I missing
> something there?
> On Tue, Aug 16, 2011 at 1:54 PM, Jonathan Ellis <jbellis@gmail.com> wrote:
>>
>> what tokens did you end up using?
>>
>> are you sure it's actually due to different amounts of rows?  have you
>> run cleanup and compact to make sure it's not unused data / obsolete
>> replicas taking up the space?
>>
>> On Tue, Aug 16, 2011 at 1:41 PM, David McNelis
>> <dmcnelis@agentisenergy.com> wrote:
>> > We are currently running a three node cluster where we assigned the
>> > initial
>> > tokens using the Python script that is in the Wiki, and we're currently
>> > using the Random Partitioner, RF=1, Cassandra 0.8 from the Riptano RPM.
>> > However, we're seeing one node taking on over 60% of the data as we
>> > load data.
>> > Our keys are sequential, and can range from 0 to 2^64, though in
>> > practice
>> > we're between 1 and 2,000,000,000, with the current max around 50,000. In
>> > order to balance out the load, would we be best served changing our
>> > tokens to make the top and bottom 1/3rd of the node go to the previous
>> > and next nodes respectively, then running nodetool move?
>> > Even if we do that, it would seem that we'd likely continue to run into
>> > this sort of issue as we add additional data... would we be better
>> > served with a different Partitioner strategy?  Or will we need to very
>> > actively manage our tokens to avoid getting into an unbalanced situation?
>> >
>> > --
>> > David McNelis
>> > Lead Software Engineer
>> > Agentis Energy
>> > www.agentisenergy.com
>> > o: 630.359.6395
>> > c: 219.384.5143
>> > A Smart Grid technology company focused on helping consumers of energy
>> > control an often under-managed resource.
>> >
>> >
>>
>>
>>
>> --
>> Jonathan Ellis
>> Project Chair, Apache Cassandra
>> co-founder of DataStax, the source for professional Cassandra support
>> http://www.datastax.com
>
>
>
> --
> David McNelis
> Lead Software Engineer
> Agentis Energy
> www.agentisenergy.com
> o: 630.359.6395
> c: 219.384.5143
> A Smart Grid technology company focused on helping consumers of energy
> control an often under-managed resource.
>
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com
