cassandra-user mailing list archives

From David McNelis <dmcne...@agentisenergy.com>
Subject Re: Partitioning, tokens, and sequential keys
Date Tue, 16 Aug 2011 19:04:28 GMT
Currently we have the initial_token for the seed node blank, and the
three tokens we ended up with are:
56713727820156410577229101238628035242
61396109050359754194262152792166260437
113427455640312821154458202477256070485

I would assume that we'd want to take the node that
is 61396109050359754194262152792166260437 and move it to 0, yes?

In theory that should largely balance out our data... or am I missing
something there?
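
As a sanity check on that reasoning, here is a minimal Python sketch. It assumes
the RandomPartitioner token space is 0 to 2^127 and that each node owns the range
from the previous token (exclusive) up to its own token (inclusive), wrapping
around the ring. The function names balanced_tokens and ownership are
hypothetical helpers written for this illustration, not part of Cassandra or of
the Wiki script.

```python
RING_SIZE = 2 ** 127  # assumed RandomPartitioner token space: 0 .. 2**127


def balanced_tokens(node_count):
    """Return evenly spaced initial_token values for node_count nodes."""
    return [i * RING_SIZE // node_count for i in range(node_count)]


def ownership(tokens):
    """Map each token to the fraction of the ring owned by its node.

    A node owns (previous_token, own_token], wrapping around the ring.
    """
    ordered = sorted(tokens)
    shares = {}
    for index, token in enumerate(ordered):
        previous = ordered[index - 1]  # index 0 wraps around to the last token
        # A single-node ring has width 0 after the modulo, so treat it as full.
        width = (token - previous) % RING_SIZE or RING_SIZE
        shares[token] = width / RING_SIZE
    return shares


current = [
    56713727820156410577229101238628035242,
    61396109050359754194262152792166260437,
    113427455640312821154458202477256070485,
]

for label, tokens in (("current", current), ("balanced", balanced_tokens(3))):
    print(label)
    for token, share in ownership(tokens).items():
        print(f"  {token}: {share:.2%}")
```

Under those assumptions, the node at 56713727820156410577229101238628035242
covers roughly two thirds of the ring with the current tokens, while the node at
61396109050359754194262152792166260437 covers only a few percent. The balanced
set for three nodes comes out as 0 plus the other two tokens already in use, so
moving the 61396109050359754194262152792166260437 node to 0 would give each node
about one third.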

On Tue, Aug 16, 2011 at 1:54 PM, Jonathan Ellis <jbellis@gmail.com> wrote:

> what tokens did you end up using?
>
> are you sure it's actually due to different amounts of rows?  have you
> run cleanup and compact to make sure it's not unused data / obsolete
> replicas taking up the space?
>
> On Tue, Aug 16, 2011 at 1:41 PM, David McNelis
> <dmcnelis@agentisenergy.com> wrote:
> > We are currently running a three node cluster where we assigned the initial
> > tokens using the Python script that is in the Wiki, and we're currently
> > using the Random Partitioner, RF=1, Cassandra 0.8 from the Riptano RPM
> > ....however we're seeing one node taking on over 60% of the data as we load
> > data.
> > Our keys are sequential, and can range from 0 to 2^64, though in practice
> > we're between 1 and 2,000,000,000, with the current max around 50,000. In
> > order to balance out the load would we be best served changing our tokens
> > to make the top and bottom 1/3rd of the node go to the previous and next
> > nodes respectively, then running nodetool move?
> > Even if we do that, it would seem that we'd likely continue to run into this
> > sort of issue as we add additional data... would we be better served
> > with a different Partitioner strategy?  Or will we need to very actively
> > manage our tokens to avoid getting into an unbalanced situation?
> >
> > --
> > David McNelis
> > Lead Software Engineer
> > Agentis Energy
> > www.agentisenergy.com
> > o: 630.359.6395
> > c: 219.384.5143
> > A Smart Grid technology company focused on helping consumers of energy
> > control an often under-managed resource.
> >
> >
>
>
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of DataStax, the source for professional Cassandra support
> http://www.datastax.com
>



-- 
*David McNelis*
Lead Software Engineer
Agentis Energy
www.agentisenergy.com
o: 630.359.6395
c: 219.384.5143

*A Smart Grid technology company focused on helping consumers of energy
control an often under-managed resource.*
