cassandra-user mailing list archives

From Jonathan Ellis <jbel...@gmail.com>
Subject Re: Partitioning, tokens, and sequential keys
Date Tue, 16 Aug 2011 18:54:35 GMT
what tokens did you end up using?

are you sure it's actually due to different amounts of rows?  have you
run cleanup and compact to make sure it's not unused data / obsolete
replicas taking up the space?
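
For reference, a minimal sketch of how evenly spaced initial tokens are
usually computed for RandomPartitioner, plus a rough check of how sequential
keys spread across them. It assumes the RandomPartitioner token space of
0 to 2**127 and approximates the token as the absolute value of the key's
MD5 digest. The key encoding (decimal ASCII string) and the helper names
`approx_token`, `owner`, and `distribution` are illustrative assumptions,
not anything taken from this thread or from Cassandra itself.

```python
import bisect
import hashlib
from collections import Counter

RING_SIZE = 2 ** 127  # RandomPartitioner token space: 0 .. 2**127


def initial_tokens(node_count):
    """Return evenly spaced tokens, one per node."""
    return [i * RING_SIZE // node_count for i in range(node_count)]


def approx_token(key_bytes):
    """Approximate a RandomPartitioner token for a key.

    Assumption: token = abs(MD5 digest read as a signed big-endian integer).
    """
    digest = hashlib.md5(key_bytes).digest()
    return abs(int.from_bytes(digest, "big", signed=True))


def owner(token, tokens):
    """Index of the node owning `token`.

    A node owns the range (previous token, its own token], so the owner is
    the first node whose token is >= the key's token, wrapping to node 0.
    """
    return bisect.bisect_left(tokens, token) % len(tokens)


def distribution(keys, node_count):
    """Count how many keys land on each node (hypothetical helper)."""
    tokens = initial_tokens(node_count)
    counts = Counter(
        # Assumption: keys are serialized as decimal ASCII strings.
        owner(approx_token(str(k).encode("ascii")), tokens)
        for k in keys
    )
    return [counts[i] for i in range(node_count)]


if __name__ == "__main__":
    for t in initial_tokens(3):
        print(t)
    # Sequential keys 1..50,000 on a three node ring.
    print(distribution(range(1, 50001), 3))
```

Because the keys are hashed before placement, sequential keys should land
close to evenly on evenly spaced tokens. If the per-node counts from a check
like this look balanced but disk usage does not, that points toward the
tokens actually in use or toward leftover data rather than the key pattern.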

On Tue, Aug 16, 2011 at 1:41 PM, David McNelis
<dmcnelis@agentisenergy.com> wrote:
> We are currently running a three node cluster where we assigned the initial
> tokens using the Python script that is in the Wiki, and we're currently
> using the Random Partitioner, RF=1, Cassandra 0.8 from the Riptano RPM.
> However, we're seeing one node take on over 60% of the data as we load
> data.
> Our keys are sequential, and can range from 0 to 2^64, though in practice
> we're between 1 and 2,000,000,000, with the current max around 50,000. In
> order to balance out the load, would we be best served by changing our
> tokens so that the top and bottom thirds of the overloaded node's range go
> to the previous and next nodes respectively, then running nodetool move?
> Even if we do that, it would seem that we'd likely continue to run into this
> sort of issue as we add additional data... would we be better served
> with a different Partitioner strategy? Or will we need to very actively
> manage our tokens to avoid getting into an unbalanced situation?
>
> --
> David McNelis
> Lead Software Engineer
> Agentis Energy
> www.agentisenergy.com
> o: 630.359.6395
> c: 219.384.5143
> A Smart Grid technology company focused on helping consumers of energy
> control an often under-managed resource.
>
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com
