cassandra-user mailing list archives

From Anthony Molinaro <antho...@alumni.caltech.edu>
Subject Re: Clarification on Ring operations in Cassandra 0.5.1
Date Mon, 19 Apr 2010 23:48:23 GMT

On Mon, Apr 19, 2010 at 03:28:26PM -0500, Jonathan Ellis wrote:
> > Can I then 'nodeprobe move <token for range I want to take over>', and
> > achieve the same as step 2 above?
> 
> You can't have two nodes with the same token in the ring at once.  So,
> you can removetoken the old node first, then bootstrap the new one
> (just specify InitialToken in the config to avoid having it guess
> one), or you can make it a 3 step process (bootstrap, remove, move) to
> avoid transferring so much data around.

I'm still a little fuzzy on why less data moves in your 3 step case,
but let me run through the two scenarios and see where we get.  Please
correct me if I'm wrong on any point.

Let's say I have 3 nodes with the random partitioner and the rack unaware
strategy, which means I have something like

Node  Size   Token  KeyRange (self + next in ring)
----  ----   -----  ------------------------------
A     5 G      33    1 -> 66
B     6 G      66       34 -> 0
C     2 G       0          67 -> 33
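
A minimal sketch of how the KeyRange column above can be derived, assuming a
toy token space of 0..99 and the "self + next in ring" model used in the
table. The names RING_SIZE, primary_ranges and held_ranges are hypothetical
helpers for illustration only, not Cassandra APIs, and this is not a
statement of how Cassandra itself places replicas.

```python
RING_SIZE = 100  # assumed toy token space 0..99, chosen to fit tokens 0/33/66


def primary_ranges(tokens):
    """Map node -> (start, end): a node with token t owns (previous token + 1) .. t."""
    ordered = sorted(tokens.items(), key=lambda kv: kv[1])
    ranges = {}
    for i, (node, token) in enumerate(ordered):
        prev_token = ordered[i - 1][1]  # index -1 wraps to the last node
        ranges[node] = ((prev_token + 1) % RING_SIZE, token)
    return ranges


def held_ranges(tokens):
    """Map node -> (start, end) of its own range plus the next node's range."""
    ordered = sorted(tokens, key=tokens.get)
    prim = primary_ranges(tokens)
    held = {}
    for i, node in enumerate(ordered):
        nxt = ordered[(i + 1) % len(ordered)]  # next node clockwise in the ring
        held[node] = (prim[node][0], prim[nxt][1])
    return held


ring = {"A": 33, "B": 66, "C": 0}
for node, (start, end) in sorted(held_ranges(ring).items()):
    print(node, start, "->", end)
```

With the three tokens from the table this prints 1 -> 66 for A, 34 -> 0 for B
and 67 -> 33 for C.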

Now let's say Node B is giving us some problems, so we want to replace it
with another node D.

We've outlined 2 processes.

In the first process you recommend

1. removetoken on node B
2. wait for data to move
3. add InitialToken of 66 and AutoBootstrap = true to node D storage-conf.xml
   then start it
4. wait for data to move

So when you do the removetoken, this will cause the following transfers
at stage 2
  Node A sends 34->66 to Node C
  Node C sends 67->0  to Node A
at stage 4
  Node A sends 34->66 to Node D
  Node C sends 67->0  to Node D

In the second process I assume you pick a token really close to another token?

1. add InitialToken of 34 and AutoBootstrap = true to node D storage-conf.xml
   then start it
2. wait for data to move
3. removetoken on node B
4. wait for data to move
5. nodeprobe move on node D to token 66
6. wait for data to move

This results in the following moves
at stage 2
  Node A/B sends 33->34 to Node D (primary token range)
  Node B sends 34->66 to Node D   (replica range)
at stage 4
  Node C sends 66->0 to Node D (replica range)
at stage 6
  No data movement as D already had 33->0

So it seems like you move all the data twice in process 1 and only a small
portion twice in process 2 (which is what you said, so hopefully I've
outlined correctly what is happening).  Does all that sound right?
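
A rough tally of the transfers listed for the two processes, as a sketch
only. It assumes a toy token space of 0..99, keys spread evenly over it, and
ranges taken inclusively exactly as written above; width and the two
transfer lists are hypothetical names for illustration, and the totals are
in token units rather than bytes.

```python
RING_SIZE = 100  # assumed toy token space 0..99


def width(start, end):
    """Number of tokens in the inclusive range start -> end, wrapping past 99."""
    return (end - start) % RING_SIZE + 1


# (sender, receiver, start, end) copied from the stages listed for each process
process_1 = [
    ("A", "C", 34, 66), ("C", "A", 67, 0),   # stage 2: after removetoken on B
    ("A", "D", 34, 66), ("C", "D", 67, 0),   # stage 4: D bootstraps at 66
]
process_2 = [
    ("A/B", "D", 33, 34), ("B", "D", 34, 66),  # stage 2: D bootstraps at 34
    ("C", "D", 66, 0),                         # stage 4: after removetoken on B
    # stage 6: no data movement listed for the move to 66
]

total_1 = sum(width(s, e) for _, _, s, e in process_1)
total_2 = sum(width(s, e) for _, _, s, e in process_2)
print("process 1:", total_1, "token units")  # 134
print("process 2:", total_2, "token units")  # 70
```

Under these assumptions process 1 ships each of the two ranges twice, while
process 2 ships each range to node D roughly once.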

Once I've run bootstrap with the InitialToken value set in the config, is
it then ignored on subsequent restarts?  If so, can I just remove it
after that first time?

Thanks,

-Anthony

-- 
------------------------------------------------------------------------
Anthony Molinaro                           <anthonym@alumni.caltech.edu>
