The other nodes all hold copies of the same data. To optimize performance, each of them streams a different part of the data, even though 102 has all the data that 108 needs. (I think. I'm not an expert.) -Brennan
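
A rough sketch of why several nodes can act as sources. It assumes SimpleStrategy-style placement (each range lives on its primary owner plus the next nodes clockwise) and a replication factor of 3. Neither is stated in this thread, so RF and the helper names below are illustrative assumptions. The sketch only lists which existing nodes hold each range that 108 will take on; it does not model how Cassandra picks one source per range.

```python
# Sketch only: SimpleStrategy-style placement with an ASSUMED replication factor.
RF = 3  # assumption, not given in the thread

# Existing ring from the thread, in token order (last octet of each address).
ring = ["101", "102", "103", "104", "105", "106", "107"]

def replicas(ring, primary_index, rf):
    """Primary owner of a range plus the next rf - 1 nodes clockwise."""
    return [ring[(primary_index + k) % len(ring)] for k in range(rf)]

def sources_for_new_node(ring, successor_index, rf):
    """For a node inserted just before ring[successor_index], map each range it
    must receive (named by its current primary owner) to the nodes holding it."""
    needed = {}
    for back in range(rf):
        primary = (successor_index - back) % len(ring)
        needed[ring[primary]] = replicas(ring, primary, rf)
    return needed

# 108 is inserted between 101 and 102, so its successor is 102.
needed = sources_for_new_node(ring, ring.index("102"), RF)
for primary, holders in needed.items():
    print("range owned by", primary, "is held by", holders)

# Nodes that hold every range 108 needs.
in_all = set.intersection(*(set(h) for h in needed.values()))
print("holds everything:", sorted(in_all))
```

Under these assumptions 102 is the only node holding all three ranges, while 107, 103 and 104 each hold at least one of them, which is consistent with the netstats output quoted below.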


On Thu, Nov 1, 2012 at 9:31 AM, Ramesh Natarajan <ramesh25@gmail.com> wrote:
I am trying to grow a Cassandra 1.0.10 cluster from 7 nodes to 14 nodes by bootstrapping new nodes.

My seed nodes are 101, 102, 103 and 104.

Here is my initial ring

Address         DC          Rack        Status State   Load            Owns    Token                                       
                                                                               145835300108973627198589117470757804908     
192.168.1.101   datacenter1 rack1       Up     Normal  8.16 GB         14.29%  0                                           
192.168.1.102   datacenter1 rack1       Up     Normal  8.68 GB         14.29%  24305883351495604533098186245126300818      
192.168.1.103   datacenter1 rack1       Up     Normal  8.45 GB         14.29%  48611766702991209066196372490252601636      
192.168.1.104   datacenter1 rack1       Up     Normal  8.16 GB         14.29%  72917650054486813599294558735378902454      
192.168.1.105   datacenter1 rack1       Up     Normal  8.33 GB         14.29%  97223533405982418132392744980505203272      
192.168.1.106   datacenter1 rack1       Up     Normal  8.71 GB         14.29%  121529416757478022665490931225631504090     
192.168.1.107   datacenter1 rack1       Up     Normal  8.41 GB         14.29%  145835300108973627198589117470757804908  

I added a new node, 108, with an initial_token between those of 101 and 102. After I start bootstrapping, I see the node is placed in the correct position in the ring:

Address         DC          Rack        Status State   Load            Owns    Token                                       
                                                                               145835300108973627198589117470757804908     
192.168.1.101   datacenter1 rack1       Up     Normal  8.16 GB         14.29%  0                                           
192.168.1.108   datacenter1 rack1       Up     Joining 114.61 KB       7.14%   12152941675747802266549093122563150409      
192.168.1.102   datacenter1 rack1       Up     Normal  8.68 GB         7.14%   24305883351495604533098186245126300818      
192.168.1.103   datacenter1 rack1       Up     Normal  8.4 GB          14.29%  48611766702991209066196372490252601636      
192.168.1.104   datacenter1 rack1       Up     Normal  8.15 GB         14.29%  72917650054486813599294558735378902454      
192.168.1.105   datacenter1 rack1       Up     Normal  8.33 GB         14.29%  97223533405982418132392744980505203272      
192.168.1.106   datacenter1 rack1       Up     Normal  8.71 GB         14.29%  121529416757478022665490931225631504090     
192.168.1.107   datacenter1 rack1       Up     Normal  8.41 GB         14.29%  145835300108973627198589117470757804908   
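
For reference, the token given to 108 is the midpoint between the tokens of 101 and 102. A minimal sketch of how evenly spaced tokens for 7 and 14 nodes could be computed, assuming the RandomPartitioner token space of 0 to 2**127 and integer division for the step (this matches the values in the ring output above; the function name is just for illustration):

```python
# Sketch: evenly spaced tokens, assuming a token space of 0 .. 2**127.
RING_SIZE = 2 ** 127

def even_tokens(node_count):
    """Return node_count evenly spaced tokens using an integer step."""
    step = RING_SIZE // node_count
    return [i * step for i in range(node_count)]

old_tokens = even_tokens(7)    # the seven existing nodes
new_tokens = even_tokens(14)   # target layout after doubling

# Tokens for the seven additional nodes: each is a midpoint between two old tokens.
added = [t for t in new_tokens if t not in old_tokens]
for t in added:
    print(t)
```

With these assumptions every existing token is kept, and the first added token is the one used for 108 above.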

What puzzles me is that when I look at netstats, I see nodes 107, 104 and 103 streaming data to 108. Can someone explain why this happens? I was under the impression that only node 102 needs to split its token range and send data to 108. Am I missing something?


Streaming from: /192.168.1.107
Streaming from: /192.168.1.104
Streaming from: /192.168.1.103


Thanks
Ramesh