cassandra-user mailing list archives

From "Michael Lee" <>
Subject RE: Re: replace a bad node through bootstrapping
Date Fri, 15 Jan 2010 03:02:08 GMT
I am Pan's colleague; allow me to clarify.

Pan's problem is:

If a node's data has been damaged, you cannot directly replace the old node with a new one
unless you run 'removetoken' first.

But (suppose node A is dead):
'removetoken' will first re-create the replicas that went missing when A died, which generates
a lot of data on other nodes, say B, C, and D.
After adding the new node and copying data from the other nodes through bootstrapping, you have
to 'cleanup' the data that 'removetoken' just generated on B, C, and D.

So B, C, and D will carry a heavy I/O load (half of it wasted) in order to repair A. In Pan's
case, it will be 5TB (and will take days...).

Pan is trying to devise a method to repair A directly through streaming, with less impact on
the other nodes.
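
To make the "half of it wasted" estimate concrete, here is a rough back-of-the-envelope sketch in Python. It assumes, purely for illustration, that 'removetoken' makes the surviving nodes copy about as much data as the dead node held, and that bootstrapping the new node then streams about the same amount again. The function name and the 1.0 TB input are hypothetical; this is not a measurement of real Cassandra behavior.

```python
def survivor_traffic_tb(node_data_tb: float) -> dict:
    """Rough data volume (TB) moved by surviving nodes under the assumptions above."""
    # Assumption: removetoken re-replicates roughly the dead node's data among survivors.
    re_replicated = node_data_tb
    # Assumption: bootstrap then streams roughly the same amount to the new node.
    streamed_to_new = node_data_tb

    with_removetoken = re_replicated + streamed_to_new
    direct = streamed_to_new
    wasted = re_replicated / with_removetoken if with_removetoken else 0.0
    return {
        "removetoken_then_bootstrap": with_removetoken,
        "direct_streaming": direct,
        "wasted_fraction": wasted,
    }


# Arbitrary example input of 1.0 TB held by the dead node.
print(survivor_traffic_tb(1.0))
```

Under these assumptions the re-replicated copy is exactly the part that 'cleanup' later discards, which is where the one-half figure comes from.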


-----Original Message-----
From: XL.Pan [] 
Sent: Friday, January 15, 2010 10:23 AM
To: cassandra-user; cassandra-user
Subject: Re: Re: replace a bad node through bootstrapping

> Range changes
> Bootstrap
> Adding new nodes is called "bootstrapping."
Do you mean that "bootstrapping" is designed for adding new nodes only?

I think the bootstrapping idea is good enough to be used for other things, for example restoring
the data on a bad node, though it would need some modification for that.

What is the difference between a new node that was NOT in the ring before and a new node that
was in the ring before?
I think there are some similarities and differences. (Call the new node N and the replacement
node R.)
* Similarities:
Neither N nor R has any usable data, so both need to copy data from the replication sources.

* Differences:
1) Writes during startup
    N has not been seen before and of course has no handoff data on other nodes. As a result,
it should accept writes routed from other nodes while it is copying data.
    R has been seen before and has handoff data on other nodes, so it need not worry about
losing data while copying from other sources; it will receive the handoff data after startup.
That means R does not need to accept writes during that time.
2) Selection of replication sources
[Ring diagram: nodes A, B, C, D]
N wants to insert itself between B and C, so N knows that it can get data from C, D, and A.
After bootstrapping, the ring will be:
[Ring diagram: nodes A, B, N, C, D, with N between B and C]
Then node N goes down and is replaced with R. Because R has seen a different ring, it will
select B and C.
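
The point that N and R see different rings can be illustrated with a toy model in Python. It assumes a replication factor of 3 and a simple placement rule in which each range is stored on its owner and the next two nodes clockwise; both are assumptions for illustration, and the function names are hypothetical rather than Cassandra's actual code. The model lists every surviving candidate per range and is not meant to reproduce the exact selection described above.

```python
def replicas_for(ring, owner, rf=3):
    """Owner plus the next rf-1 nodes clockwise (toy placement rule, assumed)."""
    i = ring.index(owner)
    return [ring[(i + k) % len(ring)] for k in range(min(rf, len(ring)))]


def sources_for_new_node(ring, successor, rf=3):
    """A node joining just before `successor` takes over part of its range,
    so the nodes already holding that range are the candidate sources."""
    return replicas_for(ring, successor, rf)


def sources_for_replacement(ring, dead, rf=3):
    """For each range the dead node replicated, list the surviving replicas."""
    i = ring.index(dead)
    candidates = {}
    for k in range(min(rf, len(ring))):
        owner = ring[(i - k) % len(ring)]
        candidates[owner] = [n for n in replicas_for(ring, owner, rf) if n != dead]
    return candidates


# N joins between B and C in the ring A, B, C, D.
print(sources_for_new_node(["A", "B", "C", "D"], "C"))

# R replaces N in the ring A, B, N, C, D.
print(sources_for_replacement(["A", "B", "N", "C", "D"], "N"))
```

In this model the joining node only needs the replicas of one existing range, while the replacement has to look at every range the dead node used to hold, so the two cases produce different source sets.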

From this comparison, I think it is possible to replace a bad node and restore its data
through bootstrapping.


From: Jonathan Ellis
Sent: 2010-01-15 00:51:57
Subject: Re: replace a bad node through bootstrapping

On Thu, Jan 14, 2010 at 6:30 AM, XL.Pan <> wrote:
> *Why not the standard bootstrap? says that bootstrap is the
preferred method for handling node replacement.  Please read how that
describes how to handle things, because your description of how
bootstrap works is very off base.

