On Wed, Feb 5, 2014 at 11:00 AM, Keith Wright <kwright@nanigans.com> wrote:
    Earlier today I emailed about issues we’re having bootstrapping nodes into our existing cluster.  One theory we have is that our nodes are simply too large, and we are considering moving to more, smaller nodes.  However, because we cannot bootstrap, that is difficult.  As I see it, we have two options (assuming the new cluster is already set up and running):

First, the problems you describe seem unusual. There are other people with 1T node sizes who are able to add and remove nodes from their clusters.

Streaming is fragile, especially before the fixes in 1.2 and the wholesale rewrite in 2.0. But it is rare for streaming to be so fragile that bootstrap never succeeds. If I were you, I would spend more effort trying to understand why you are in this somewhat unusual case before taking the extreme step of resizing your nodes.

The rebuild operation is in fact effectively the same as bootstrap. Repair of an empty node is also similar, in that it will stream a large set of SSTables, and that streaming could hang.

SSTableLoader... also uses streaming. Why would SSTableLoader, streaming to your new cluster, be less likely to hang a stream than bootstrap is in your current cluster?

Depending on the migration in question, you could try the "copy-the-sstables-and-then-cleanup" method described here:

http://www.palominodb.com/blog/2012/09/25/bulk-loading-options-cassandra

Other than not using SSTableLoader, it is effectively the dual writes solution you propose.
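
As a rough illustration of the copy-and-cleanup idea, here is a minimal planning sketch in Python. It is my reading of the method's name, not code from the linked post. It assumes 1.2-style SSTable component names of the form <keyspace>-<table>-<version>-<generation>-<component>, and the function names plan_renames and post_copy_commands are hypothetical. Because SSTables from different source nodes can share a generation number, the sketch first assigns each source SSTable a unique generation so files do not overwrite each other in the target directory, then lists the nodetool refresh and nodetool cleanup calls to run on the target node once the files are in place. It only builds a plan; it does not copy anything or run any command.

```python
import re

# Assumed 1.2-era naming: <keyspace>-<table>-<version>-<generation>-<component>
# e.g. ks1-users-ic-5-Data.db. Temporary files (…-tmp-ic-…) do not match.
SSTABLE_RE = re.compile(
    r"^(?P<ks>\w+)-(?P<cf>\w+)-(?P<ver>[a-z]+)-(?P<gen>\d+)-(?P<comp>.+)$"
)


def plan_renames(files_by_node, start_generation=1):
    """Return {(node, old_name): new_name} with one unique generation
    per source SSTable, so copies from many nodes cannot collide.

    start_generation should be higher than any generation already
    present in the target directory (the caller must check this).
    """
    next_gen = start_generation
    assigned = {}  # (node, keyspace, table, old generation) -> new generation
    plan = {}
    for node in sorted(files_by_node):
        for name in sorted(files_by_node[node]):
            m = SSTABLE_RE.match(name)
            if m is None:
                continue  # skip temporary or unrecognised files
            key = (node, m.group("ks"), m.group("cf"), int(m.group("gen")))
            if key not in assigned:
                assigned[key] = next_gen
                next_gen += 1
            # All components of one SSTable keep the same new generation.
            plan[(node, name)] = "{0}-{1}-{2}-{3}-{4}".format(
                m.group("ks"), m.group("cf"), m.group("ver"),
                assigned[key], m.group("comp"),
            )
    return plan


def post_copy_commands(host, keyspace, table):
    """Commands to run after the renamed files are in the table's data
    directory: load them, then drop data the node does not own."""
    return [
        ["nodetool", "-h", host, "refresh", keyspace, table],
        ["nodetool", "-h", host, "cleanup", keyspace],
    ]


if __name__ == "__main__":
    # Hypothetical file listings from two source nodes.
    listing = {
        "node-a": ["ks1-users-ic-5-Data.db", "ks1-users-ic-5-Index.db"],
        "node-b": ["ks1-users-ic-5-Data.db"],
    }
    for (node, old), new in sorted(plan_renames(listing).items()):
        print(node, old, "->", new)
    for cmd in post_copy_commands("10.0.0.1", "ks1", "users"):
        print(" ".join(cmd))
```

The actual copy step (rsync, scp, or similar) is left out on purpose, since it depends on your data directory layout and on how much disk headroom the target nodes have before cleanup runs.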

=Rob