On Wed, Feb 5, 2014 at 11:18 AM, Keith Wright <kwright@nanigans.com> wrote:
> Hi Rob, thanks for the response! Interestingly if we run a repair we don’t see the bootstrap issue so I am considering doing the empty node repair methodology.

Weird. Bootstrap should not be more fragile than repair.

>   • Update our JRE, we are using 1.7.0_17 and I believe we’re up to 1.7.0_54

Unlikely to be the cause, but couldn't hurt.

>   • GC tuning as it does appear that we’re suffering from GC issues. We could just allocate more eden space and then revert after the bootstrap succeeds

This is a generalized cause of streaming failures, so sure. I'm not so sure about the specific proposed solution, but yes, it's possible that tuning your GC will make bootstrap possible.

>   • As I mentioned, don’t load data via bootstrap but instead use repair. With bootstrap disabled in Vnodes, will the node still assign itself tokens?

My belief is yes, and I just re-read the code and that's what it appears to do in the auto_bootstrap:false-with-num_tokens_set case.

You can verify for yourself by reading the code here:


There are other methods of doing this which would be available to you if you were not using vnodes. Unfortunately the use of vnodes seems to preclude any copy-the-sstables method of cluster shifting short of copying all sstables to all nodes, globally uniquing their filenames first, and then running cleanup.
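
To make the "globally uniquing their filenames" step concrete, here is a rough Python sketch that only builds a rename plan and touches nothing on disk. It assumes sstable component files are named <keyspace>-<table>-<version>-<generation>-<component> (for example ks-cf-ic-5-Data.db); that layout, the function name and the node labels are my assumptions for illustration, so check them against your actual data directories before relying on any of it.

```python
import re
from itertools import count

# Assumed filename layout: <keyspace>-<table>-<version>-<generation>-<component>
SSTABLE_RE = re.compile(
    r"^(?P<table>.+)-(?P<version>[a-z]+)-(?P<gen>\d+)-(?P<component>[^-]+)$"
)


def plan_unique_names(files_by_node):
    """Map (node, old filename) to a new filename with a cluster-wide unique generation.

    files_by_node is a hypothetical dict of node label -> list of filenames.
    All components of one sstable on one node keep a shared generation.
    """
    counters = {}  # table prefix -> iterator of fresh generation numbers
    assigned = {}  # (node, table prefix, old generation) -> new generation
    plan = {}
    for node in sorted(files_by_node):
        for name in sorted(files_by_node[node]):
            m = SSTABLE_RE.match(name)
            if m is None:
                continue  # not an sstable component under the assumed layout
            table = m["table"]
            key = (node, table, m["gen"])
            if key not in assigned:
                counters.setdefault(table, count(1))
                assigned[key] = next(counters[table])
            plan[(node, name)] = (
                f"{table}-{m['version']}-{assigned[key]}-{m['component']}"
            )
    return plan
```

Two nodes that both hold a generation 5 sstable for the same table would come out with generations 1 and 2 in the plan, so their files no longer collide once copied into one directory. You would still have to run cleanup afterwards, as described above.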



There is an issue that affects versions of Cassandra 1.2.x before 1.2.14, including the version of Cassandra you are running. It WILL REMOVE NODES FROM YOUR CLUSTER AND MAKE IT HARD TO GET THEM BACK IN IF YOU USE AUTO_BOOTSTRAP:FALSE UNDER CERTAIN CIRCUMSTANCES.

If you plan to use auto_bootstrap:false to deal with your issue, I VERY STRONGLY RECOMMEND UPGRADING TO 1.2.14 BEFORE DOING SO.

(The above warning applies to anyone using auto_bootstrap:false in 1.2.x: either stop doing that or upgrade to 1.2.14 ASAP.)
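
If you want to check a list of node versions against the range above, a minimal Python sketch could look like this. The function name is made up for illustration, and it assumes plain numeric version strings such as "1.2.13" with no suffix.

```python
def affected_by_auto_bootstrap_issue(version):
    """Return True for Cassandra 1.2.x releases before 1.2.14.

    Assumes a plain "major.minor.patch" string; anything else raises ValueError.
    """
    major, minor, patch = (int(part) for part in version.split(".")[:3])
    return (major, minor) == (1, 2) and patch < 14
```

For example, "1.2.13" is reported as affected while "1.2.14" and "2.0.4" are not.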