On Wed, Oct 2, 2013 at 8:12 AM, Keith Wright <kwright@nanigans.com> wrote:
   We are running C* 1.2.8 with vnodes enabled and are having issues bootstrapping a new node.  When we add the node, we see it bootstrap and we see data start to stream over from other nodes. However, one of the other nodes gets stuck in full GCs, to the point where we had to restart one of the nodes.  I assume this is because building the Merkle tree is expensive.

Merkle trees are only involved in "repair", not in normal bootstrap. Have you considered lowering the streaming throughput throttle? Bootstrap will be slower, but the streaming nodes should be less likely to have their heap overwhelmed.
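
For illustration, a minimal sketch of what a lower throttle could look like in cassandra.yaml on the nodes that stream data out. The value 50 is only an assumed example, not a figure recommended anywhere in this thread; pick one that suits your hardware and network.

```yaml
# cassandra.yaml (fragment)
# Caps outbound streaming throughput for this node, in megabits per second.
# 50 is an illustrative assumption; lower values trade bootstrap speed
# for less pressure on the sending nodes.
stream_throughput_outbound_megabits_per_sec: 50
```

If your version of nodetool provides it, `nodetool setstreamthroughput <value>` should let you change the same limit on a running node without a restart.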
 
Is there any way to force the streaming to restart? Have others seen this?

In the bootstrap case, you can just wipe the bootstrapping node and re-start the bootstrap.

In the general case of hung streaming:

https://issues.apache.org/jira/browse/CASSANDRA-3486

The only solution to hung non-bootstrap streaming is to restart all nodes participating in the streaming. With vnodes, this will probably approach 100% of nodes...

=Rob