I don't know much about streaming with vnodes, but you might be hitting this:
https://issues.apache.org/jira/browse/CASSANDRA-4650


On Tue, Jul 2, 2013 at 12:43 PM, Mike Heffner <mike@librato.com> wrote:
As a test, I added a 7th node in the first AZ; it streams from both of the existing nodes in the same AZ.

Aggregate streaming bandwidth at the 7th node is approximately 12 MB/sec when all limits are set at 800 MB/sec, or about double what I saw streaming from a single node. This would seem to indicate that the sending node is limiting our streaming rate.

Mike


On Tue, Jul 2, 2013 at 3:00 PM, Mike Heffner <mike@librato.com> wrote:
Sankalp,

Parallel sstableloader streaming would definitely be valuable.

However, this ring is currently using vnodes and I was surprised to see that a bootstrapping node only streamed from one node in the ring. My understanding was that a bootstrapping node would stream from multiple nodes in the ring.

We started with a 3-node, 3-AZ, RF=3 ring. We then increased that to 6 nodes, adding one per AZ. The 4th, 5th and 6th nodes each streamed only from the node in their own AZ/rack, which led to the serial sstable streaming. Is this the correct behavior for the snitch? Is there an option to stream from multiple replicas across the AZ/rack configuration?

Mike


On Tue, Jul 2, 2013 at 1:53 PM, sankalp kohli <kohlisankalp@gmail.com> wrote:
This was a problem pre-vnodes. I filed several JIRA tickets for it, but some of them were voted down on the grounds that performance would improve with vnodes.
The main problem is that it streams one sstable at a time rather than in parallel.

JIRA 4784 can speed up bootstrap performance. You can also do a zero copy and avoid touching the caches of the nodes that are contributing to the build.




On Tue, Jul 2, 2013 at 7:35 AM, Mike Heffner <mike@librato.com> wrote:

On Mon, Jul 1, 2013 at 10:06 PM, Mike Heffner <mike@librato.com> wrote:

The only changes we've made to the config (aside from dirs/hosts) are:

I forgot to mention that we've changed this as well:

-partitioner: org.apache.cassandra.dht.Murmur3Partitioner
+partitioner: org.apache.cassandra.dht.RandomPartitioner
 

Cheers,

Mike
--

  Mike Heffner <mike@librato.com>
  Librato, Inc.




