From: Mike Heffner
To: user@cassandra.apache.org
Date: Tue, 2 Jul 2013 15:43:10 -0400
Subject: Re: Streaming performance with 1.2.6

As a test, I added a 7th node in the first AZ, and it streams from both of
the existing nodes in the same AZ.

Aggregate streaming bandwidth at the 7th node is approximately 12 MB/sec
when all limits are set at 800 MB/sec, or about double what I saw streaming
from a single node. This would seem to indicate that the sending node is
limiting our streaming rate.

Mike

On Tue, Jul 2, 2013 at 3:00 PM, Mike Heffner wrote:

> Sankalp,
>
> Parallel sstableloader streaming would definitely be valuable.
>
> However, this ring is currently using vnodes and I was surprised to see
> that a bootstrapping node only streamed from one node in the ring. My
> understanding was that a bootstrapping node would stream from multiple
> nodes in the ring.
>
> We started with a 3 node/3 AZ, RF=3 ring. We then increased that to 6
> nodes, adding one per AZ. The 4th, 5th and 6th nodes only streamed from
> the node in their own AZ/rack, which led to the serial sstable streaming.
> Is this the correct behavior for the snitch? Is there an option to stream
> from multiple replicas across the AZ/rack configuration?
>
> Mike
>
> On Tue, Jul 2, 2013 at 1:53 PM, sankalp kohli wrote:
>
>> This was a problem pre vnodes. I had several JIRA tickets for that, but
>> some of them were voted down on the grounds that performance would
>> improve with vnodes.
>> The main problem is that it streams one sstable at a time and not in
>> parallel.
>>
>> JIRA 4784 can speed up the bootstrap performance. You can also do a zero
>> copy and not touch the caches of the nodes which are contributing to the
>> build.
>>
>> https://issues.apache.org/jira/browse/CASSANDRA-4663
>> https://issues.apache.org/jira/browse/CASSANDRA-4784
>>
>> On Tue, Jul 2, 2013 at 7:35 AM, Mike Heffner wrote:
>>
>>> On Mon, Jul 1, 2013 at 10:06 PM, Mike Heffner wrote:
>>>
>>>> The only changes we've made to the config (aside from dirs/hosts) are:
>>>
>>> Forgot to include we've changed this as well:
>>>
>>> -partitioner: org.apache.cassandra.dht.Murmur3Partitioner
>>> +partitioner: org.apache.cassandra.dht.RandomPartitioner
>>>
>>> Cheers,
>>>
>>> Mike

--
  Mike Heffner
  Librato, Inc.
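The bandwidth figures reported in the top message can be sanity-checked with a little arithmetic. The following is a minimal Python sketch, not anything from Cassandra itself: the function names are hypothetical, and it assumes the aggregate rate splits evenly across the two sending nodes, which the thread does not state. The inputs (12 MB/sec aggregate, 2 senders, 800 MB/sec limit) are the numbers quoted in the message.

```python
def per_sender_rate(aggregate_mb_s: float, senders: int) -> float:
    """Average rate per sending node, assuming an even split (an assumption)."""
    if senders <= 0:
        raise ValueError("senders must be positive")
    return aggregate_mb_s / senders


def limit_utilization(rate_mb_s: float, limit_mb_s: float) -> float:
    """Fraction of the configured throughput limit that is actually used."""
    if limit_mb_s <= 0:
        raise ValueError("limit must be positive")
    return rate_mb_s / limit_mb_s


# Figures quoted in the message above.
aggregate = 12.0   # MB/sec observed at the 7th (receiving) node
senders = 2        # existing nodes in the same AZ
limit = 800.0      # MB/sec, the configured streaming limit

rate = per_sender_rate(aggregate, senders)
print(f"per-sender rate: {rate:.1f} MB/sec")
print(f"fraction of limit used: {limit_utilization(rate, limit):.2%}")
```

Under these assumptions each sender contributes roughly 6 MB/sec, well under 1% of the configured limit. Because the aggregate rate grows with the number of senders, that is consistent with the message's reading that the sending side, not the receiver or the throttle, is the bottleneck.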