Date: Mon, 14 Sep 2015 18:59:02 -0700
Subject: Source code for bootstrap process
From: Aadil Ahamed
To: dev@cassandra.apache.org

Hi,

I am interested in understanding the quantitative effect of adding a node to a Cassandra cluster. From my tests I have observed that CPU usage on an added node is relatively high while data is being streamed to it during bootstrap.

For example, here is a graph for a node that was added to a 4-node cluster (I can insert images, right?):

[inline image: CPU usage graph]

As you can see, CPU usage stays pretty high while data is streamed into the node.

I attached a JVM profiler, https://github.com/aragozin/jvm-tools (recommended by one of the members of the Cassandra user mailing list), to try to figure out what exactly was taking up so much CPU.

Here is a snapshot of the profiler when CPU usage was high:

[inline image: profiler snapshot]

The threads "STREAM-IN-/IP_ADDRESS" seem to be the most CPU intensive.
Why are these threads so CPU intensive?
Can you point me to the source code where these threads are implemented?

(My Cassandra version is 2.1.7.)

--
Thank you,
Aadil Ahamed