From: Alain RODRIGUEZ
Date: Wed, 18 Jun 2014 11:39:12 +0200
Subject: restarting node makes cpu load of the entire cluster to raise
To: user@cassandra.apache.org

Hi guys

Using Cassandra 1.2.11, when I try to do a rolling restart of the cluster, every node I restart makes the CPU load of the whole cluster increase, reaching a "red" state in OpsCenter (load goes from 3-4 to 20+). This happens once the node is back online.

The restarted node uses 100% CPU for 5-10 minutes and sometimes drops mutations.

I have tried throttling hinted handoff to 256 (instead of 1024), but it doesn't seem to help much.

Disks are not the bottleneck. ParNew GC activity increases a bit, but nothing problematic, I think.

Basically, what could be happening on node restart? What is taking that much CPU on every machine? There is no steal or iowait.

What can I try to tune?
