From: Tomer B
Date: Thu, 12 May 2011 10:49:22 +0300
Subject: Knowing when there is a *real* need to add nodes
To: user@cassandra.apache.org

Hi,

I'm trying to predict when my cluster will need new nodes. I want a continuous graph of my cluster's health (actual numbers and measurements), so that when I see the cluster getting busier and busier I know beforehand that I need to start purchasing more machines and adding them to the cluster.

Below is what I came up with after doing some research on the net. I would highly appreciate any additional gauge measurements and ranges for testing my cluster's health and knowing beforehand when I'm going to need more nodes. Although I'm writing these down as green, yellow, and red gauges, I'm also trying to find a continuous graph that shows where our cluster stands (as much as possible...).

Also, my recommendation is to always do the following before adding new nodes:

1. Make sure all nodes are balanced, and if not, balance them.
2. Separate the commit log drive from the data (SSTables) drive.
3. Use mmap for the index only (mmap_index_only), not auto.
4. Increase disk IO if possible.
5. Avoid swapping as much as possible.

As for my gauge tests for when to add new nodes:

test: nodetool tpstats -h
green gauge: No pending column with number higher
yellow gauge: pending columns 100-2000
red gauge: larger than 3000

test: iostat -x -n -p -z 5 10 and iostat -xcn 5
green gauge: kw/s + kr/s is below 25% of the disk's IO capacity
yellow gauge: 20%-50%
red gauge: 50%+

test: iostat -x -n -p -z 5 10 and check the %b column
green gauge: less than 10%
yellow gauge: 10%-80%
red gauge: 90%+

test: nodetool cfstats --host localhost
green gauge: "SSTable count" item does not continually grow over time
yellow gauge:
red gauge: "SSTable count" item continually grows over time

test: ./nodetool cfstats --host localhost | grep -i pending
green gauge: 0-2
yellow gauge: 3-100
red gauge: 101+

I would highly appreciate any additional gauge measurements and ranges in order to test my cluster health and to know ***beforehand*** when I'm going to need more nodes.
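
To feed the numeric gauges above into a graph or an alert, they could be encoded roughly as follows. This is only a sketch under stated assumptions: the names `Gauge`, `GAUGES`, and `classify` are made up for illustration, and the measured values are assumed to have been collected already (nothing here runs nodetool or iostat). Where the ranges above overlap or leave a gap (20% vs. 25% for disk throughput, 2000-3000 for tpstats, 80%-90% for %b), the sketch arbitrarily treats the uncovered values as yellow. The SSTable count gauge is left out because it is about a trend over time rather than a single threshold.

```python
from typing import NamedTuple


class Gauge(NamedTuple):
    """Hypothetical container for one threshold-based gauge."""
    description: str
    yellow_from: float  # values at or above this are at least yellow
    red_from: float     # values at or above this are red


# Thresholds copied from the gauge list above. Gaps and overlaps in the
# original ranges are resolved by assumption (see the comments).
GAUGES = {
    # red is "larger than 3000"; pending counts are integers, so 3001.
    # Assumption: the unlisted 2000-3000 range counts as yellow.
    "tpstats_pending": Gauge("nodetool tpstats pending", 100, 3001),
    # green is "below 25%" but yellow is "20%-50%"; assumption: yellow at 25.
    "disk_throughput_pct": Gauge("iostat kr/s + kw/s, % of capacity", 25, 50),
    # Assumption: the unlisted 80%-90% range counts as yellow.
    "disk_busy_pct": Gauge("iostat %b", 10, 90),
    "cfstats_pending": Gauge("nodetool cfstats pending", 3, 101),
}


def classify(gauge_key: str, value: float) -> str:
    """Return 'green', 'yellow', or 'red' for one measured value."""
    gauge = GAUGES[gauge_key]
    if value >= gauge.red_from:
        return "red"
    if value >= gauge.yellow_from:
        return "yellow"
    return "green"


if __name__ == "__main__":
    # Made-up sample readings, purely for illustration.
    samples = {
        "tpstats_pending": 150,
        "disk_throughput_pct": 12.5,
        "disk_busy_pct": 95,
        "cfstats_pending": 2,
    }
    for key, value in samples.items():
        print(f"{GAUGES[key].description}: {value} -> {classify(key, value)}")
```

Recording the raw values over time, rather than only the colour, would give the continuous graph; the colours are then just bands drawn on top of it.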