From: Jonathan Ellis
Date: Tue, 25 May 2010 00:01:50 -0500
Subject: Re: Ideal configuration for given hardware
To: user@cassandra.apache.org

Yes, I would do RAID 1 on 2 commitlog disks and RAID 10 on the 6 remaining
for OS + data.

On Mon, May 24, 2010 at 2:27 PM, Aaron McCurry wrote:
> Thanks a lot! So for RAID 10, is the thought that the node can survive a
> single disk failure and keep going until a normal maintenance cycle? Also,
> are you saying that you would configure a single RAID 10 for the whole box,
> OS included? I have 8 x 500 Gig drives, so that would leave me with 2T per
> box, which I think is fine. But I do have one question: in this
> configuration, would commit log writing and data directory compaction
> interfere with one another? Just based on what I read, it seems as though
> you want at least two disks/partitions, one for commit log and one for
> data. Thanks again for the feedback!
> Aaron
>
>
> On Mon, May 24, 2010 at 3:12 PM, Ian Soboroff wrote:
>>
>> My data disks on two of my nodes are RAID-5, just because of
>> circumstances. My other nodes are JBOD. I don't notice any real
>> difference, but I haven't strongly benched it.
>> Ian
>>
>> On Mon, May 24, 2010 at 2:45 PM, Jonathan Ellis wrote:
>>>
>>> I can think of at least 2 clusters running 32GB boxes with single
>>> Cassandra processes on each. (16 seems to be more common.) At 64 I
>>> would seriously consider multiple processes per machine. You'd want
>>> to configure a Snitch such that same-machine boxes were considered the
>>> same rack; there is no separate closeness level of same machine.
>>>
>>> At 32 I think you're fine with one process. Watch for latency spikes
>>> and see how it goes.
>>>
>>> I would run RAID 10 on the data disks if you can afford giving up the
>>> space, otherwise RAID 0. I don't know that anyone's tested RAID 5.
>>>
>>> On Sun, May 23, 2010 at 3:30 PM, Aaron McCurry wrote:
>>> > I am planning on setting up a Cassandra cluster on a small 16 node
>>> > cluster (possibly 32 way). Each machine has 8 cores, 32 Gig of RAM,
>>> > and 8 hds. My first thought is to set up one of those hds for the
>>> > commit log, 6 for data, and leave one for the OS. However, I do have
>>> > a concern about best utilizing my memory: should I run a larger heap?
>>> > Should I run several Cassandra processes on the same box?
>>> > My concern about the larger heap is because GCs typically get slower.
>>> > And if I run several procs, does Cassandra realize that it's the same
>>> > box for replication purposes?
>>> > I do have other hd conf options: hardware RAID 0, 1, or 5.
>>> > Just looking for some general configuration options as well as some
>>> > real world successes with similarly sized hardware. Thanks!
>>> > Aaron
>>>
>>>
>>>
>>> --
>>> Jonathan Ellis
>>> Project Chair, Apache Cassandra
>>> co-founder of Riptano, the source for professional Cassandra support
>>> http://riptano.com
>>
>
>

--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com
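
The capacity figures in this thread (2T per box for a single RAID 10 over
8 x 500 Gig drives) follow from simple arithmetic, and the same arithmetic
applies to the RAID 1 + RAID 10 split suggested in the reply. A minimal
Python sketch, assuming ideal RAID overhead and ignoring filesystem and
controller overhead. The function name raid_usable_gb and the layout labels
are hypothetical, chosen for illustration only, and are not from the thread:

```python
# Usable capacity for the disk layouts discussed in this thread.
# Assumption: 8 identical 500 GB drives, ideal RAID overhead only.

DISK_GB = 500


def raid_usable_gb(level: str, disks: int, disk_gb: int = DISK_GB) -> int:
    """Return usable capacity in GB for one RAID array (hypothetical helper)."""
    if level == "raid0":
        # Striping only: every disk contributes its full capacity.
        return disks * disk_gb
    if level == "raid1":
        if disks < 2:
            raise ValueError("RAID 1 needs at least 2 disks")
        # All disks mirror each other: capacity of a single disk.
        return disk_gb
    if level == "raid5":
        if disks < 3:
            raise ValueError("RAID 5 needs at least 3 disks")
        # One disk's worth of capacity is spent on parity.
        return (disks - 1) * disk_gb
    if level == "raid10":
        if disks < 4 or disks % 2 != 0:
            raise ValueError("RAID 10 needs an even number of disks, >= 4")
        # Striped mirrors: half of the raw capacity is usable.
        return (disks // 2) * disk_gb
    raise ValueError(f"unknown RAID level: {level}")


# The two layouts raised in the thread, as lists of (level, disk count).
layouts = {
    "single RAID 10 over all 8 disks": [("raid10", 8)],
    "RAID 1 commitlog (2) + RAID 10 OS/data (6)": [("raid1", 2), ("raid10", 6)],
}

for name, arrays in layouts.items():
    sizes = [raid_usable_gb(level, n) for level, n in arrays]
    print(name, sizes, "GB")
```

Under these assumptions the split layout gives up 500 GB of data capacity
relative to one big RAID 10, in exchange for keeping commit log writes on
their own spindles.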