From: Edward Capriolo <edlinuxguru@gmail.com>
To: user@cassandra.apache.org, bill@dehora.net
Reply-To: user@cassandra.apache.org
Date: Sat, 11 Dec 2010 00:13:19 -0500
Subject: Re: Running multiple instances on a single server --micrandra ??

On Fri, Dec 10, 2010 at 11:39 PM, Edward Capriolo wrote:
> On Thu, Dec 9, 2010 at 10:40 PM, Bill de hÓra wrote:
>>
>> On Tue, 2010-12-07 at 21:25 -0500, Edward Capriolo wrote:
>>
>>> The idea behind "micrandra" is, for a 6-disk system, to run 6
>>> instances of Cassandra, one per disk, and to use the RackAwareSnitch
>>> to make sure no replicas live on the same node.
>>>
>>> The downsides:
>>> 1) We would have to manage 6x the instances of Cassandra.
>>> 2) We would have some overhead for each JVM.
>>>
>>> The upsides:
>>> 1) A disk/instance failure only degrades overall performance by
>>> 1/6th (with RAID0 you lose the entire node, and RAID5 still takes a
>>> performance hit when down a disk).
>>> 2) Moves and joins have less work to do.
>>> 3) You can scale up a single node by adding a single disk to an
>>> existing system (assuming RAM and CPU load are light).
>>> 4) OPP would be "easier" to balance out hot spots (maybe not on this
>>> one; I am not an OPP user).
>>>
>>> What does everyone think? Does it ever make sense to run this way?
>>
>> It might for read-heavy loads.
>>
>> When I looked at this, it was pointed out to me that it is simpler to
>> run fewer, bigger, coarser nodes and to take the entire node/server
>> out when something goes wrong. Basically, give each Cassandra a
>> server.
>>
>> I wonder if it would be better to rethink compaction if that is what
>> is driving the idea. It seems to be what is biting everyone, along
>> with GC.
>>
>> Bill
>
> Having 6 IPs on a machine would be a given in this setup. That is not
> an issue for me.
>
> It is not "biting" me. We all know that going from 10 to 20 nodes is
> pretty simple. However, organic growth from 10 to 16, then a couple of
> months later from 16 to 22, can take some effort with 300-600 GB per
> node, since each join and cleanup can take a while. I am wondering if
> dividing a single large node into multiple smaller instances would
> make this type of growth easier.

To clearly explain the scenario: take a 5-node cluster where each node
owns 20% of the ring. Each node has 6 disks and ~200 GB of data.

Going to 10 nodes is easy: you join one new node directly between each
pair of existing nodes.

However, going from, say, 5 -> 8 gets dicey. Do you calculate the ideal
ring positions for 10 nodes and slot the three new nodes in at three of
them, giving:

20% | 20% | 10% | 10% | 10% | 10% | 10% | 10%

This results in three joins and several cleanups. With this choice you
save time, but you have to hope you never reach the point where the
first two nodes become overloaded. If you instead decide to move to the
ideal tokens for 8 nodes, you have many moves and joins. (See the quick
token math at the end of this mail.)

Until we have

https://issues.apache.org/jira/browse/CASSANDRA-1418
https://issues.apache.org/jira/browse/CASSANDRA-1427

having 6 smaller instances on a node with 6 disks would make it easier
to stay close to balanced without having to double your cluster size
each time you grow, or to do a series of moves to get balanced again. A
rough sketch of the per-box layout follows.
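To make the one-instance-per-disk idea concrete, here is a quick Python
sketch of how I picture carving up one box. The addresses, mount
points, dictionary keys, and the token interleaving scheme are all made
up for illustration; in practice each instance would carry these values
in its own config:

    # Each of the 6 instances gets its own IP, its own disk for data
    # and commitlog, and its own initial_token. Tokens are interleaved
    # across boxes so that ring neighbors land on different physical
    # machines; a rack-aware snitch that treats each box as a "rack"
    # then keeps replicas off the same hardware.
    RING = 2 ** 127
    NUM_HOSTS = 5   # physical boxes (hypothetical cluster)
    DISKS = 6       # instances per box

    def instance_config(host, disk):
        ring_slot = disk * NUM_HOSTS + host  # interleave across boxes
        return {
            "listen_address": "10.0.%d.%d" % (host, disk + 1),
            "data_directory": "/mnt/disk%d/cassandra/data" % disk,
            "commitlog_directory": "/mnt/disk%d/cassandra/commitlog" % disk,
            "initial_token": ring_slot * RING // (NUM_HOSTS * DISKS),
        }

    # The 6 instances on the first box. Growing "one disk at a time"
    # later is a single join instead of doubling the cluster.
    for disk in range(DISKS):
        print(instance_config(host=0, disk=disk))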
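And to put rough numbers on the 5 -> 8 token math above, a
back-of-envelope script (again just a sketch; it simply treats the
RandomPartitioner ring as the interval 0..2**127):

    RING = 2 ** 127  # token space, for illustration

    def ideal_tokens(n):
        """Evenly spaced initial_token values for an n-node cluster."""
        return [i * RING // n for i in range(n)]

    def ownership(tokens):
        """Fraction of the ring each node owns (wrapping at the end)."""
        return [((t - tokens[i - 1]) % RING) / RING
                for i, t in enumerate(tokens)]

    # 5 nodes at ideal positions, plus 3 new nodes wedged in at ideal
    # 10-node positions -- the "three joins" option described above.
    five = ideal_tokens(5)
    ten = ideal_tokens(10)
    lazy_eight = sorted(five + [ten[1], ten[3], ten[5]])
    print([round(p, 2) for p in ownership(lazy_eight)])
    # -> [0.2, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.2]
    # i.e. the 20% | 20% | 10% x 6 split above: two nodes stay hot
    # until you either double the cluster or do a series of moves.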