Subject: Re: Running multiple instances on a single server --micrandra ??
From: Bill de hÓra <bill@dehora.net>
Reply-To: bill@dehora.net
To: user@cassandra.apache.org
Date: Fri, 10 Dec 2010 03:40:38 +0000
Message-ID: <1291952438.25072.127.camel@dehora-laptop>

On Tue, 2010-12-07 at 21:25 -0500, Edward Capriolo wrote:
> The idea behind "micrandra" is for a 6 disk system to run 6 instances
> of Cassandra, one per disk. Use the RackAwareSnitch to make sure no
> replicas live on the same node.
>
> The downsides:
> 1) we would have to manage 6x the instances of Cassandra
> 2) we would have some overhead for each JVM.
>
> The upsides?
> 1) A disk/instance failure only degrades the overall performance by
> 1/6th (with RAID0 you lose the entire node; RAID5 still takes a hit
> when down a disk)
> 2) Moves and joins have less work to do
> 3) Can scale up a single node by adding a single disk to an existing
> system (assuming the RAM and CPU load is light)
> 4) OPP would be "easier" to balance out hot spots (maybe not on this
> one, I'm not an OPP user)
>
> What does everyone think? Does it ever make sense to run this way?
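For concreteness, here is a minimal sketch of the layout you describe: one config per disk, each instance bound to its own loopback alias and tagged as its own "rack" so rack-aware replica placement won't put two copies of a row on the same box. The key names follow 0.7-era cassandra.yaml; the paths, IPs, cluster name, and topology file contents are my assumptions for illustration, not a tested config.

# Sketch: stamp out one Cassandra config per disk on a single host.
# Assumes disks mounted at /data1../data6 and one loopback alias per
# instance (127.0.0.1..127.0.0.6) so the six JVMs don't fight over
# ports. Illustrative only -- not a drop-in config.

NUM_DISKS = 6

topology = []
for i in range(1, NUM_DISKS + 1):
    ip = f"127.0.0.{i}"  # this instance's listen address
    yaml = "\n".join([
        "cluster_name: 'micrandra'",
        f"listen_address: {ip}",
        f"rpc_address: {ip}",
        "data_file_directories:",
        f"    - /data{i}/cassandra/data",
        f"commitlog_directory: /data{i}/cassandra/commitlog",
        f"saved_caches_directory: /data{i}/cassandra/saved_caches",
    ])
    with open(f"cassandra-{i}.yaml", "w") as f:
        f.write(yaml + "\n")
    # Give each instance its own rack, so a rack-aware snitch/strategy
    # treats the six co-located JVMs as six separate failure domains.
    topology.append(f"{ip}=DC1:RACK{i}")

with open("cassandra-topology.properties", "w") as f:
    f.write("\n".join(topology) + "\n")

Each JVM would also need its own JMX port on top of the distinct listen addresses, which adds to the per-instance management you list under the downsides.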

It might for read-heavy loads.
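To put rough numbers on upside (1), here's a back-of-envelope, assuming load is spread evenly across the six disks (my arithmetic, nothing more):

# Capacity lost on one host when a single disk dies.
disks = 6

micrandra_loss = 1 / disks  # one instance of six goes down
raid0_loss = 1.0            # the stripe is gone, the whole node is down

print(f"micrandra: lose {micrandra_loss:.0%} of the host")
print(f"RAID0:     lose {raid0_loss:.0%} of the host")
# RAID5 keeps the node up but degraded: reads of the dead disk's data
# are reconstructed from parity until the rebuild finishes.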

When I looked at this, it was pointed out to me that it's simpler to run fewer, bigger, coarser nodes and take the entire node/server out when something goes wrong. Basically, give each Cassandra instance its own server.

I wonder if it would be better to rethink compaction if that's what's driving the idea. It seems to be what is biting everyone, along with GC.

Bill