From: Edward Capriolo <edlinuxguru@gmail.com>
To: user@cassandra.apache.org, bill@dehora.net
Reply-To: user@cassandra.apache.org
Date: Sat, 11 Dec 2010 00:13:19 -0500
Subject: Re: Running multiple instances on a single server --micrandra ??

On Fri, Dec 10, 2010 at 11:39 PM, Edward Capriolo wrote:
> On Thu, Dec 9, 2010 at 10:40 PM, Bill de hÓra wrote:
>>
>> On Tue, 2010-12-07 at 21:25 -0500, Edward Capriolo wrote:
>>
>>> The idea behind "micrandra" is, for a 6-disk system, to run 6
>>> instances of Cassandra, one per disk, and to use the RackAwareSnitch
>>> to make sure no replicas live on the same node.
>>>
>>> The downsides:
>>> 1) We would have to manage 6x the instances of Cassandra.
>>> 2) We would have some overhead for each JVM.
>>>
>>> The upsides:
>>> 1) A disk/instance failure only degrades overall performance by
>>> 1/6th (with RAID0 you lose the entire node, and RAID5 still takes a
>>> performance hit when down a disk).
>>> 2) Moves and joins have less work to do.
>>> 3) You can scale up a single node by adding a single disk to an
>>> existing system (assuming RAM and CPU load are light).
>>> 4) OPP would be "easier" to balance out hot spots (maybe not on this
>>> one; I am not an OPP user).
>>>
>>> What does everyone think? Does it ever make sense to run this way?
>>
>> It might for read-heavy loads.
>>
>> When I looked at this, it was pointed out to me that it is simpler to
>> run fewer, bigger, coarser nodes and to take the entire node/server
>> out when something goes wrong. Basically, give each Cassandra a
>> server.
>>
>> I wonder if it would be better to rethink compaction if that is what
>> is driving the idea. It seems to be what is biting everyone, along
>> with GC.
>>
>> Bill
>
> Having 6 IPs on a machine would be a given in this setup. That is not
> an issue for me.
>
> It is not "biting" me. We all know that going from 10 to 20 nodes is
> pretty simple. However, organic growth from 10 to 16, then a couple of
> months later from 16 to 22, can take some effort with 300-600 GB per
> node, since each join and cleanup can take a while. I am wondering if
> dividing a single large node into multiple smaller instances would
> make this type of growth easier.

To clearly explain the scenario: take a 5-node cluster where each node
owns 20% of the ring. Each node has 6 disks and ~200 GB of data.

Going to 10 nodes is easy: you join one new node directly between each
pair of existing nodes.

However, going from, say, 5 -> 8 gets dicey. Do you calculate the ideal
ring positions for 10 nodes and slot the three new nodes in at three of
them, giving:

20% | 20% | 10% | 10% | 10% | 10% | 10% | 10%

This results in three joins and several cleanups. With this choice you
save time, but you have to hope you never reach the point where the
first two nodes become overloaded. If you instead decide to move to the
ideal tokens for 8 nodes, you have many moves and joins. (See the quick
token math at the end of this mail.)

Until we have

https://issues.apache.org/jira/browse/CASSANDRA-1418
https://issues.apache.org/jira/browse/CASSANDRA-1427

having 6 smaller instances on a node with 6 disks would make it easier
to stay close to balanced without having to double your cluster size
each time you grow, or to do a series of moves to get balanced again. A
rough sketch of the per-box layout follows.
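To make the one-instance-per-disk idea concrete, here is a quick Python
sketch of how I picture carving up one box. The addresses, mount
points, dictionary keys, and the token interleaving scheme are all made
up for illustration; in practice each instance would carry these values
in its own config:

    # Each of the 6 instances gets its own IP, its own disk for data
    # and commitlog, and its own initial_token. Tokens are interleaved
    # across boxes so that ring neighbors land on different physical
    # machines; a rack-aware snitch that treats each box as a "rack"
    # then keeps replicas off the same hardware.
    RING = 2 ** 127
    NUM_HOSTS = 5   # physical boxes (hypothetical cluster)
    DISKS = 6       # instances per box

    def instance_config(host, disk):
        ring_slot = disk * NUM_HOSTS + host  # interleave across boxes
        return {
            "listen_address": "10.0.%d.%d" % (host, disk + 1),
            "data_directory": "/mnt/disk%d/cassandra/data" % disk,
            "commitlog_directory": "/mnt/disk%d/cassandra/commitlog" % disk,
            "initial_token": ring_slot * RING // (NUM_HOSTS * DISKS),
        }

    # The 6 instances on the first box. Growing "one disk at a time"
    # later is a single join instead of doubling the cluster.
    for disk in range(DISKS):
        print(instance_config(host=0, disk=disk))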
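And to put rough numbers on the 5 -> 8 token math above, a
back-of-envelope script (again just a sketch; it simply treats the
RandomPartitioner ring as the interval 0..2**127):

    RING = 2 ** 127  # token space, for illustration

    def ideal_tokens(n):
        """Evenly spaced initial_token values for an n-node cluster."""
        return [i * RING // n for i in range(n)]

    def ownership(tokens):
        """Fraction of the ring each node owns (wrapping at the end)."""
        return [((t - tokens[i - 1]) % RING) / RING
                for i, t in enumerate(tokens)]

    # 5 nodes at ideal positions, plus 3 new nodes wedged in at ideal
    # 10-node positions -- the "three joins" option described above.
    five = ideal_tokens(5)
    ten = ideal_tokens(10)
    lazy_eight = sorted(five + [ten[1], ten[3], ten[5]])
    print([round(p, 2) for p in ownership(lazy_eight)])
    # -> [0.2, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.2]
    # i.e. the 20% | 20% | 10% x 6 split above: two nodes stay hot
    # until you either double the cluster or do a series of moves.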