Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@cassandra.apache.org
Received-SPF: pass (athena.apache.org: local policy)
From: aaron morton <aaron@thelastpickle.com>
Mime-Version: 1.0 (Apple Message framework v1257)
Content-Type: multipart/alternative;
 boundary="Apple-Mail=_6C3746A0-22D5-45EB-9144-5B5C45BB2D3B"
Subject: Re: One or Two clusters?
Date: Tue, 27 Mar 2012 06:47:39 +1300
In-Reply-To: <58E6BABE-FCB3-4FD7-BBF8-2A5BDF5C48BA@cloudorange.com>
To: user@cassandra.apache.org
References: <58E6BABE-FCB3-4FD7-BBF8-2A5BDF5C48BA@cloudorange.com>
Message-Id: <6917B74E-564C-4964-86A4-ED20AD18FEC8@thelastpickle.com>


--Apple-Mail=_6C3746A0-22D5-45EB-9144-5B5C45BB2D3B
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain;
	charset=us-ascii

Use one cluster. Use lots-o-machines.

The read and write paths do not directly interfere with each other like
they do in an RDBMS. Compaction created by writes can suck up disk IO,
but this is throttled, so in practice it is not such a big problem.
Excessive GC created by reads or compaction may slow down the server,
but you will want to avoid that anyway.

The one caveat is: it depends on how you are transforming the data. If
you are using Hadoop, consider creating a single cluster with multiple
DCs (like DataStax do): one for OLTP and one for OLAP. Do the Hadoop
work in the OLAP DC and have the online app read and write to the OLTP
one.
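
As a rough sketch of the multi-DC layout (not a tested setup): the
snippet below only builds the CQL for a keyspace replicated into both
DCs with NetworkTopologyStrategy, using current CQL syntax. The keyspace
name "events", the DC names "OLTP" and "OLAP", and the replica counts
are made up for illustration.

```python
# Sketch only: keyspace name, DC names and replica counts are assumptions.

def keyspace_cql(keyspace, replicas_per_dc):
    """Build a CREATE KEYSPACE statement with per-DC replica counts."""
    # One "'<dc name>': <replica count>" entry per data center.
    dc_options = ", ".join(
        "'{}': {}".format(dc, count)
        for dc, count in sorted(replicas_per_dc.items())
    )
    template = (
        "CREATE KEYSPACE {} WITH replication = "
        "{{'class': 'NetworkTopologyStrategy', {}}};"
    )
    return template.format(keyspace, dc_options)


# Three replicas for the online app, two for the Hadoop jobs.
statement = keyspace_cql("events", {"OLTP": 3, "OLAP": 2})
print(statement)
```

The DC names have to match whatever your snitch reports. The online app
would then talk to nodes in the OLTP DC and could use a DC-local
consistency level such as LOCAL_QUORUM, so requests do not wait on the
OLAP nodes.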

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 27/03/2012, at 3:22 AM, Oleg Proudnikov wrote:

> Hi,
>
> Could someone please help me understand the benefits of having a
> single large cluster vs. having two smaller clusters separated by the
> pattern of use? One MOSTLY WRITE cluster could incrementally
> accumulate large amounts of data throughout the day. The daily
> increment would be processed, summarized and stored into the second
> READ cluster at night. Users would only need to interact with the READ
> portion of the overall system, mostly during the day. Writes would be
> spread throughout the day and will be a function of user activity with
> some bulk load activity from time to time. The WRITE portion of the
> database would be an order of magnitude larger than the READ portion.
> The READ portion would have an order of magnitude higher traffic
> except during periodic bulk loads.
>
> On one hand, if I were to have a single cluster I would have more
> resources for the users and potentially better scalability. A single
> cluster may need fewer servers overall, provided write activity does
> not affect reads... On the other hand, write activity and associated
> memory consumption, GC, as well as maintenance routines may affect the
> READ system. The system will be hosted on EC2.
>
> I would appreciate any thoughts.
>
> Regards,
> Oleg


--Apple-Mail=_6C3746A0-22D5-45EB-9144-5B5C45BB2D3B--