Date: Tue, 17 Mar 2015 03:06:55 +0000 (UTC)
From: Anuj Wadehra
To: user@cassandra.apache.org
Subject: Run Mixed Workload using two instances on one node


Hi,

We are trying to decouple our Reporting DB from OLTP, and we need urgent help on the feasibility of the proposed solution for PRODUCTION.

Use Case: Currently, our OLTP and Reporting applications and DB are the same. Some CFs are used for both OLTP and Reporting, while others are used solely for Reporting. Every business transaction synchronously updates the main OLTP CF and asynchronously updates the other Reporting CFs.

Problem Statement:
1. Decouple Reporting and OLTP so that Reporting load cannot impact OLTP performance.
2. Scaling of the Reporting and OLTP modules must be independent.
3. The OLTP client should not update all Reporting CFs. We generate data records on the file system/shared disk; Reporting should use these records to build the Reporting DB.
4. Small customers may run OLTP and Reporting on the same 3-node cluster. Bigger customers can be given the option of dedicated OLTP and Reporting nodes. So the standard hardware box should be usable for three deployments (OLTP, Reporting, or OLTP+Reporting).

Note: Reporting is ad hoc, may involve full table scans, and does not involve analytics. Data size is huge: 2 TB (OLTP+Reporting) per node.

Hardware: The standard deployment is a 3-node cluster, with each node having 24 cores, 64 GB RAM, and 6 x 400 GB SSDs in RAID 5.

Proposed Solution:
1. Split the OLTP and Reporting clients into two application components.
2. For small deployments where more than 3 nodes are not required:
    A. Install 2 Cassandra instances on each node, one for OLTP and the other for Reporting.
    B. To distribute I/O load 2:1, remove RAID 5 (as Cassandra offers replication) and assign 4 disks as JBOD for OLTP and 2 disks for Reporting.
    C. RAM is abundant and often under-utilized, so assign 8 GB to each of the 2 Cassandra instances.
    D. To make sure that Reporting cannot overload the CPU, tune concurrent_reads and concurrent_writes (a rough config sketch follows this list).
The OLTP client will only write to the OLTP DB and generate the DB records. The Reporting client will poll the FS and populate the Reporting DB in the required format.
3. Larger customers can have the Reporting client and DB on dedicated physical nodes with all resources.
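
To make 2B-2D concrete, below is roughly the kind of per-instance cassandra.yaml split we have in mind. All paths, port numbers, and values are illustrative assumptions only, not a tested configuration; each instance would also get its own cassandra-env.sh so that heap (e.g. MAX_HEAP_SIZE="8G") and the JMX port can be set independently.

    # OLTP instance -- e.g. /etc/cassandra-oltp/cassandra.yaml (paths illustrative)
    cluster_name: 'OLTP'
    data_file_directories:          # 4 JBOD SSDs dedicated to OLTP
        - /data/oltp/disk1
        - /data/oltp/disk2
        - /data/oltp/disk3
        - /data/oltp/disk4
    commitlog_directory: /data/oltp/commitlog
    storage_port: 7000
    native_transport_port: 9042
    concurrent_reads: 32            # near defaults; OLTP keeps normal throughput
    concurrent_writes: 32

    # Reporting instance -- e.g. /etc/cassandra-reporting/cassandra.yaml
    cluster_name: 'Reporting'
    data_file_directories:          # 2 JBOD SSDs dedicated to Reporting
        - /data/reporting/disk1
        - /data/reporting/disk2
    commitlog_directory: /data/reporting/commitlog
    storage_port: 7100              # distinct ports so both instances can share one IP
    native_transport_port: 9043
    concurrent_reads: 8             # deliberately throttled so Reporting cannot
    concurrent_writes: 8            # saturate CPU/disk on the shared box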

Key Questions:
Is it OK to run 2 Cassandra instances on one node in a production system and limit CPU usage, disk I/O, and RAM as suggested above?
Is there any other solution for the above-mentioned problem statement?
Thanks
Anuj

