cassandra-user mailing list archives

From Anuj Wadehra <anujw_2...@yahoo.co.in>
Subject Re: Run Mixed Workload using two instances on one node
Date Tue, 17 Mar 2015 05:36:06 GMT
I understand that running 2 instances on one node looks like a weird solution. But we can have dedicated reporting
nodes only for big customers, not for small customers.

My questions would be:

1. What is the technical reasoning? What problems do you foresee if we
use 2 C* instances on one node in production? We have ample HW on each server and it is mostly
under-utilized. We just want heavy reporting not to impact OLTP, and both OLTP and reporting
should be individually scalable.

2. I think we don't need Elasticsearch. We just need a plain Reporting DB which can reply
to reporting queries. We can create our own CFs as indexes. We don't need the overhead of another
3PP for our current reporting needs.

Thanks
Anuj





 


     On Tuesday, 17 March 2015 9:59 AM, Ali Akhtar <ali.rac200@gmail.com> wrote:
   

 I don't think it's recommended to have two instances on the same node. Have you considered
using something like Elasticsearch for the reports? It's designed for that sort of thing.

On Mar 17, 2015 8:07 AM, "Anuj Wadehra" <anujw_2003@yahoo.co.in> wrote:



 Hi,

We are trying to decouple our Reporting DB from OLTP. We need urgent help on the feasibility
of the proposed solution for PRODUCTION.

Use Case: Currently, our OLTP and Reporting applications and DB are the same. Some CFs are used
for both OLTP and Reporting while others are used solely for Reporting. Every business transaction
synchronously updates the main OLTP CF and asynchronously updates the other Reporting CFs.

Problem Statement:
1. Decouple Reporting and OLTP such that Reporting load can't impact OLTP performance.
2. Scaling of the Reporting and OLTP modules must be independent.
3. The OLTP client should not update all Reporting CFs. We generate Data Records on the file system/shared
disk. Reporting should use these Records to create the Reporting DB.
4. Small customers may do OLTP and Reporting on the same 3-node cluster. Bigger customers can
be given an option to have dedicated OLTP and Reporting nodes. So, a standard hardware box should
be usable for 3 deployments (OLTP, Reporting, or OLTP+Reporting).

Note: Reporting is ad hoc, may involve full table scans, and does not involve Analytics. Data
size is huge: 2TB (OLTP+Reporting) per node.

Hardware: Standard deployment is a 3-node cluster with each node having 24 cores, 64GB RAM, and
6 x 400GB SSDs in RAID5.

Proposed Solution:
1. Split OLTP and Reporting clients into two application components.
2. For small deployments where more than 3 nodes are not required:
    A. Install 2 Cassandra instances on each node, one for OLTP and the other for Reporting.
    B. To distribute I/O load 2:1, remove RAID5 (as Cassandra offers replication) and
assign 4 disks as JBOD for OLTP and 2 disks for Reporting.
    C. RAM is abundant and often under-utilized, so assign 8GB to each of the 2 Cassandra instances.
    D. To make sure that Reporting is not able to overload the CPU, tune concurrent_reads and concurrent_writes.

 The OLTP client will only write to the OLTP DB and generate a DB record. The Reporting client will poll
the FS and populate the Reporting DB in the required format.
3. Larger customers can have Reporting clients and DB on dedicated physical nodes with all
resources.
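
As a rough illustration of the "Reporting client will poll FS" step, here is a minimal Python sketch of one polling pass. The ".rec" file suffix, the one-record-per-line format, and the names poll_once and load_record are assumptions made for this example only; load_record stands in for whatever code writes a record into the Reporting DB.

```python
import os
from typing import Callable, Set


def poll_once(record_dir: str,
              load_record: Callable[[str], None],
              seen: Set[str]) -> int:
    """Load every not-yet-seen record file in record_dir.

    Returns the number of files loaded in this pass.
    Assumptions (not from the original post): record files end in
    ".rec" and hold one record per line.
    """
    loaded = 0
    # Sorted order keeps processing deterministic across passes.
    for name in sorted(os.listdir(record_dir)):
        if name in seen or not name.endswith(".rec"):
            continue
        path = os.path.join(record_dir, name)
        with open(path, encoding="utf-8") as fh:
            for line in fh:
                line = line.strip()
                if line:
                    # Placeholder for the write into the Reporting DB.
                    load_record(line)
        # Remember the file so the next pass skips it.
        seen.add(name)
        loaded += 1
    return loaded
```

A real client would call this in a loop with a sleep between passes, and would need some way to avoid reading files that the OLTP side is still writing (for example, having the writer rename finished files into place); that part is left out here.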

Key Questions:
Is it OK to run 2 Cassandra instances on one node in a Production system and limit CPU usage, disk
I/O and RAM as suggested above?
Is there any other solution for the above-mentioned problem statement?
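
To make the proposed limits concrete, the following Python sketch works through the per-node arithmetic using only the figures quoted above. It treats "2TB" as 2000 GB, which is an assumption, and the function name plan_node is made up for this example.

```python
def plan_node(disk_gb: int = 400, disks: int = 6, ram_gb: int = 64,
              heap_gb_per_instance: int = 8, data_gb: int = 2000) -> dict:
    """Per-node capacity figures for the proposed 2-instance layout.

    data_gb=2000 assumes "2TB per node" means 2000 GB.
    """
    # RAID5 spends one disk's worth of capacity on parity.
    raid5_usable_gb = (disks - 1) * disk_gb
    # JBOD exposes every disk in full.
    jbod_total_gb = disks * disk_gb
    # Proposed 2:1 split: 4 disks for OLTP, 2 disks for Reporting.
    oltp_gb = 4 * disk_gb
    reporting_gb = 2 * disk_gb
    # RAM not claimed by the two 8GB Cassandra instances.
    ram_outside_heaps_gb = ram_gb - 2 * heap_gb_per_instance
    return {
        "raid5_usable_gb": raid5_usable_gb,
        "jbod_total_gb": jbod_total_gb,
        "oltp_gb": oltp_gb,
        "reporting_gb": reporting_gb,
        "ram_outside_heaps_gb": ram_outside_heaps_gb,
        "jbod_free_after_data_gb": jbod_total_gb - data_gb,
    }


if __name__ == "__main__":
    for key, value in plan_node().items():
        print(f"{key}: {value}")
```

With the quoted hardware this gives 2000 GB usable under RAID5 versus 2400 GB as JBOD, 1600 GB for OLTP and 800 GB for Reporting, 48 GB of RAM outside the two instances' 8GB allocations, and 400 GB of disk left over after 2000 GB of data. How the 2TB is divided between OLTP and Reporting is not stated, so whether each share fits its disks is not something this sketch can answer.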



Thanks
Anuj


    


  