cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: One or Two clusters?
Date Mon, 26 Mar 2012 17:47:39 GMT
Use one cluster. Use lots-o-machines.

The read and write paths do not directly interfere with each other the way they do in an RDBMS.
Compaction triggered by writes can suck up disk IO, but it is throttled, so in practice it
is not such a big problem. Excessive GC caused by reads or compaction may slow down the server,
but you will want to avoid that anyway.

The one caveat is: it depends on how you are transforming the data. If you are using
Hadoop, consider creating a single cluster with multiple DCs (like DataStax does): one for
OLTP and one for OLAP. Do the Hadoop work in the OLAP DC and have the online app read and write
to the OLTP one.
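
As a rough sketch of the two-DC layout, a keyspace can be replicated into both data centers with NetworkTopologyStrategy. The Python helper below only builds the CQL text; it does not talk to a cluster. The keyspace name `app_data`, the DC names `OLTP` and `OLAP`, and the replica counts are made up for illustration, and the DC names would have to match whatever your snitch reports. The statement uses the CQL 3 map syntax from later Cassandra releases, so check it against the version you run.

```python
def keyspace_cql(name, replicas_per_dc):
    """Build a CREATE KEYSPACE statement that places replicas per data center."""
    if not replicas_per_dc:
        raise ValueError("at least one data center is required")
    # NetworkTopologyStrategy takes one replica count per data center name.
    parts = ["'class': 'NetworkTopologyStrategy'"]
    for dc, count in replicas_per_dc.items():
        parts.append(f"'{dc}': {int(count)}")
    return "CREATE KEYSPACE " + name + " WITH replication = {" + ", ".join(parts) + "};"


# Hypothetical layout: 3 replicas for the online app, 2 for the Hadoop side.
statement = keyspace_cql("app_data", {"OLTP": 3, "OLAP": 2})
print(statement)
```

The online app would then connect to nodes in the OLTP DC and use a DC-local consistency level such as LOCAL_QUORUM, so its requests do not wait on the OLAP nodes.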

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 27/03/2012, at 3:22 AM, Oleg Proudnikov wrote:

> Hi,
> 
> Could someone please help me understand the benefits of having a single large cluster
vs. having two smaller clusters separated by the pattern of use? One, MOSTLY WRITE cluster
could incrementally accumulate large amounts of data throughout the day. The daily increment
would be processed, summarized and stored into the second READ cluster at night. Users would
only need to interact with the READ portion of the overall system mostly during the day. Writes
would be spread throughout the day and will be a function of user activity with some bulk
load activity from time to time. The WRITE portion of the database would be an order of magnitude
larger than the READ portion. The READ portion would have an order of magnitude higher traffic
except during periodic bulk loads.
> 
> On one hand, if I were to have a single cluster I would have more resources for the
users and potentially better scalability. A single cluster may need fewer servers overall,
provided write activity does not affect reads... On the other hand, write activity and associated
memory consumption, GC, as well as maintenance routines may affect the READ system. The system
will be hosted on EC2.
> 
> I would appreciate any thoughts.
> 
> Regards,
> Oleg

