cassandra-user mailing list archives

From "Hiller, Dean" <>
Subject Re: Cluster per Application vs. Multi-Application Clusters
Date Wed, 22 Aug 2012 20:05:56 GMT
True, all in one cluster is very comparable to putting your application on
Amazon's cloud. When you have lots of apps, you can benefit from a batch
job at night using resources that are not used by your daytime apps.
There are always tradeoffs, of course: if both apps go off at the same
time... well, you get the picture.


On 8/22/12 1:30 PM, "Edward Capriolo" <> wrote:

>If you are starting out small, one logical/physical cluster is probably
>the best and only approach.
>Long term this is very case by case dependent but I generally believe
>Cluster per Application is the best approach. Although I consider it
>"Cluster per QOS"
>For our use cases I find that two applications have very different
>data sizes and quality of service requirements. For example, one
>application may have a small dataset size and a high repeated read/
>cache hit rate scenario. While another application may have a large
>sparse dataset and a "random read pattern". Also one application may
>demand fast < 3 ms reads while the other may find 10 or 20 ms reads
>acceptable.
>When those two applications are placed on the same set of hardware you
>end up scaling them both even though at a given time only one or the
>other needs to be scaled. In extreme cases application 1 and 2 cause
>contention and make each other unhappy.
>What is best to do is architect your systems in such a way that moving
>an individual column family to a new set of hardware is not difficult.
>This might involve something like a MapReduce program that can bulk load
>existing data between two clusters, while your front end application
>sends the writes/updates/deletes to both the old and the new cluster.
>Also make sure your application does not have too many hard coded
>touch points that assume a single cluster.
>As you mentioned one thing gained from keeping everything in the same
>keyspace is connection pooling. However, unlike the RDBMS world, where
>coordinated transactions have to happen in order, etc., that is not
>the case with C* so getting all data into the same physical "system"
>is not as important.
>On Wed, Aug 22, 2012 at 8:25 AM, Hiller, Dean <> wrote:
>> Just an opinion here, as we are having to do this ourselves, loading tons
>>of researchers' datasets into one cluster.  We are going the path of one
>>keyspace as it makes it easier if you ever want to mine the data so you
>>don't have to keep building different clients for another keyspace.  We
>>ended up adding our own security layer as well so researchers can expose
>>their datasets to other researchers and once exposed, other researchers
>>can join that data with their existing data.
>> This of course is just one use case, but if 10 applications use
>>cassandra, you still may find a benefit in having an 11th data mining
>>app look at the data from all 10 apps.
>> Later,
>> Dean
>> playOrm Developer
>> From: Ersin Er <<>>
>> Reply-To: "<>"
>> Date: Wednesday, August 22, 2012 12:44 AM
>> To: "<>"
>> Subject: Cluster per Application vs. Multi-Application Clusters
>> Hi all,
>> What are the advantages of allocating a cluster for a single
>>application vs running multiple applications on the same cassandra
>>cluster? Is either model recommended over the other?
>> Thanks.
>> --
>> Ersin Er
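
The migration approach Edward describes above (bulk load the existing data into a new cluster while the front end sends writes and deletes to both the old and the new cluster) might look roughly like the sketch below. Everything here is hypothetical: `InMemoryCluster` is a dict-backed stand-in rather than a real Cassandra client, and `ClusterRouter` and `bulk_copy` are illustrative names. The point is only that the application talks to one routing layer, so no other code assumes a single cluster.

```python
class InMemoryCluster:
    """Hypothetical stand-in for a cluster client; rows live in a dict."""

    def __init__(self, name):
        self.name = name
        self.rows = {}  # (column_family, key) -> value

    def write(self, column_family, key, value):
        self.rows[(column_family, key)] = value

    def delete(self, column_family, key):
        self.rows.pop((column_family, key), None)

    def read(self, column_family, key):
        return self.rows.get((column_family, key))


class ClusterRouter:
    """Single touch point mapping each column family to its cluster(s)."""

    def __init__(self, default):
        self.default = default
        self.primary = {}    # column_family -> cluster that serves reads
        self.migrating = {}  # column_family -> new cluster getting dual writes

    def start_migration(self, column_family, new_cluster):
        self.migrating[column_family] = new_cluster

    def finish_migration(self, column_family):
        # Reads and writes for this column family now go to the new cluster only.
        self.primary[column_family] = self.migrating.pop(column_family)

    def _targets(self, column_family):
        targets = [self.primary.get(column_family, self.default)]
        if column_family in self.migrating:
            targets.append(self.migrating[column_family])
        return targets

    def write(self, column_family, key, value):
        for cluster in self._targets(column_family):
            cluster.write(column_family, key, value)

    def delete(self, column_family, key):
        for cluster in self._targets(column_family):
            cluster.delete(column_family, key)

    def read(self, column_family, key):
        return self.primary.get(column_family, self.default).read(column_family, key)


def bulk_copy(source, target, column_family):
    """Copy existing rows, skipping keys the target already got via dual writes."""
    for (cf, key), value in list(source.rows.items()):
        if cf == column_family and (cf, key) not in target.rows:
            target.write(cf, key, value)


old = InMemoryCluster("old")
new = InMemoryCluster("new")
router = ClusterRouter(default=old)

router.write("events", "k1", "v1")     # before migration: old cluster only
router.start_migration("events", new)
router.write("events", "k2", "v2")     # during migration: both clusters
bulk_copy(old, new, "events")          # backfill rows written before migration
router.finish_migration("events")
router.write("events", "k3", "v3")     # after migration: new cluster only
```

This sketch runs sequentially in one process. A real migration would also have to deal with writes racing the bulk load, which is not handled here.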
