cassandra-user mailing list archives

From Jay Zhuang
Subject Re: Pluggable throttling of read and write queries
Date Thu, 23 Feb 2017 00:01:39 GMT
Here is the Scheduler interface:

Seems like it could be used for this case.
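
To make the idea concrete, here is a minimal sketch of the per-keyspace quota throttling asked for further down the thread, written as a token bucket. This is not the Cassandra scheduler interface and not a driver API: the names TokenBucket, KeyspaceThrottle, try_acquire and allow, and the quota format, are all hypothetical and only for illustration.

```python
import time


class TokenBucket:
    """Allows `rate` requests per second on average, with bursts up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = float(rate)
        self.capacity = float(capacity)
        self.clock = clock  # injectable so the sketch can be tested deterministically
        self.tokens = float(capacity)
        self.last = clock()

    def try_acquire(self, cost=1.0):
        now = self.clock()
        # Refill in proportion to elapsed time, capped at the burst capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


class KeyspaceThrottle:
    """Hypothetical per-keyspace quota table (assumption, not a Cassandra API)."""

    def __init__(self, quotas, clock=time.monotonic):
        # quotas maps keyspace name -> (requests per second, burst size)
        self.buckets = {
            ks: TokenBucket(rate, burst, clock) for ks, (rate, burst) in quotas.items()
        }

    def allow(self, keyspace, cost=1.0):
        bucket = self.buckets.get(keyspace)
        # Keyspaces without a configured quota are not throttled in this sketch.
        return True if bucket is None else bucket.try_acquire(cost)
```

A caller would check allow(keyspace) before executing a request and either reject or delay the request when it returns False. Expensive operations such as range queries could be charged a larger cost. The sketch is not thread-safe; a real implementation would need locking around try_acquire.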

The Scheduler interface was removed in 4.x along with Thrift; not sure why:


On 2/22/17 3:39 PM, Eric Stevens wrote:
>> We’ve actually had several customers where we’ve done the opposite -
>> split large clusters apart to separate use cases
> We do something similar but for a single application.  We're
> functionally sharding data to different clusters from a single
> application.  We can have different server classes for different types
> of workloads, we can grow and size clusters accordingly, and we also do
> things like time sharding so that we can let at-rest data go to cheaper
> storage options.
> I agree with the general sentiment here that (at least as it stands
> today) a monolithic cluster for many applications does not compete with
> per-application clusters unless cost is no issue.  At our scale, the
> terabytes of C* data we take in per day mean that even very small cost
> savings really add up.  And even where cost is no issue, the
> additional isolation and workload tailoring are still highly valuable.
> On Wed, Feb 22, 2017 at 12:01 PM Edward Capriolo wrote:
>     On Wed, Feb 22, 2017 at 1:20 PM, Abhishek Verma wrote:
>         We have lots of dedicated Cassandra clusters for large use
>         cases, but we have a long tail (~100) of internal customers
>         who want to store < 200GB of data with < 5k qps and non-critical
>         data. It does not make sense to create a 3 node dedicated
>         cluster for each of these small use cases. So we have a shared
>         cluster into which we onboard these users.
>         But once in a while, one of the customers will run an ingest job
>         from HDFS which will pound the shared cluster and break our SLA
>         for the cluster for all the other customers. Currently, I don't
>         see any way to signal back pressure to the ingestion jobs or
>         throttle their requests. Another example is one customer doing a
>         large number of range queries, which has the same effect.
>         A simple way to avoid this is to throttle the read or write
>         requests based on some quota limits for each keyspace or user.
>         Please see replies inlined:
>         On Mon, Feb 20, 2017 at 11:46 PM, vincent gromakowski wrote:
>             Aren't you using the Mesos Cassandra framework to manage your
>             multiple clusters? (Seen in a presentation at Cassandra Summit)
>         Yes, we are using it and contribute heavily to it. I am aware of
>         the presentation at the Cassandra summit, as I was the one who
>         gave it :)
>         This has helped us automate the creation and management of these
>         clusters.
>             What's wrong with your current Mesos approach?
>         Hardware efficiency: Spinning up dedicated clusters for each use
>         case wastes a lot of hardware resources. One of the approaches
>         we have taken is spinning up multiple Cassandra nodes belonging
>         to different clusters on the same physical machine. However, we
>         still have overhead of managing these separate multi-tenant
>         clusters.
>             I also think it's better to split a large cluster into
>             smaller ones, unless you also manage the client layer that
>             queries Cassandra and can put some backpressure or rate
>             limiting in it.
>         We have an internal storage API layer that some of the clients
>         use, but there are many customers who use the vanilla DataStax
>         Java or Python driver. Implementing throttling in each of those
>         clients does not seem like a viable approach.
>             On 21 Feb 2017 at 2:46 AM, "Edward Capriolo" wrote:
>                 Older versions had a request scheduler API.
>         I am not aware of the history behind it. Can you please point me
>         to the JIRA tickets and/or why it was removed?
>                 On Monday, February 20, 2017, Ben Slater wrote:
>                     We’ve actually had several customers where we’ve
>                     done the opposite - split large clusters apart to
>                     separate use cases. We found that this allowed us
>                     to better align hardware with use case requirements
>                     (for example, using AWS c3.2xlarge for very hot data
>                     at low latency and m4.xlarge for more general-purpose
>                     data). We can also tune JVM settings, etc. to meet
>                     those use cases.
>         There have been several instances where we have moved customers
>         out of the shared cluster to their own dedicated clusters
>         because they outgrew our limitations. But I don't think it makes
>         sense to move all the small use cases into their separate clusters.
>                     On Mon, 20 Feb 2017 at 22:21 Oleksandr Shulgin wrote:
>                         On Sat, Feb 18, 2017 at 3:12 AM, Abhishek Verma wrote:
>                             Cassandra is being used on a large scale at
>                             Uber. We usually create dedicated clusters
>                             for each of our internal use cases, however
>                             that is difficult to scale and manage.
>                             We are investigating the approach of using a
>                             single shared cluster with 100s of nodes and
>                             handling 10s to 100s of different use cases
>                             for different products in the same cluster.
>                             We can define different keyspaces for each
>                             of them, but that does not help in case of
>                             noisy neighbors.
>                             Does anybody in the community have similar
>                             large shared clusters and/or face noisy
>                             neighbor issues?
>                         Hi,
>                         We've never tried this approach and given my
>                         limited experience I would find this a terrible
>                         idea from the perspective of maintenance
>                         (remember the old saying about basket and eggs?)
>         What if you have a limited number of baskets and several eggs
>         which are not critical if they break rarely?
>                         What potential benefits do you see?
>         The main benefit of sharing a single cluster among several small
>         use cases is increasing the hardware efficiency and decreasing
>         the management overhead of a large number of clusters.
>         Thanks everyone for your replies and questions.
>         -Abhishek.
>     I agree with these assertions. On one hand I think about a "managed
>     service" like, say, Amazon DynamoDB. They likely start with very,
>     very large footprints, i.e. they commission huge clusters of the
>     fastest SSD hardware. Next, every application/user has a quota.
>     They can always control the basic load because they control the quota.
>     Control on the hardware level makes sense, but then your unit of
>     management is "a cluster". Users do not have a unified API anymore;
>     they have switch statements: this data in cluster x, that data in
>     cluster y. You still end up in cases where degenerate usage patterns
>     affect others.
>     With Cassandra it would be nice if these controls were built into
>     the API. This could also help you build your own chargeback model
>     in the enterprise. Sure, as someone pointed out, rejecting reads
>     stinks for that user. But then again, someone has to decide who
>     pays for the hardware, and how.
>     For example, imagine a company with 19 business units all using the
>     same Cassandra cluster. One business unit might account for 90% of
>     the storage, but 1% of the requests. Another business unit might be
>     95% of the requests, but 1% of the data. How do you come up with a
>     billing model? For the customer with 95% of the requests, their
>     "cost" on the system is young generation GC and network.
>     DataStax Enterprise had/has a concept of "the analytic DC". The
>     concept is "real time goes here" and "analytics goes there". With the
>     right resource controls you could get much more fine-grained than
>     that. It will never be perfect; there will always be that random
>     abuser with the "aggregate allow filtering query", but there are ways
>     to move in a more managed direction.
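
On the billing question quoted above (19 business units, one with 90% of the storage and 1% of the requests, another with 95% of the requests and 1% of the data), one possible starting point is to split the cluster cost by a weighted blend of each unit's share per resource. The sketch below is only an illustration: the chargeback function, the 50/50 weighting between storage and requests, and the 100,000 total cost are all assumptions, and the remaining 17 units are lumped into one entry holding whatever shares are left over.

```python
def chargeback(total_cost, usage, weights):
    """Split total_cost across tenants by a weighted blend of usage shares.

    usage:   {tenant: {dimension: share}}, shares per dimension summing to 1.0
    weights: {dimension: fraction of total_cost attributed to that dimension}
    """
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1.0")
    return {
        tenant: total_cost * sum(weights[dim] * shares.get(dim, 0.0) for dim in weights)
        for tenant, shares in usage.items()
    }


# Shares taken from the example in the thread; the last row is the remainder.
usage = {
    "storage_heavy_unit": {"storage": 0.90, "requests": 0.01},
    "request_heavy_unit": {"storage": 0.01, "requests": 0.95},
    "remaining_17_units": {"storage": 0.09, "requests": 0.04},
}

# Assumed weighting: half the cost is driven by storage, half by requests.
bill = chargeback(100_000.0, usage, {"storage": 0.5, "requests": 0.5})
```

With these assumed weights the storage-heavy and request-heavy units end up with similar bills (about 45,500 and 48,000), which shows how much the outcome depends on the weighting chosen. Request cost could be refined further, for example by weighting reads, writes and range queries differently.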
