From: Eric Stevens
Date: Wed, 22 Feb 2017 23:39:19 +0000
Subject: Re: Pluggable throttling of read and write queries
To: user@cassandra.apache.org

> We've actually had several customers where we've done the opposite - split large clusters apart to separate use cases

We do something similar, but for a single application: we functionally shard data to different clusters from within that one application. We can use different server classes for different types of workloads, we can grow and size clusters accordingly, and we also do things like time sharding so that at-rest data can go to cheaper storage options.

I agree with the general sentiment here that (at least as it stands today) a monolithic cluster for many applications does not compete with per-application clusters unless cost is no issue. At our scale, the terabytes of C* data we take in per day mean that even very small cost savings really add up. And even where cost is no issue, the additional isolation and workload tailoring are still highly valuable.

On Wed, Feb 22, 2017 at 12:01 PM Edward Capriolo wrote:

> On Wed, Feb 22, 2017 at 1:20 PM, Abhishek Verma wrote:
>
> We have lots of dedicated Cassandra clusters for large use cases, but we have a long tail (~100) of internal customers who want to store < 200 GB of non-critical data at < 5k QPS. It does not make sense to create a 3-node dedicated cluster for each of these small use cases, so we have a shared cluster into which we onboard these users.
>
> But once in a while, one of the customers will run an ingest job from HDFS which will pound the shared cluster and break the cluster's SLA for all the other customers. Currently, I don't see any way to signal backpressure to the ingestion jobs or to throttle their requests. Another example is one customer doing a large number of range queries, which has the same effect.
>
> A simple way to avoid this is to throttle the read or write requests based on some quota limits for each keyspace or user.
>
> Please see replies inlined:
>
> On Mon, Feb 20, 2017 at 11:46 PM, vincent gromakowski <vincent.gromakowski@gmail.com> wrote:
>
> Aren't you using the Mesos Cassandra framework to manage your multiple clusters? (I saw a presentation at the Cassandra summit.)
>
> Yes, we are using https://github.com/mesosphere/dcos-cassandra-service and contribute heavily to it. I am aware of the presentation (https://www.youtube.com/watch?v=4Ap-1VT2ChU) at the Cassandra summit, as I was the one who gave it :) This has helped us automate the creation and management of these clusters.
>
> What's wrong with your current Mesos approach?
>
> Hardware efficiency: spinning up dedicated clusters for each use case wastes a lot of hardware resources. One of the approaches we have taken is spinning up multiple Cassandra nodes belonging to different clusters on the same physical machine. However, we still have the overhead of managing these separate multi-tenant clusters.
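To make the ingest problem above concrete: in the absence of server-side quotas, a bulk loader can at least bound its own concurrency so it cannot flood a shared cluster. A minimal sketch, assuming the DataStax Java driver 3.x; the class name and the 256-permit cap are made up for illustration:

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.ResultSetFuture;
    import com.datastax.driver.core.Session;
    import com.datastax.driver.core.Statement;

    import java.util.concurrent.Semaphore;

    // Caps the number of concurrent async writes issued by a bulk loader so a
    // single ingest job cannot flood a shared cluster. The 256-permit cap is
    // only an illustration; a real job would tune it per cluster.
    public class BoundedIngestWriter {
        private final Session session;
        private final Semaphore inFlight;

        public BoundedIngestWriter(Session session, int maxInFlight) {
            this.session = session;
            this.inFlight = new Semaphore(maxInFlight);
        }

        public void write(Statement statement) throws InterruptedException {
            inFlight.acquire();  // blocks the producer once the cap is reached
            ResultSetFuture future = session.executeAsync(statement);
            // Release the permit when the write completes (success or failure);
            // a real loader would also inspect the future for errors.
            future.addListener(inFlight::release, Runnable::run);
        }

        public static void main(String[] args) throws Exception {
            try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
                 Session session = cluster.connect()) {
                BoundedIngestWriter writer = new BoundedIngestWriter(session, 256);
                // writer.write(...) would be called from the ingest loop here.
            }
        }
    }

Of course this only helps for writers that opt in, which is exactly the limitation raised in this thread: anything enforced purely client-side has to be adopted by every client.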
> I am also thinking it's better to split a large cluster into smaller ones, except if you also manage the client layer that queries Cassandra, where you can put some backpressure or rate limiting.
>
> We have an internal storage API layer that some of the clients use, but there are many customers who use the vanilla DataStax Java or Python driver. Implementing throttling in each of those clients does not seem like a viable approach.
>
> On 21 Feb 2017 at 2:46 AM, "Edward Capriolo" <edlinuxguru@gmail.com> wrote:
>
> Older versions had a request scheduler API.
>
> I am not aware of the history behind it. Can you please point me to the JIRA tickets and/or the reason it was removed?
>
> On Monday, February 20, 2017, Ben Slater wrote:
>
> We've actually had several customers where we've done the opposite - split large clusters apart to separate use cases. We found that this allowed us to better align hardware with use case requirements (for example, using AWS c3.2xlarge for very hot data at low latency and m4.xlarge for more general-purpose data); we can also tune JVM settings, etc., to meet those use cases.
>
> There have been several instances where we have moved customers out of the shared cluster to their own dedicated clusters because they outgrew our limitations. But I don't think it makes sense to move all the small use cases into their own separate clusters.
>
> On Mon, 20 Feb 2017 at 22:21 Oleksandr Shulgin <oleksandr.shulgin@zalando.de> wrote:
>
> On Sat, Feb 18, 2017 at 3:12 AM, Abhishek Verma wrote:
>
> Cassandra is being used on a large scale at Uber. We usually create dedicated clusters for each of our internal use cases; however, that is difficult to scale and manage.
>
> We are investigating the approach of using a single shared cluster with 100s of nodes and handling 10s to 100s of different use cases for different products in the same cluster. We can define different keyspaces for each of them, but that does not help in the case of noisy neighbors.
>
> Does anybody in the community have similar large shared clusters and/or face noisy neighbor issues?
>
> Hi,
>
> We've never tried this approach, and given my limited experience I would find it a terrible idea from a maintenance perspective (remember the old saying about baskets and eggs?).
>
> What if you have a limited number of baskets and several eggs which are not critical if they break rarely?
>
> What potential benefits do you see?
>
> The main benefit of sharing a single cluster among several small use cases is increasing hardware efficiency and decreasing the management overhead of a large number of clusters.
>
> Thanks everyone for your replies and questions.
>
> -Abhishek.
>
> I agree with these assertions. On one hand I think about a "managed service" like, say, Amazon DynamoDB. They likely start with very, very large footprints, i.e. they commission huge clusters of the fastest SSD hardware. Next, every application/user has a quota. They can always control the basic load because they control the quota.
>
> Control at the hardware level makes sense, but then your unit of management is "a cluster". Users no longer have a unified API; they have switch statements: this data goes to cluster x, that data to cluster y. You still end up with cases where degenerate usage patterns affect others.
>
> With Cassandra it would be nice if these controls were built into the API. This could also help you build your own chargeback model in the enterprise.
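Until such controls exist server-side, the closest approximation is the internal storage API layer mentioned above: a thin wrapper that enforces per-keyspace quotas before handing statements to the driver. A rough sketch, assuming the DataStax Java driver 3.x and Guava's RateLimiter; the class name, the quota map, and the 1000 ops/sec default are hypothetical:

    import com.datastax.driver.core.ResultSet;
    import com.datastax.driver.core.Session;
    import com.datastax.driver.core.Statement;
    import com.google.common.util.concurrent.RateLimiter;

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    // Per-keyspace throttling in a storage-API layer that wraps the driver's
    // Session. Each keyspace gets its own token bucket; callers over quota are
    // slowed down rather than rejected.
    public class ThrottledSession {
        private static final double DEFAULT_OPS_PER_SEC = 1000.0;  // illustrative default

        private final Session session;
        private final Map<String, Double> quotas;                   // keyspace -> ops/sec
        private final Map<String, RateLimiter> limiters = new ConcurrentHashMap<>();

        public ThrottledSession(Session session, Map<String, Double> quotas) {
            this.session = session;
            this.quotas = quotas;
        }

        public ResultSet execute(Statement statement) {
            String keyspace = statement.getKeyspace();               // may be null for simple statements
            String key = keyspace == null ? "<unknown>" : keyspace;
            double opsPerSec = quotas.getOrDefault(key, DEFAULT_OPS_PER_SEC);
            limiters.computeIfAbsent(key, k -> RateLimiter.create(opsPerSec))
                    .acquire();                                      // block until a permit is available
            return session.execute(statement);
        }
    }

It only protects against clients that actually go through the wrapper, and it slows offenders down rather than rejecting them, but it needs no changes to Cassandra itself.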
> Sure, as someone pointed out, rejecting reads stinks for that user. But then again, someone has to decide who pays for the hardware and how.
>
> For example, imagine a company with 19 business units all using the same Cassandra cluster. One business unit might account for 90% of the storage but 1% of the requests. Another business unit might be 95% of the requests but 1% of the data. How do you come up with a billing model? For the customer with 95% of the requests, their "cost" on the system is young-generation GC and network.
>
> DataStax Enterprise had/has the concept of "the analytics DC". The concept is "real time goes here" and "analytics goes there"; with the right resource controls you could get much more fine-grained than that. It will never be perfect - there will always be that random abuser with the "aggregate allow filtering query" - but there are ways to move in a more managed direction.
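The "real time goes here, analytics goes there" separation is largely a matter of data-center-aware routing on the client. A small sketch, again assuming the DataStax Java driver 3.x; the DC name "analytics", the contact point, and the ks.events table are placeholders:

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.Session;
    import com.datastax.driver.core.policies.DCAwareRoundRobinPolicy;

    // Pins an analytics client to its own data center so heavy scans stay off
    // the real-time DC. The DC name, contact point, and table are placeholders.
    public class AnalyticsClient {
        public static void main(String[] args) {
            try (Cluster cluster = Cluster.builder()
                    .addContactPoint("10.0.0.10")
                    .withLoadBalancingPolicy(DCAwareRoundRobinPolicy.builder()
                            .withLocalDc("analytics")
                            .build())
                    .build();
                 Session session = cluster.connect()) {
                // Queries issued through this session are routed to nodes in the
                // "analytics" DC by default, keeping batch load away from the
                // real-time data center.
                session.execute("SELECT count(*) FROM ks.events");
            }
        }
    }

Combined with per-DC replication settings and LOCAL_* consistency levels, this keeps batch scans off the real-time data center even without per-user quotas.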