kafka-dev mailing list archives

From "Jay Kreps (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (KAFKA-656) Add Quotas to Kafka
Date Tue, 26 Feb 2013 19:58:13 GMT

    [ https://issues.apache.org/jira/browse/KAFKA-656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13587466#comment-13587466 ]

Jay Kreps commented on KAFKA-656:
---------------------------------

Yeah, this is a bit of a dilemma. Doing it cluster-wide with low latency is pretty hard. Arguably
the thing you want to protect is really the per-server load. That is to say the limit is that
we can't have one machine taking more than X messages/sec--though X might be fine if spread
over enough servers. However, since the number of servers is in a sense an implementation
detail, it is harder to express to the user what the speed limit is (after all, if we
add more servers, then from their point of view the speed limit just went up if it is a per-server number).
Maybe the sane way to do it is in terms of per-partition load rather than servers or topic
overall. Thoughts?
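
To make the trade-off above concrete, here is a rough arithmetic sketch. The function names and all numbers are hypothetical and only for illustration, and the per-server case assumes a topic's load is spread evenly over every server. It shows that a per-server limit changes what the user sees whenever servers are added, while a per-partition limit depends only on the partition count.

```python
def topic_limit_from_per_server(per_server_limit, num_servers):
    """Aggregate limit a user sees under a per-server quota.

    Assumption: the topic's traffic is spread evenly over all servers.
    """
    return per_server_limit * num_servers


def topic_limit_from_per_partition(per_partition_limit, num_partitions):
    """Aggregate limit a user sees under a per-partition quota."""
    return per_partition_limit * num_partitions


# Hypothetical numbers, in messages/sec.
PER_SERVER_LIMIT = 10_000
PER_PARTITION_LIMIT = 2_500
NUM_PARTITIONS = 8

for servers in (2, 4, 8):
    per_server_view = topic_limit_from_per_server(PER_SERVER_LIMIT, servers)
    per_partition_view = topic_limit_from_per_partition(
        PER_PARTITION_LIMIT, NUM_PARTITIONS
    )
    # The first figure moves with cluster size; the second stays fixed.
    print(servers, per_server_view, per_partition_view)
```

With a per-partition number, the limit a user is told only changes when the partition count of their topic changes, which is something they can already see.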
                
> Add Quotas to Kafka
> -------------------
>
>                 Key: KAFKA-656
>                 URL: https://issues.apache.org/jira/browse/KAFKA-656
>             Project: Kafka
>          Issue Type: New Feature
>          Components: core
>    Affects Versions: 0.8.1
>            Reporter: Jay Kreps
>              Labels: project
>
> It would be nice to implement a quota system in Kafka to improve our support for highly
multi-tenant usage. The goal of this system would be to prevent one naughty user from accidentally
overloading the whole cluster.
> There are several quantities we would want to track:
> 1. Requests per second
> 2. Bytes written per second
> 3. Bytes read per second
> There are two reasonable groupings we would want to aggregate and enforce these thresholds
at:
> 1. Topic level
> 2. Client level (e.g. by client id from the request)
> When a request hits one of these limits we will simply reject it with a QUOTA_EXCEEDED
exception.
> To avoid suddenly breaking things without warning, we should ideally support two thresholds:
a soft threshold at which we produce some kind of warning and a hard threshold at which we
give the error. The soft threshold could just be defined as 80% (or whatever) of the hard
threshold.
> There are nuances to getting this right. If you measure second-by-second a single burst
may exceed the threshold, so we need a sustained measurement over a period of time.
> Likewise when do we stop giving this error? To make this work right we likely need to
charge against the quota for request *attempts* not just successful requests. Otherwise a
client that is overloading the server will just flap on and off--i.e. we would disable them
for a period of time but when we re-enabled them they would likely still be abusing us.
> It would be good to have a wiki design on how this would all work as a starting point for
discussion.
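
As one possible starting point for that discussion, the description above could be sketched roughly as follows. This is not an existing Kafka API. The class name, the 10-second window, and the method signature are assumptions made for illustration; only the QUOTA_EXCEEDED name and the 80% soft threshold come from the description. The sketch averages usage over a sliding window so a single burst does not trip the limit, and it charges every attempt, including rejected ones, so an abusive client does not flap on and off.

```python
from collections import deque

OK, WARN, QUOTA_EXCEEDED = "OK", "WARN", "QUOTA_EXCEEDED"


class SlidingWindowQuota:
    """Hypothetical per-key quota tracker (one instance per topic or client id)."""

    def __init__(self, hard_limit_per_sec, window_secs=10.0, soft_fraction=0.8):
        self.hard_limit = hard_limit_per_sec
        # Soft threshold defined as a fraction of the hard threshold.
        self.soft_limit = soft_fraction * hard_limit_per_sec
        self.window_secs = window_secs
        self.events = deque()  # (timestamp, amount) pairs, oldest first
        self.total = 0.0       # sum of amounts currently inside the window

    def _expire(self, now):
        # Drop charges that have aged out of the measurement window.
        while self.events and self.events[0][0] <= now - self.window_secs:
            _, amount = self.events.popleft()
            self.total -= amount

    def record(self, now, amount=1.0):
        """Charge one attempt (a request, or a byte count) and return its status."""
        self._expire(now)
        # Charge the attempt before deciding, so rejected requests still count.
        self.events.append((now, amount))
        self.total += amount
        rate = self.total / self.window_secs  # sustained rate over the window
        if rate > self.hard_limit:
            return QUOTA_EXCEEDED
        if rate > self.soft_limit:
            return WARN
        return OK


quota = SlidingWindowQuota(hard_limit_per_sec=10)
# A burst of 50 requests in one instant is only 5/sec averaged over 10 seconds.
print({quota.record(0.0) for _ in range(50)})
```

The same tracker could be instantiated once per tracked quantity (requests, bytes written, bytes read), passing the byte count as `amount`. Keeping every event in a deque is the simplest thing that works here; a real broker would more likely use fixed-size time buckets to bound memory.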

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
