cassandra-commits mailing list archives

From "Jonathan Shook (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (CASSANDRA-9318) Bound the number of in-flight requests at the coordinator
Date Sun, 10 May 2015 00:27:01 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-9318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14536846#comment-14536846
] 

Jonathan Shook edited comment on CASSANDRA-9318 at 5/10/15 12:26 AM:
---------------------------------------------------------------------

I would venture that a solid load shedding system may improve the degenerate overloading case,
but it is not the preferred method for dealing with overloading for most users. The concept
of back-pressure is more squarely what people expect, for better or worse.

Here is what I think reasonable users want to see, with some variations:
1) The system performs stably, up to the workload that it is able to handle.
2a) Once it reaches that limit, it starts pushing back on how quickly it accepts new work.
This means that it simply blocks the submission of new requests at some useful bound that is
determined by the system (a minimal sketch of this follows the list). It does not yet have to
shed load. It does not yet have to throw exceptions. This is a very reasonable expectation for
most users, and it is what they expect; load shedding is a term of art which does not change
those expectations.
2b) Once it reaches that limit, it starts throwing an OverloadedException (OE) to the client.
It does not have to shed load yet. (Perhaps this exception or something like it can be thrown
_before_ load shedding occurs.) This is a very reasonable expectation for users who are savvy
enough to do active load management at the client level. The system may have to start writing
hints, but if you are writing hints merely because of load, that is not the best justification
for having the hints system kick in. To me this is inherently a convenient remedy for the wrong
problem, even if it works well. Yes, hints are there as a general mechanism, but they do not
solve the problem of needing to know when the system is being pushed beyond capacity and how
to handle that proactively. You could also say that hints sometimes actively hurt capacity
exactly when you need it most. They are expensive to process given the current implementation,
and will always be "load shifting" even at theoretical best. We still need them for node
availability concerns, but we should be careful not to use them as a crutch for general
capacity issues.
2c) Once it reaches that limit, it starts backlogging (perhaps silently, or perhaps with a
helpful signature of this in the responses, maybe a BackloggingException with some queue
estimate). This is a very reasonable expectation for users who are savvy enough to manage
their peak and valley workloads in a sensible way. Sometimes you actually want to tax the
ingest and flush side of the system for a bit before allowing it to switch modes and catch up
with compaction. The fact that C* can do this is an interesting capability, but those who
want back-pressure will not easily see it that way.
2d) If the system is being pushed beyond its capacity, then it may have to shed load. This
should only happen if the user has decided that they want to take responsibility for it and
has pushed the system beyond the reasonable limit without paying attention to the indications
in 2a, 2b, and 2c. In the current system, this decision is already made for them; they have
no choice.
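
To make 2a concrete, here is a minimal sketch of a coordinator-side bound that blocks new
submissions instead of shedding load or throwing. It assumes there is a single admission point
for new requests; the names (InflightLimiter, maxInflightRequests) are illustrative only, not
existing Cassandra code.

{code:java}
import java.util.concurrent.Semaphore;

// Illustrative only: a blocking bound on in-flight requests (scenario 2a).
public final class InflightLimiter
{
    private final Semaphore permits;

    public InflightLimiter(int maxInflightRequests)
    {
        this.permits = new Semaphore(maxInflightRequests);
    }

    // Blocks the submitter until the coordinator has room for one more request.
    // No load is shed and no exception is thrown; the caller simply waits.
    public void beginRequest() throws InterruptedException
    {
        permits.acquire();
    }

    // Must be called when the request completes, successfully or not.
    public void endRequest()
    {
        permits.release();
    }
}
{code}

The point is only that the bound is enforced by making submission wait, which is what most
users seem to mean by back-pressure.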

In a more optimistic world, users would get near-optimal performance for a well-tuned workload
with back-pressure active throughout the system, or something very much like it. We could call
it a different kind of scheduler, a different queue management method, or whatever. As long as
the user could prioritize stability at some bounded load over possible instability at an
over-saturating load, I think in most cases they would. Like I said, they really don't have
this choice right now. I know this is not trivial. We can't remove the need to make sane
judgments about sizing and configuration. We might, however, be able to make the system ramp
more predictably up to saturation, and behave more reasonably at that level.

Order of precedence, how to designate a mode of operation, and other such concerns aren't
really addressed here. I just provided the examples above as types of behaviors which are
nuanced yet perfectly valid for different types of system designs. The real point is that
there is no single QoS/capacity/back-pressure behavior which is going to be acceptable to all
users. Still, we need to ensure stability under saturating load where possible. I would like
to think that with CASSANDRA-8099 we can start discussing some of the client-facing
back-pressure ideas more earnestly. I do believe these ideas are all compatible, sitting on a
spectrum of behavior; they are not mutually exclusive from a design/implementation perspective.
They could even be specified per operation, with some traffic yielding to other traffic due to
client policies. For example, a lower-priority client could yield when it knows the cluster is
approaching saturation (responses could contain a % loading estimate), while a higher-priority
data stream could keep writing as long as the backlog queue level was below a certain amount
(perhaps a score which factors in the time delay to the oldest planned-but-uncompacted data).
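
As a sketch of that kind of client policy, assuming (hypothetically) that responses carried a
loading estimate and a backlog score, something like the following could decide when a given
stream should yield. Neither field exists in the current protocol; this is only to illustrate
the idea.

{code:java}
// Illustrative only: a per-stream policy driven by hypothetical feedback
// fields (loadEstimate, backlogScore) that a response might carry.
public final class PriorityAwareWritePolicy
{
    private final boolean lowPriority;
    private final double yieldAboveLoad;   // e.g. 0.85 == yield when the cluster reports ~85% loading
    private final double maxBacklogScore;  // bound a high-priority stream only by backlog depth

    public PriorityAwareWritePolicy(boolean lowPriority, double yieldAboveLoad, double maxBacklogScore)
    {
        this.lowPriority = lowPriority;
        this.yieldAboveLoad = yieldAboveLoad;
        this.maxBacklogScore = maxBacklogScore;
    }

    // Returns true if the client should hold off on its next write,
    // given the feedback piggybacked on the previous response.
    public boolean shouldYield(double loadEstimate, double backlogScore)
    {
        if (lowPriority)
            return loadEstimate >= yieldAboveLoad;
        return backlogScore >= maxBacklogScore;
    }
}
{code}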

We can come up with methods to improve the reliable and responsive capacity of the system
even with some internal load management. If the first cut ends up being sub-optimal, then we
can measure it against non-bounded workload tests and strive to close the gap. If it is
implemented in a way that can support multiple usage scenarios, as described above, then the
limit might be "unlimited", "bounded at level ___", or "bounded by inline resource
management", but in any case it would be controllable by the user, admin, or client. If we
could ultimately give the categories of users above the ability to enable the various modes,
then the 2a) scenario would already be perfectly desirable for many users, even if the
back-pressure logic only gave you 70% of the effective system capacity. Once testing shows
that performance with active back-pressure to the client is close enough to unbounded
workloads, it could be enabled by default.
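
Purely as an illustration of how those modes might be named and selected by an operator; none
of these names exist in Cassandra today:

{code:java}
// Illustrative only: the admission behaviors described above as a selectable mode.
public enum AdmissionMode
{
    UNLIMITED,    // today's behavior: accept everything, hint/shed under duress (2d)
    BLOCKING,     // 2a: block new submissions at a system-determined bound
    EXCEPTION,    // 2b: reject with an overload exception instead of accepting work
    BACKLOGGING   // 2c: accept and queue, signalling backlog depth back to clients
}
{code}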

Summary: We still need reasonable back-pressure support throughout the system and eventually
to the client. Features like this one, which can serve as stepping stones toward that, are
still needed. The most perfect load shedding and hinting systems will still not be a
sufficient replacement for back-pressure and capacity management.

I know this comment contains lots of tangents to the original ticket, and it doesn't speak
directly to the implementation details or ideas. If we should take this comment and move it
to another ticket, let me know. I thought the emphasis on back-pressure mechanisms was
appropriate, but it did get a bit wordy.



was (Author: jshook):
I would venture that a solid load shedding system may improve the degenerate overloading case,
but it is not the preferred method for dealing with overloading for most users. The concept
of back-pressure is more squarely what people expect, for better or worse.

Here is what I think reasonable users want to see, with some variations:
1) The system performs with stability, up to the workload that it is able to handle with stability.
2a) Once it reaches that limit, it starts pushing back in terms of how quickly it accepts
new work. This means that it simply blocks the operations or submissions of new requests with
some useful bound that is determined by the system. It does not yet have to shed load. It
does not yet have to give exceptions. This is a very reasonable expectation for most users.
This is what they expect. Load shedding is a term of art which does not change the users expectations.
2b) Once it reaches that limit, it starts throwing OE to the client. It does not have to shed
load yet. This is a very reasonable expectation for users who are savvy enough to do active
load management at the client level. It may have to start writing hints, but if you are writing
hints because of load, this might not be the best justification for having the hints system
kick in. To me this is inherently a convenient remedy for the wrong problem, even if it works
well. Yes, hints are there as a general mechanism, but it does not relieve us of the problem
of needing to know when the system is at capacity and how to handle it proactively. You could
also say that hints actively hurt capacity when you need them most sometimes. They are expensive
to process given the current implementation, and will always be "load shifting" even at theoretical
best. Still we need them for node availability concerns, although we should be careful to
use them as a crutch for general capacity issues.
2c) Once it reaches that limit, it starts backlogging (without a helpful signature of such
in the responses, maybe BackloggingException with some queue estimate). This is a very reasonable
expectation for users who are savvy enough to manage their peak and valley workloads in a
sensible way. Sometimes you actually want to tax the ingest and flush side of the system for
a bit before allowing it to switch modes and catch up with compaction. The fact that C* can
do this is an interesting capability, but those who want backpressure will not easily see
it that way.
2d) If the system is being pushed beyond its capacity, then it may have to shed load. This
should only happen if the users has decided that they want to be responsible for such and
have pushed the system beyond the reasonable limit without paying attention to the indications
in 2a, 2b, and 2c.

Order of precedence, designated mode of operation, or any other concerns aren't really addressed
here. I just provided the examples above as types of behaviors which are nuanced yet perfectly
valid for different types of system designs. The real point here is that there is not a single
overall QoS/capacity/back-pressure behavior which is going to be acceptable to all users.
Still, we need to ensure stability under saturating load where possible. I would like to think
that with CASSANDRA-8099 that we can start discussing some of the client-facing back-pressure
ideas more earnestly.

We can come up with methods to improve the reliable and responsive capacity of the system
even with some internal load management. If the first cut ends up being sub-optimal, then
we can measure it against non-bounded workload tests and strive to close the gap. If it is
implemented in a way that can support multiple usage scenarios, as described above, then such
a limitation might be "unlimited", "bounded at level ___", or "bounded by inline resource
management".. But in any case would be controllable by some users/admin, client.. If we could
ultimately give the categories of users above the ability to enable the various modes, then
the 2a) scenario would be perfectly desirable for many users already even if the back-pressure
logic only gave you 70% of the effective system capacity. Once testing shows that performance
with active back-pressure to the client is close enough to the unbounded workloads, it could
be enabled by default.

Summary: We still need reasonable back-pressure support throughout the system and eventually
to the client. Features like this that can be a stepping stone towards such are still needed.
The most perfect load shedding and hinting systems will still not be a sufficient replacement
for back-pressure and capacity management.

> Bound the number of in-flight requests at the coordinator
> ---------------------------------------------------------
>
>                 Key: CASSANDRA-9318
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9318
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Ariel Weisberg
>            Assignee: Ariel Weisberg
>             Fix For: 2.1.x
>
>
> It's possible to somewhat bound the amount of load accepted into the cluster by bounding
> the number of in-flight requests and request bytes.
> An implementation might do something like track the number of outstanding bytes and requests
> and if it reaches a high watermark disable read on client connections until it goes back
> below some low watermark.
> Need to make sure that disabling read on the client connection won't introduce other issues.
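
A minimal sketch of the watermark idea described in the ticket, assuming the coordinator can
see each request's serialized size and the set of client channels; the thresholds and class
name are illustrative, not a proposed API:

{code:java}
import io.netty.channel.Channel;
import java.util.Set;
import java.util.concurrent.atomic.AtomicLong;

// Illustrative only: track outstanding request bytes and toggle reads on
// client connections around a high/low watermark pair.
public final class InflightByteTracker
{
    private final AtomicLong outstandingBytes = new AtomicLong();
    private final long highWatermark;           // stop reading above this
    private final long lowWatermark;            // resume reading below this
    private final Set<Channel> clientChannels;

    public InflightByteTracker(long highWatermark, long lowWatermark, Set<Channel> clientChannels)
    {
        this.highWatermark = highWatermark;
        this.lowWatermark = lowWatermark;
        this.clientChannels = clientChannels;
    }

    public void onRequestReceived(long bytes)
    {
        if (outstandingBytes.addAndGet(bytes) > highWatermark)
            setAutoRead(false);                 // stop pulling new work off client sockets
    }

    public void onRequestCompleted(long bytes)
    {
        if (outstandingBytes.addAndGet(-bytes) < lowWatermark)
            setAutoRead(true);                  // resume once we drain below the low watermark
    }

    private void setAutoRead(boolean autoRead)
    {
        for (Channel ch : clientChannels)
            ch.config().setAutoRead(autoRead);
    }
}
{code}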



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
