cassandra-commits mailing list archives

From "Jonathan Ellis (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (CASSANDRA-9318) Bound the number of in-flight requests at the coordinator
Date Sun, 28 Jun 2015 12:09:05 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-9318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604649#comment-14604649 ]

Jonathan Ellis edited comment on CASSANDRA-9318 at 6/28/15 12:08 PM:
---------------------------------------------------------------------

Here's where I've ended up:

# Continuing to accept writes faster than a coordinator can deliver them to replicas is bad.
Even perfect load shedding is worse from a client perspective than throttling, since if we
load shed and time out, the client has to guess the "right" rate at which to retry.
# For the same reason, accepting a write but then refusing it with UnavailableException is
worse than waiting to accept the write until we have capacity for it.
# It's more important to throttle writes because, while we can get in trouble with large reads
too (a small request turns into a big reply), in practice reads are naturally throttled: a
client has to wait for the read result before it can act on it.  With writes, on the other
hand, a new user's first inclination is to see how fast s/he can bulk load stuff.

In practice, I see load shedding and throttling as complementary.  Replicas can continue to
rely on load shedding.  Perhaps we can attempt distributed back pressure later (if every replica
is overloaded, we should again throttle clients), but for now let's narrow our scope to throttling
clients to what the coordinator has the capacity to send out.

*I propose we define a limit on the amount of memory MessagingService can consume and pause
reading additional requests whenever that limit is hit.*  Note that:

# If MS's load is distributed evenly across all destinations, then this is trivially the right
thing to do.
# If MS's load is caused by a single replica falling over or being unable to keep up, this is
still the right thing to do, because the alternative is worse.  MS will load shed timed-out
requests, but if clients are sending more requests to a single replica than we can shed (if
rate * timeout > capacity), then we still need to throttle or we will exhaust the heap and
fall over.  (A quick illustration follows this list.)
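
To put purely illustrative numbers on that second point (made up, not measured): if clients
push 50,000 writes/s of roughly 1 KB each toward a replica that has stopped responding, and
the write timeout is the default 2s, then on the order of 50,000 * 2 * 1 KB = ~100 MB of
requests accumulates on the coordinator's heap before shedding gets a chance to drop anything;
larger mutations or a longer timeout push that well past what the heap can absorb.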


(The hint-based UnavailableException tries to help with scenario 2, and I will open a ticket
to test how well that actually works.  But the hint threshold cannot help with scenario 1
at all and that is the hole this ticket needs to plug.)
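
To make the shape of the proposal concrete, here's a rough sketch of what that accounting
could look like (class and method names are hypothetical and purely illustrative, not a patch):
charge each outgoing request's serialized size against a global budget in MessagingService,
credit it back when the response arrives or the request is shed, and pause reading new client
requests while the budget is exhausted.

{code:java}
// Hypothetical sketch only -- not actual Cassandra code.
import java.util.concurrent.atomic.AtomicLong;

public class InflightRequestPayloadTracker
{
    private final long limitBytes;   // e.g. some fraction of the heap
    private final AtomicLong inflightBytes = new AtomicLong(0);

    public InflightRequestPayloadTracker(long limitBytes)
    {
        this.limitBytes = limitBytes;
    }

    /**
     * Charge an outgoing request against the budget.  Returns false when the budget is
     * exhausted, in which case the caller should pause reading additional client requests.
     */
    public boolean tryAcquire(long serializedSizeBytes)
    {
        if (inflightBytes.addAndGet(serializedSizeBytes) > limitBytes)
        {
            inflightBytes.addAndGet(-serializedSizeBytes);
            return false;
        }
        return true;
    }

    /** Credit the budget back when the request completes (response, timeout, or shed). */
    public void release(long serializedSizeBytes)
    {
        inflightBytes.addAndGet(-serializedSizeBytes);
    }
}
{code}

The fiddly part is making sure release() runs on every path (response, timeout, shed); otherwise
the budget leaks and we throttle forever.  The sketch only shows where the counter would live.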



> Bound the number of in-flight requests at the coordinator
> ---------------------------------------------------------
>
>                 Key: CASSANDRA-9318
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9318
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Ariel Weisberg
>            Assignee: Ariel Weisberg
>             Fix For: 2.2.x
>
>
> It's possible to somewhat bound the amount of load accepted into the cluster by bounding
the number of in-flight requests and request bytes.
> An implementation might do something like track the number of outstanding bytes and requests
and, if it reaches a high watermark, disable read on client connections until it drops back
below some low watermark.
> Need to make sure that disabling read on the client connection won't introduce other
issues.
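
For illustration, the high/low watermark toggle on a client connection could look something
like the following with Netty's auto-read flag (hypothetical names, not the actual
implementation; Cassandra's native protocol server runs on Netty, and setAutoRead(false) stops
the event loop from reading further data off that socket):

{code:java}
// Hypothetical illustration only: a shared in-flight byte counter, with each client
// connection's auto-read flag toggled when the counter crosses the watermarks.
import io.netty.channel.Channel;

import java.util.concurrent.atomic.AtomicLong;

public class ClientReadThrottle
{
    private final long highWatermarkBytes;
    private final long lowWatermarkBytes;
    private final AtomicLong outstandingBytes = new AtomicLong(0);

    public ClientReadThrottle(long highWatermarkBytes, long lowWatermarkBytes)
    {
        this.highWatermarkBytes = highWatermarkBytes;
        this.lowWatermarkBytes = lowWatermarkBytes;
    }

    /** Called when a request has been read off a client connection. */
    public void onRequest(Channel channel, long requestBytes)
    {
        if (outstandingBytes.addAndGet(requestBytes) >= highWatermarkBytes)
            channel.config().setAutoRead(false);   // stop reading more requests from this client
    }

    /** Called once the coordinator is done with the request (responded, timed out, or shed). */
    public void onComplete(Channel channel, long requestBytes)
    {
        if (outstandingBytes.addAndGet(-requestBytes) <= lowWatermarkBytes)
            channel.config().setAutoRead(true);    // resume reading
    }
}
{code}

With auto-read off, the connection's receive buffer fills and TCP flow control eventually
blocks the client's sends, so a well-behaved driver slows down instead of erroring out -- that
is part of what the "won't introduce other issues" question would need to verify.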



