cassandra-commits mailing list archives

From "Ariel Weisberg (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-9318) Bound the number of in-flight requests at the coordinator
Date Thu, 14 Jul 2016 13:56:20 GMT


Ariel Weisberg commented on CASSANDRA-9318:

There really isn't much memory to play with when deciding when to backpressure. There are
128 request threads, and once those are all consumed by a slow node, which doesn't take long
in a small cluster, things stall completely. If things were async you might be able to commit
enough memory that requests time out before you need to stall. In other words, you can shed
via timeouts to nodes and no additional mechanisms are needed.
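To make the first point concrete, here is a minimal, hypothetical model (not Cassandra code): with a fixed pool of synchronous request threads there is no fallback once every thread is blocked on a slow replica, so a request bound for a perfectly healthy node stalls too.

```java
import java.util.concurrent.Semaphore;

// Hypothetical model of the coordinator's fixed pool of request threads.
// Each in-flight request holds one permit; a request stuck on a slow node
// keeps its permit until the request times out.
public class RequestSlots {
    private final Semaphore slots;

    public RequestSlots(int numThreads) {
        this.slots = new Semaphore(numThreads);
    }

    // Returns true if a new request can start immediately; with synchronous
    // threads there is nothing to fall back on, so false means a stall.
    public boolean tryStart() {
        return slots.tryAcquire();
    }

    public void finish() {
        slots.release();
    }

    public static void main(String[] args) {
        RequestSlots pool = new RequestSlots(128);
        // A slow node ties up all 128 threads...
        for (int i = 0; i < 128; i++) {
            pool.tryStart();
        }
        // ...and now even a request for a healthy node cannot start.
        System.out.println(pool.tryStart()); // prints "false"
    }
}
```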

Not reading from clients doesn't address the issue. You have still created a situation in
which nodes that are performing well can't make progress: because of one slow node, they can
no longer read requests from clients. Not reading from clients is the current implementation.

Hinting as it works now doesn't address the issue, because the slow node may never actually
catch up or become faster. Waiting for every request that is going to time out to time out
and be hinted is going to restrict the coordinator's ability to coordinate. Hinting also doesn't
work because there are only 128 concurrent requests that can be in the process of being hinted
(see paragraph #1).

If the coordinator wants to continue to make progress it has to read requests from clients
and then quickly know if it should shed them. We could shed them silently, in which case the
upstream client is going to time out and exhaust its memory or thread pool, and we have
silently and unfixably moved the problem upstream. I suppose clients can try to implement
their own health metrics to duplicate the work we are doing at the coordinator, but that still
can't force the coordinator to shed so the client can replace the requests that won't succeed
with ones that will.
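A minimal sketch of the "read, then quickly decide" idea (hypothetical names, not the actual patch): admission is a constant-time, non-blocking check against an in-flight byte counter, so the thread that read the request off the wire can immediately either accept it or reject it with an explicit signal instead of leaving the client to time out.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical admission check: O(1) and never blocks the thread that
// read the request off the wire.
public class AdmissionControl {
    private final long maxInFlightBytes;
    private final AtomicLong inFlightBytes = new AtomicLong();

    public AdmissionControl(long maxInFlightBytes) {
        this.maxInFlightBytes = maxInFlightBytes;
    }

    // Accept the request and account for it, or reject immediately so the
    // client can be told rather than left to time out.
    public boolean tryAdmit(long requestBytes) {
        long previous = inFlightBytes.getAndAdd(requestBytes);
        if (previous + requestBytes > maxInFlightBytes) {
            inFlightBytes.addAndGet(-requestBytes); // undo the reservation
            return false; // shed: signal overload to the client
        }
        return true;
    }

    public void complete(long requestBytes) {
        inFlightBytes.addAndGet(-requestBytes);
    }
}
```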

Or we can signal that we aren't going to do that request at this time and the client can engage
whatever mitigation strategy it wants to implement. There is a whole separate discussion about
what the state of the art needs to be in client drivers to do something useful with this information
and how to expose the mechanism and policy choices to applications.
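What a client does with such a signal is exactly that open policy question; one common mitigation strategy, sketched here with hypothetical names, is capped exponential backoff before retrying a request that the coordinator refused. (The native protocol's existing Overloaded error is one plausible way to carry the signal itself.)

```java
// Hypothetical client-side policy: on an explicit "overloaded" response,
// back off exponentially before retrying, capped at some maximum delay.
public class BackoffPolicy {
    private final long baseMillis;
    private final long maxMillis;

    public BackoffPolicy(long baseMillis, long maxMillis) {
        this.baseMillis = baseMillis;
        this.maxMillis = maxMillis;
    }

    // Delay before retry number `attempt` (0-based): base * 2^attempt,
    // capped at maxMillis so repeated overload doesn't overflow.
    public long delayMillis(int attempt) {
        long delay = baseMillis << Math.min(attempt, 30);
        return Math.min(delay, maxMillis);
    }
}
```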

Rate limiting isn't really useful. You just end up with all the request threads stuck in the
rate limiter, and coordinators continue to not make progress. Rate limiting doesn't solve a
load issue at the remote end because, as I've demonstrated, the remote end can buffer up enough
requests until shedding kicks in due to the timeout, which reduces memory utilization to
something the heap can handle.

If things were async, what would rate limiting look like? Would it be disabling read for clients?
How is the coordinator going to make progress then if it can't coordinate requests for healthy
nodes?
> Bound the number of in-flight requests at the coordinator
> ---------------------------------------------------------
>                 Key: CASSANDRA-9318
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Local Write-Read Paths, Streaming and Messaging
>            Reporter: Ariel Weisberg
>            Assignee: Sergio Bossa
>         Attachments: 9318-3.0-nits-trailing-spaces.patch, backpressure.png, limit.btm,
> It's possible to somewhat bound the amount of load accepted into the cluster by bounding
> the number of in-flight requests and request bytes.
> An implementation might do something like track the number of outstanding bytes and requests
> and if it reaches a high watermark disable read on client connections until it goes back below
> some low watermark.
> Need to make sure that disabling read on the client connection won't introduce other
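The high/low watermark scheme from the issue description above can be sketched as a simple hysteresis on outstanding bytes (a hypothetical class, not the actual implementation); in practice, the enable/disable transitions would toggle auto-read on the client channels. Using two watermarks rather than one keeps the state from flapping at the boundary.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical watermark tracker: reads are disabled once outstanding
// bytes cross the high watermark and re-enabled only after they fall
// below the low watermark (hysteresis).
public class WatermarkTracker {
    private final long low;
    private final long high;
    private final AtomicLong outstanding = new AtomicLong();
    private volatile boolean readsEnabled = true;

    public WatermarkTracker(long lowWatermark, long highWatermark) {
        this.low = lowWatermark;
        this.high = highWatermark;
    }

    public synchronized void onRequestStart(long bytes) {
        if (outstanding.addAndGet(bytes) >= high) {
            readsEnabled = false; // e.g. stop reading from client sockets
        }
    }

    public synchronized void onRequestComplete(long bytes) {
        if (outstanding.addAndGet(-bytes) <= low) {
            readsEnabled = true;  // e.g. resume reading from client sockets
        }
    }

    public boolean readsEnabled() {
        return readsEnabled;
    }
}
```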

This message was sent by Atlassian JIRA
