cassandra-commits mailing list archives

From "Benedict (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-9318) Bound the number of in-flight requests at the coordinator
Date Fri, 08 May 2015 10:15:03 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-9318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14534260#comment-14534260 ]

Benedict commented on CASSANDRA-9318:
-------------------------------------

bq. In my mind in-flight means that until the response is sent back to the client the request counts against the in-flight limit.
bq. We already keep the original request around until we either get acks from all replicas, or they time out (and we write a hint).

Perhaps we need to clarify exactly what this ticket is proposing. I understood it to mean what Ariel commented here, whereas it sounds like Jonathan is suggesting we simply prune our ExpiringMap based on bytes tracked as well as time?

The requests tracked by the ExpiringMap are already "in-flight" and cannot be cancelled, so their effect on other nodes cannot be rescinded, and imposing a limit there does not stop us from issuing more requests to the nodes in the cluster that are failing to keep up and respond to us. It _might_ be sensible, if we introduce more aggressive shedding at the processing nodes, to also shed the response handlers more aggressively locally, but I'm not convinced it would have a significant impact on cluster health by itself; cluster instability spreads out from problematic nodes, and this scheme still permits us to inundate those nodes with requests they cannot keep up with.
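
For concreteness, a rough sketch of what "pruning the ExpiringMap based on bytes tracked as well as time" could look like; the class and its byte accounting are hypothetical stand-ins, not our actual ExpiringMap API:

{code:java}
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical sketch (not Cassandra's ExpiringMap): a callback map pruned by
 * the bytes it tracks as well as by time, evicting its oldest entries once a
 * byte cap is exceeded.
 */
class ByteBoundedExpiringMap<K, V>
{
    private static final class Tracked<V>
    {
        final V value;
        final long bytes;
        final long expiresAtMillis;

        Tracked(V value, long bytes, long expiresAtMillis)
        {
            this.value = value;
            this.bytes = bytes;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    // Insertion order approximates age, so iteration visits the oldest entries first.
    private final LinkedHashMap<K, Tracked<V>> entries = new LinkedHashMap<>();
    private final long maxBytes;
    private final long ttlMillis;
    private long trackedBytes = 0;

    ByteBoundedExpiringMap(long maxBytes, long ttlMillis)
    {
        this.maxBytes = maxBytes;
        this.ttlMillis = ttlMillis;
    }

    synchronized void put(K key, V value, long bytes)
    {
        entries.put(key, new Tracked<>(value, bytes, System.currentTimeMillis() + ttlMillis));
        trackedBytes += bytes;
        prune();
    }

    synchronized V remove(K key)
    {
        Tracked<V> t = entries.remove(key);
        if (t == null)
            return null;
        trackedBytes -= t.bytes;
        return t.value;
    }

    // Drop expired entries, then the oldest entries until we are back under the byte cap.
    private void prune()
    {
        long now = System.currentTimeMillis();
        Iterator<Map.Entry<K, Tracked<V>>> it = entries.entrySet().iterator();
        while (it.hasNext())
        {
            Map.Entry<K, Tracked<V>> e = it.next();
            if (e.getValue().expiresAtMillis > now && trackedBytes <= maxBytes)
                break; // everything from here on is newer, and we are under the cap
            it.remove();
            trackedBytes -= e.getValue().bytes;
        }
    }
}
{code}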

Alternatively, the approach of forbidding new requests while you have items in the ExpiringMap causes one node's collapse to spread throughout the cluster: rapidly (especially on small clusters) every request in the system comes to involve the collapsing node, and every coordinator stops accepting requests. The system seizes up, and that node is still failing, since it has requests from the entire cluster queued up with it. In general the coordinator has no way of distinguishing between a failed node, a network partition, and a node that is merely struggling, so we don't know whether we should wait.
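
To make that failure mode concrete, here is a minimal sketch of the kind of per-coordinator gate being discussed (the class and its names are purely illustrative, and the check-then-increment is deliberately simplistic). With a hash partitioner on a small cluster, nearly every incoming request soon names the struggling node as a replica, so this gate ends up refusing almost all traffic on every coordinator:

{code:java}
import java.net.InetAddress;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch: refuse new requests once any replica they touch has too many outstanding. */
class PerReplicaInFlightGate
{
    private final ConcurrentMap<InetAddress, AtomicLong> inFlight = new ConcurrentHashMap<>();
    private final long maxPerReplica;

    PerReplicaInFlightGate(long maxPerReplica)
    {
        this.maxPerReplica = maxPerReplica;
    }

    /** Returns false if any replica for this request is already at its limit (not atomic; illustrative only). */
    boolean tryAcquire(Iterable<InetAddress> replicas)
    {
        for (InetAddress replica : replicas)
            if (counter(replica).get() >= maxPerReplica)
                return false; // the coordinator refuses the request outright
        for (InetAddress replica : replicas)
            counter(replica).incrementAndGet();
        return true;
    }

    /** Called when a response arrives or the callback expires. */
    void release(InetAddress replica)
    {
        counter(replica).decrementAndGet();
    }

    private AtomicLong counter(InetAddress replica)
    {
        AtomicLong c = inFlight.get(replica);
        if (c == null)
        {
            AtomicLong fresh = new AtomicLong();
            c = inFlight.putIfAbsent(replica, fresh);
            if (c == null)
                c = fresh;
        }
        return c;
    }
}
{code}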

Some mix of the two might be possible: wait while a node is merely slow, then drop our response handlers for the node once it is marked as down. The latter may not be a bad thing to do anyway, but I would not want to depend on this behaviour to maintain our precious "A".

It still seems the simplest and most robust solution is to make our work queues leaky, since this insulates the processing nodes from cluster-wide inundation, which the coordinator approach cannot do (even with the loss of "A" and the cessation of processing, there is a whole cluster versus a potentially single node; with a hash partitioner it does not take long for all processing to begin involving a single failing node). We can do this on the number of requests alone and still be much better off than we are currently.
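
A minimal sketch of such a leaky queue, bounded by request count only (shedding by tracked bytes would follow the same shape; the class is illustrative, not an existing one):

{code:java}
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/**
 * Hypothetical sketch: a "leaky" work queue on the processing node. When a new
 * task arrives while the queue is full, shed the oldest queued task rather than
 * growing without bound or blocking the producer.
 */
class LeakyWorkQueue<T>
{
    private final BlockingQueue<T> queue;

    LeakyWorkQueue(int maxQueuedRequests)
    {
        this.queue = new LinkedBlockingQueue<>(maxQueuedRequests);
    }

    /** Returns a task that was shed to make room, or null if nothing was shed. */
    T offerLeaky(T task)
    {
        T shed = null;
        while (!queue.offer(task))
        {
            // Queue is full: drop the oldest request. It has waited the longest
            // and is the most likely to have already timed out at the coordinator.
            shed = queue.poll();
        }
        return shed;
    }

    T take() throws InterruptedException
    {
        return queue.take();
    }
}
{code}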

We could also pair this with coordinator-level dropping of handlers for "down" nodes, and of handlers above a certain threshold. The latter, though, could result in widespread uncoordinated dropping of requests, which may leave us open to a multiplying effect of cluster overload, with each node dropping different requests, possibly leading to only a tiny fraction of requests being serviced at their required CL across the cluster. I'm not sure how we can best model this risk, or avoid it without notifying coordinators of the drop of a message, and I don't see that being delivered for 2.1.
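
One crude way to get a feel for that multiplying effect, assuming (probably unrealistically) that each replica sheds a fraction p of requests independently, at RF=3 with QUORUM needing two acks:

{code:java}
/** Hypothetical back-of-the-envelope: P(quorum) = (1-p)^3 + 3p(1-p)^2 at RF=3. */
class UncoordinatedSheddingMath
{
    public static void main(String[] args)
    {
        int rf = 3; // replication factor
        for (double p : new double[] { 0.1, 0.3, 0.5 })
        {
            double allSurvive = Math.pow(1 - p, rf);
            double exactlyOneDropped = rf * p * Math.pow(1 - p, rf - 1);
            double quorumMet = allSurvive + exactlyOneDropped; // at least 2 of 3 replicas accept
            System.out.printf("shed fraction %.1f -> %.2f of requests reach QUORUM%n", p, quorumMet);
        }
    }
}
{code}

At a 50% shed rate per replica, for instance, only about half of requests would still achieve QUORUM under that independence assumption.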

> Bound the number of in-flight requests at the coordinator
> ---------------------------------------------------------
>
>                 Key: CASSANDRA-9318
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9318
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Ariel Weisberg
>            Assignee: Ariel Weisberg
>             Fix For: 2.1.x
>
>
> It's possible to somewhat bound the amount of load accepted into the cluster by bounding the number of in-flight requests and request bytes.
> An implementation might do something like track the number of outstanding bytes and requests and if it reaches a high watermark disable read on client connections until it goes back below some low watermark.
> Need to make sure that disabling read on the client connection won't introduce other issues.
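
(For illustration only: a minimal sketch of the high/low watermark idea in the description above, assuming Netty-style client channels whose reads can be paused with setAutoRead; this is not the actual implementation, and the concurrency handling is deliberately simplified.)

{code:java}
import java.util.Collections;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

import io.netty.channel.Channel;

/** Hypothetical sketch: stop reading from clients above a byte high watermark, resume below the low one. */
class InFlightByteLimiter
{
    private final long highWatermarkBytes;
    private final long lowWatermarkBytes;
    private final AtomicLong outstandingBytes = new AtomicLong();
    private final Set<Channel> clientChannels =
        Collections.newSetFromMap(new ConcurrentHashMap<Channel, Boolean>());
    private volatile boolean paused = false;

    InFlightByteLimiter(long highWatermarkBytes, long lowWatermarkBytes)
    {
        this.highWatermarkBytes = highWatermarkBytes;
        this.lowWatermarkBytes = lowWatermarkBytes;
    }

    void register(Channel channel)   { clientChannels.add(channel); }
    void unregister(Channel channel) { clientChannels.remove(channel); }

    /** Called when a request is admitted. */
    void onRequestStart(long requestBytes)
    {
        if (outstandingBytes.addAndGet(requestBytes) > highWatermarkBytes && !paused)
        {
            paused = true;
            setAutoRead(false); // stop pulling new requests off client sockets
        }
    }

    /** Called once the response has been sent back to the client. */
    void onRequestComplete(long requestBytes)
    {
        if (outstandingBytes.addAndGet(-requestBytes) < lowWatermarkBytes && paused)
        {
            paused = false;
            setAutoRead(true); // resume reading from clients
        }
    }

    private void setAutoRead(boolean autoRead)
    {
        for (Channel channel : clientChannels)
            channel.config().setAutoRead(autoRead);
    }
}
{code}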



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
