cassandra-commits mailing list archives

From "Ariel Weisberg (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-9318) Bound the number of in-flight requests at the coordinator
Date Fri, 08 May 2015 17:21:20 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-9318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14534997#comment-14534997 ]

Ariel Weisberg commented on CASSANDRA-9318:
-------------------------------------------

I looked at the issues you linked and didn't find anything that looks like leaky queues.
Can you describe what those are? Is it shedding from the queues based on resource usage?
That makes sense to me, mostly as a way to prevent the initial overload at processing nodes
until the cluster can adapt to the disparity between requested capacity and actual capacity.
If leaked items resulted in an error response, that would provide feedback to the coordinator
and free up resources there.

Given the contract of CL=1 (or even quorum), you are right that there is nothing to be gained
by bounding the number of in-flight requests at a coordinator by not reading requests from
clients. At CL=1, and given the way I hear people think about availability in C*, I think what
you want is to get better at failing over to hinting before the coordinator or processing node
overloads. Under overload conditions, CL=1 is basically synonymous with writing hints, right?

bq.  which may leave us open to a multiplying effect of cluster overload, with each node dropping
different requests, possibly leading to only a tiny fraction of requests being serviced to
their required CL across the cluster. I'm not sure how we can best model this risk, or avoid
it without notifying coordinators of the drop of a message, and I don't see that being delivered
for 2.1
Maybe this is a congestion control problem? If we piggybacked congestion information on
responses, maybe we could make better decisions about new requests, such as rejecting a
percentage of them or going straight to hints before resources have been committed across
the cluster?
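
As a rough illustration of the congestion-feedback idea above, here is a minimal sketch of how a coordinator might fold a piggybacked load figure into a per-replica estimate and then shed a fraction of new writes. Everything here is hypothetical: the `ReplicaCongestion` class, the 0.0 to 1.0 load scale, the smoothing factor, and the thresholds are assumptions for illustration and are not part of Cassandra.

```python
import random


class ReplicaCongestion:
    """Hypothetical per-replica congestion estimate kept by a coordinator."""

    def __init__(self, alpha=0.2, shed_start=0.5, fallback="hint"):
        if fallback not in ("hint", "reject"):
            raise ValueError("fallback must be 'hint' or 'reject'")
        self.alpha = alpha            # EWMA smoothing factor (assumed value)
        self.shed_start = shed_start  # load level at which shedding begins
        self.fallback = fallback      # what to do with a shed request
        self.level = 0.0              # 0.0 = idle, 1.0 = saturated

    def on_response(self, reported_load):
        """Fold the load figure piggybacked on a response into the estimate."""
        reported_load = min(max(reported_load, 0.0), 1.0)
        self.level += self.alpha * (reported_load - self.level)

    def shed_fraction(self):
        """Fraction of new requests to shed, rising linearly above shed_start."""
        if self.level <= self.shed_start:
            return 0.0
        return (self.level - self.shed_start) / (1.0 - self.shed_start)

    def decide(self, rng=random.random):
        """Return 'send', or the fallback action, before committing resources."""
        if rng() < self.shed_fraction():
            return self.fallback
        return "send"
```

The decision is made before any message is sent to the replica, which is the point of the suggestion: a shed write goes straight to a hint (or an error) instead of consuming resources across the cluster first.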

Once something is hinted, you can trickle out the load to match the actual capacity of the
node being hinted. I know this conflicts with hints not being fast, but hints are just a
queue and could be very fast. I haven't looked at the in-progress work on hints.
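
To make the "hints are just a queue" point concrete, the sketch below shows a queue with a cheap append and a token-bucket drain that trickles hints out at a configured rate. It is only an illustration under assumptions: `HintQueue`, `rate_per_sec`, and `burst` are hypothetical names, the queue is in-memory, and nothing here reflects how Cassandra actually stores or replays hints.

```python
from collections import deque


class HintQueue:
    """Hypothetical in-memory hint queue drained at a bounded rate."""

    def __init__(self, rate_per_sec, burst):
        self.rate = rate_per_sec    # hints per second the target can absorb
        self.burst = burst          # maximum tokens that can accumulate
        self.tokens = float(burst)
        self.last = None            # time of the previous drain call
        self.hints = deque()

    def add(self, hint):
        """Enqueue a hint; this is an O(1) append."""
        self.hints.append(hint)

    def drain(self, now):
        """Return the hints that may be delivered at time `now` (seconds)."""
        if self.last is not None:
            elapsed = now - self.last
            self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        ready = []
        while self.hints and self.tokens >= 1:
            ready.append(self.hints.popleft())
            self.tokens -= 1
        return ready
```

Writing a hint costs one append regardless of how slow the target is, while the drain rate can be tuned (or adapted) to the target's actual capacity.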

> Bound the number of in-flight requests at the coordinator
> ---------------------------------------------------------
>
>                 Key: CASSANDRA-9318
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9318
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Ariel Weisberg
>            Assignee: Ariel Weisberg
>             Fix For: 2.1.x
>
>
> It's possible to somewhat bound the amount of load accepted into the cluster by bounding
> the number of in-flight requests and request bytes.
> An implementation might do something like track the number of outstanding bytes and requests
> and if it reaches a high watermark disable read on client connections until it goes back below
> some low watermark.
> Need to make sure that disabling read on the client connection won't introduce other
> issues.
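
The watermark scheme in the issue description can be sketched as follows. This is a minimal, single-threaded illustration under assumptions: the `InflightLimiter` class and its parameter names are hypothetical, and the `reads_enabled` flag stands in for whatever mechanism the server would use to stop reading from client connections.

```python
class InflightLimiter:
    """Hypothetical tracker for outstanding requests and request bytes."""

    def __init__(self, high_bytes, low_bytes, high_requests, low_requests):
        if low_bytes >= high_bytes or low_requests >= high_requests:
            raise ValueError("low watermarks must be below high watermarks")
        self.high_bytes = high_bytes
        self.low_bytes = low_bytes
        self.high_requests = high_requests
        self.low_requests = low_requests
        self.bytes = 0
        self.requests = 0
        self.reads_enabled = True

    def on_request_start(self, nbytes):
        """Account for a newly accepted request; returns reads_enabled."""
        self.bytes += nbytes
        self.requests += 1
        # Either counter reaching its high watermark disables client reads.
        if self.bytes >= self.high_bytes or self.requests >= self.high_requests:
            self.reads_enabled = False
        return self.reads_enabled

    def on_request_done(self, nbytes):
        """Account for a completed request; returns reads_enabled."""
        self.bytes -= nbytes
        self.requests -= 1
        # Reads resume only once both counters are below their low watermarks.
        if (not self.reads_enabled
                and self.bytes < self.low_bytes
                and self.requests < self.low_requests):
            self.reads_enabled = True
        return self.reads_enabled
```

The gap between the high and low watermarks provides hysteresis, so reads are not toggled on and off for every request near the limit. A real implementation would also need thread-safe counters.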



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
