cassandra-commits mailing list archives

From "Benedict (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-9318) Bound the number of in-flight requests at the coordinator
Date Mon, 29 Jun 2015 13:21:05 GMT


Benedict commented on CASSANDRA-9318:

bq. default timeout is 2s not 10, so actually fine in your example of 300MB vs 150MB/s x 2s

Looks like in 2.0 this was 10s, and it was hard-coded in the yaml, so anyone upgrading from 2.0 or
before likely has a 10s timeout. So we should assume this is by far the most common timeout.

bq. you don't see a complete halt until capacity's worth of requests timeout all at once,
because you don't get an entire capacity load accepted at once. it's more continuous than
discrete – you pause until the oldest expire, accept more, pause until the oldest expire,
etc. so you make slow progress as load shedding can free up memory. thus, load shedding is
complementary to flow control.

You see a complete halt as soon as we exhaust space. If we exhaust space in < 0.5x timeout,
then we will see repeated juddering behaviour.
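
To make the arithmetic concrete, here is a rough sketch of the worst case, assuming no request completes and space is freed only when requests time out. The function name and the decimal-MB constant are illustrative assumptions; the 300MB limit, 150MB/s ingest rate, and 2s/10s timeouts are the figures from this thread.

```python
MB = 10**6  # decimal megabytes, an assumption; the thread only says "MB"


def stall_profile(limit_bytes, ingest_bytes_per_s, timeout_s):
    """Worst case: no request completes, so space is freed only by timeouts.

    Returns (seconds until the limit is reached, seconds spent accepting
    nothing before the oldest request times out).
    """
    fill_s = limit_bytes / ingest_bytes_per_s
    stall_s = max(0.0, timeout_s - fill_s)
    return fill_s, stall_s


for timeout_s in (2.0, 10.0):
    fill_s, stall_s = stall_profile(300 * MB, 150 * MB, timeout_s)
    print(f"timeout={timeout_s}s fill={fill_s}s stall={stall_s}s "
          f"fill/timeout={fill_s / timeout_s:.2f}")
```

Under these assumptions a 2s timeout gives no stall, while a 10s timeout fills the space in 0.2x the timeout and then accepts nothing for 8s before the next 2s burst.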

bq. but we can easily set a higher limit on MS heap – maybe as high as 1/8 heap as default
which gives us a lot of room for 8GB heap

If we set this really _aggressively_ high, say min(1/4 heap, 1GB), until we implement the improved
shedding, then I'll quit complaining. Right now we give breathing room up to and beyond collapse.
I absolutely agree that breathing room up until just-prior-to-collapse is preferable, but
cutting our breathing room by an order of magnitude reduces availability in clusters without
their opting into it. 1/4 heap probably still leaves quite a lot of headroom we would
otherwise have safely used in a 2GB heap (which is quite feasible, and probably preferable,
for many users running offheap memtables), but is still very unlikely to cause the server
to completely collapse.
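
For reference, a small sketch of what the two proposed defaults work out to for 2GB and 8GB heaps. The helper name and its parameters are hypothetical, and it assumes the sizes above are binary gigabytes.

```python
GIB = 1024**3  # assumes the sizes in this comment are gibibytes
MIB = 1024**2


def in_flight_limit(heap_bytes, divisor=4, cap_bytes=GIB):
    """Hypothetical helper: min(heap / divisor, cap)."""
    return min(heap_bytes // divisor, cap_bytes)


for heap_gib in (2, 8):
    heap = heap_gib * GIB
    quarter_capped = in_flight_limit(heap) // MIB
    one_eighth = (heap // 8) // MIB
    print(f"{heap_gib}GiB heap: min(1/4 heap, 1GB) = {quarter_capped}MiB, "
          f"1/8 heap = {one_eighth}MiB")
```

With these assumptions the two proposals coincide at 1024MiB for an 8GB heap and differ only for smaller heaps (512MiB versus 256MiB at 2GB).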

> Bound the number of in-flight requests at the coordinator
> ---------------------------------------------------------
>                 Key: CASSANDRA-9318
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Ariel Weisberg
>            Assignee: Ariel Weisberg
>             Fix For: 2.1.x, 2.2.x
> It's possible to somewhat bound the amount of load accepted into the cluster by bounding
> the number of in-flight requests and request bytes.
> An implementation might do something like track the number of outstanding bytes and requests
> and if it reaches a high watermark disable read on client connections until it goes back below
> some low watermark.
> Need to make sure that disabling read on the client connection won't introduce other
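
The high/low watermark idea in the description could be sketched roughly as follows. This is an illustration only, not the ticket's implementation: the class and method names are hypothetical, it thresholds on bytes alone (a request-count limit would work the same way), and it ignores threading.

```python
class InFlightGate:
    """Tracks outstanding request bytes and toggles a 'reading' flag.

    Reading is disabled once outstanding bytes reach the high watermark and
    re-enabled only after they drop below the low watermark, so the flag
    does not flap around a single threshold.
    """

    def __init__(self, high_bytes, low_bytes):
        if not 0 <= low_bytes <= high_bytes:
            raise ValueError("need 0 <= low_bytes <= high_bytes")
        self.high_bytes = high_bytes
        self.low_bytes = low_bytes
        self.outstanding_bytes = 0
        self.outstanding_requests = 0
        self.reading = True

    def admit(self, nbytes):
        """Record a newly accepted request; returns whether to keep reading."""
        self.outstanding_bytes += nbytes
        self.outstanding_requests += 1
        if self.outstanding_bytes >= self.high_bytes:
            self.reading = False
        return self.reading

    def release(self, nbytes):
        """Record a completed or timed-out request; returns the reading flag."""
        self.outstanding_bytes -= nbytes
        self.outstanding_requests -= 1
        if not self.reading and self.outstanding_bytes < self.low_bytes:
            self.reading = True
        return self.reading


gate = InFlightGate(high_bytes=100, low_bytes=40)
print(gate.admit(60))    # below the high watermark
print(gate.admit(50))    # 110 outstanding, at or above the high watermark
print(gate.release(60))  # 50 outstanding, not yet below the low watermark
print(gate.release(50))  # 0 outstanding, below the low watermark
```

In a real server the flag would drive whatever mechanism stops reading from client sockets; that wiring is outside this sketch.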

This message was sent by Atlassian JIRA
