cassandra-commits mailing list archives

From "Sergio Bossa (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-9318) Bound the number of in-flight requests at the coordinator
Date Tue, 12 Jul 2016 20:18:20 GMT


Sergio Bossa commented on CASSANDRA-9318:


bq. if a message expires before it is sent, we consider this negatively for that replica,
since we increment the outgoing rate but not the incoming rate when the callback expires;
still, it may have nothing to do with the replica if the message was not sent, as it may be
due to the coordinator dealing with too many messages.

Right, but there isn't much we can do about that without far more invasive changes. Anyway, I
don't think it's actually a problem: if the coordinator is overloaded, we'll end up generating
too many hints and failing with {{OverloadedException}} (this time with its original meaning),
so we should be covered.
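
To make the accounting concrete, here's a rough sketch of the kind of per-replica rate tracking
involved; all names are illustrative, not the actual patch:

{code:java}
import com.google.common.util.concurrent.RateLimiter;
import java.util.concurrent.atomic.AtomicLong;

// Sketch of per-replica back-pressure state: compare the rate of responses
// received against the rate of requests sent, and throttle the send rate by
// that ratio. Illustrative names, not the actual CASSANDRA-9318 patch.
class ReplicaBackPressureState
{
    final AtomicLong outgoing = new AtomicLong(); // incremented on send (and on callback expiry)
    final AtomicLong incoming = new AtomicLong(); // incremented when the replica responds
    final RateLimiter limiter = RateLimiter.create(Double.MAX_VALUE);

    void onRequestSent()      { outgoing.incrementAndGet(); }
    void onResponseReceived() { incoming.incrementAndGet(); }

    // Periodically recompute the limiter: a replica acknowledging only half of
    // what we send gets throttled to half of the measured outgoing rate.
    void update(double outgoingRatePerSec)
    {
        double ratio = outgoing.get() == 0 ? 1.0 : (double) incoming.get() / outgoing.get();
        limiter.setRate(Math.max(1.0, ratio * outgoingRatePerSec));
    }
}
{code}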

bq. I also observe that if a replica has a low rate, then we may block when acquiring the
limiter, and this will indirectly throttle all following replicas, even if they were ready
to receive mutations sooner.

See my answer at the end.

bq. {{AbstractWriteResponseHandler}} sets the start time in the constructor, so the time spent
acquiring a rate limiter for slow replicas counts towards the total time before the coordinator
throws a write timeout exception.

See my answer at the end.

bq. In {{SP.sendToHintedEndpoints()}}, we should apply backpressure only if the destination is alive.

I know; I'm holding off on these changes until we settle on a plan for the whole write path
(in terms of what to do with the CL, the exception to be thrown, etc.).

bq. Let's use {{UnavailableException}}, since {{WriteFailureException}} indicates a non-timeout
failure when processing a mutation and so is not appropriate for this case. For protocol V4 we
cannot change {{UnavailableException}}, but for V5 we should add a new parameter to it. At the
moment it contains {{<cl><required><alive>}}; we should add the number of overloaded replicas,
so that drivers can treat the two cases differently.

Does that mean we should advance the protocol version in this issue, or delegate that to a new one?
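
For reference, the extended V5 error body under discussion would look roughly like the
following; this is a sketch of the proposal, not a committed protocol spec:

{code:java}
import io.netty.buffer.ByteBuf;

// Hypothetical V5 encoding for UnavailableException: the existing
// <cl><required><alive> body plus a new <overloaded> count.
// Sketch of the proposal above, not the committed protocol spec.
class UnavailableErrorBody
{
    final int cl;         // consistency level code of the failed query
    final int required;   // replicas required to meet the CL
    final int alive;      // replicas known to be alive
    final int overloaded; // replicas rejected as overloaded (new in V5)

    UnavailableErrorBody(int cl, int required, int alive, int overloaded)
    {
        this.cl = cl;
        this.required = required;
        this.alive = alive;
        this.overloaded = overloaded;
    }

    void encode(ByteBuf dest, int protocolVersion)
    {
        dest.writeShort(cl);
        dest.writeInt(required);
        dest.writeInt(alive);
        if (protocolVersion >= 5)
            dest.writeInt(overloaded); // older versions keep the V4 body unchanged
    }
}
{code}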

bq. Marking messages as throttled would let the replica know if backpressure was enabled,
that's true, but it also makes the existing mechanism even more complex.

How so? In implementation terms, it should be literally as easy as:
1) Add a byte parameter to {{MessageOut}}.
2) Read that byte parameter from {{MessageIn}} and, if set, skip dropping the mutation replica-side.
3) If possible (I didn't check), when a "late" response is received on the coordinator, try
to cancel the related hint.

Do you see any complexity I'm missing there?
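
To make the three steps concrete, here is a minimal sketch; {{MessageOut}}/{{MessageIn}} carry
parameter maps in Cassandra, but the {{THROTTLED}} key and the surrounding logic are hypothetical:

{code:java}
import java.util.Collections;
import java.util.Map;

// Sketch of the "mark messages as throttled" idea in the three steps above.
class ThrottleFlagSketch
{
    static final String THROTTLED = "THR"; // hypothetical parameter key

    // 1) Coordinator side: attach the flag when back-pressure delayed the send.
    static Map<String, byte[]> throttledParams()
    {
        return Collections.singletonMap(THROTTLED, new byte[]{ 1 });
    }

    // 2) Replica side: a throttled mutation arriving past its timeout is applied
    //    anyway instead of being dropped, since the delay was coordinator-side.
    static boolean shouldDrop(Map<String, byte[]> params, long elapsedMillis, long timeoutMillis)
    {
        return elapsedMillis > timeoutMillis && !params.containsKey(THROTTLED);
    }

    // 3) Coordinator side (if feasible): on a "late" response, cancel the hint
    //    previously written for this mutation.
}
{code}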

bq. dropping mutations that have been in the queue for longer than the RPC write timeout is
done not only to shed load on the replica, but also to avoid wasting resources to perform
a mutation when the coordinator has already returned a timeout exception to the client.

This is very true, and that's why I said it's a bit of a wild idea. Obviously, the same is true
outside of back-pressure: even now it is possible to return a write timeout to clients and still
have some or all mutations applied. In the end, it might be good to optionally enable such
behaviour, as the advantage would be increased consistency at the expense of more resource
consumption, a tradeoff some users might want to make. To be clear, I'm not strictly lobbying
to implement it, just trying to reason about the pros and cons.
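
For reference, the replica-side check being discussed boils down to a simple time comparison;
a minimal sketch, with illustrative names:

{code:java}
import java.util.concurrent.TimeUnit;

// Sketch of replica-side load shedding: a mutation that has been queued for
// longer than the RPC write timeout is dropped, since the coordinator has most
// likely already answered the client. Names are illustrative.
class MutationSheddingSketch
{
    static boolean shouldDrop(long enqueuedAtNanos, long writeRpcTimeoutMillis)
    {
        long queuedMillis = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - enqueuedAtNanos);
        return queuedMillis > writeRpcTimeoutMillis;
    }
}
{code}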

bq. I still have concerns regarding additional write timeout exceptions and whether an overloaded
or slow replica can slow everything down.

These are valid concerns of course, and given similar concerns from [~jbellis], I'm working
on some changes to avoid write timeouts caused by healthy replicas being unnaturally throttled
by unhealthy ones; depending on [~jbellis]'s answer to my last comment above, we may also only
apply back-pressure when the CL is not met.

Stay tuned.

> Bound the number of in-flight requests at the coordinator
> ---------------------------------------------------------
>                 Key: CASSANDRA-9318
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Local Write-Read Paths, Streaming and Messaging
>            Reporter: Ariel Weisberg
>            Assignee: Sergio Bossa
>         Attachments: 9318-3.0-nits-trailing-spaces.patch, backpressure.png, limit.btm,
> It's possible to somewhat bound the amount of load accepted into the cluster by bounding
> the number of in-flight requests and request bytes.
> An implementation might do something like track the number of outstanding bytes and requests,
> and if it reaches a high watermark, disable read on client connections until it goes back below
> some low watermark.
> Need to make sure that disabling read on the client connection won't introduce other
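
(For context, the watermark mechanism the description sketches could be implemented roughly as
follows, e.g. with Netty's auto-read toggle; the class, names and thresholds are illustrative,
not the actual implementation.)

{code:java}
import io.netty.channel.Channel;
import java.util.concurrent.atomic.AtomicLong;

// Sketch of bounding in-flight request bytes with high/low watermarks:
// stop reading from client connections above the high watermark, resume
// once back below the low watermark. Illustrative, not the actual patch.
class InFlightBound
{
    private final AtomicLong inFlightBytes = new AtomicLong();
    private final long highWatermark;
    private final long lowWatermark;

    InFlightBound(long highWatermark, long lowWatermark)
    {
        this.highWatermark = highWatermark;
        this.lowWatermark = lowWatermark;
    }

    void onRequestReceived(Channel channel, long bytes)
    {
        if (inFlightBytes.addAndGet(bytes) >= highWatermark)
            channel.config().setAutoRead(false); // stop accepting new requests
    }

    void onRequestCompleted(Channel channel, long bytes)
    {
        if (inFlightBytes.addAndGet(-bytes) <= lowWatermark)
            channel.config().setAutoRead(true);  // resume reading
    }
}
{code}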
