I agree that you cannot really ask your database to do capacity planning for you. Cassandra does have backpressure of a sort, in that requests fail with TimedOutException or UnavailableException. If you are seeing those, you might have a capacity problem.
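
As a rough sketch of treating those failures as a signal to slow down, something like the following could wrap each request. The exception classes and the call_with_backoff helper are stand-ins I made up for illustration; substitute whatever your driver actually raises for timeouts and unavailable replicas.

```python
import random
import time


class TimedOut(Exception):
    """Hypothetical stand-in for the driver's timeout error."""


class Unavailable(Exception):
    """Hypothetical stand-in for the driver's unavailable error."""


def call_with_backoff(request, max_attempts=5, base_delay=0.05,
                      max_delay=2.0, sleep=time.sleep):
    """Run request(); on overload-style failures, wait and try again.

    The wait is a random amount up to an exponentially growing cap, so
    a struggling cluster sees less traffic instead of an instant retry.
    """
    for attempt in range(max_attempts):
        try:
            return request()
        except (TimedOut, Unavailable):
            if attempt == max_attempts - 1:
                raise  # out of attempts: let the app see the failure
            cap = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, cap))
```

This only slows the client down. It does not add capacity, and retrying writes is only safe if they are idempotent.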

The way I would handle this is:
1) Prototype at scale (dark launches, similar hardware loaded with the data you expect in production).
2) Collect stats such as 95th percentile response time and request and failure counts.

When your 95th percentile response time starts to degrade, that is a good indication that it is time to deal with the performance issue.
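
For the stats themselves, a minimal sketch might look like this, assuming you already record one latency per successful request and a count of failed ones. The function names and the nearest-rank percentile method are my own choices here, not anything Cassandra or a driver provides.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # Nearest-rank: the smallest value with at least pct% of samples at or below it.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def summarize(latencies_ms, failures):
    """Summarize one measurement window of request outcomes."""
    total = len(latencies_ms) + failures
    return {
        "p95_ms": percentile(latencies_ms, 95),
        "failure_rate": failures / total,
    }
```

Comparing these numbers window over window, rather than looking at a single reading, is what tells you when the trend is heading the wrong way.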

On Wed, Feb 5, 2014 at 1:55 PM, Robert Coli <rcoli@eventbrite.com> wrote:
On Wed, Feb 5, 2014 at 6:14 AM, Ben Hood <0x6e6562@gmail.com> wrote:
What is the general approach to this from a server perspective? Is
there any flow control that the server can apply to back pressure onto
the sending driver?

No. In theory the client could look at dynamic snitch scores, I suppose, if the dynamic snitch worked right...

For most clients, my belief is the only backpressure is that, once a node is severely overloaded, it will stop attempting to write hints and return an OverloadedException. But this is only on the hint write path, not the normal write path.

If not, how do other driver implementors view this situation? Do you
try to maintain some kind of flow control at the driver level so that
you can push back onto the app, or you just let the effects of IO
saturation just bubble up to the app?

I think most deploys of Cassandra deal with this reality by carefully managing available capacity so that they don't risk getting in this situation.

I understand that is not a technical solution appropriate to your question's scope, but I do believe it describes the status quo.