cassandra-commits mailing list archives

From "Jonathan Ellis (JIRA)" <j...@apache.org>
Subject [jira] Commented: (CASSANDRA-685) add backpressure to StorageProxy
Date Wed, 30 Jun 2010 01:26:54 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12883760#action_12883760 ]

Jonathan Ellis commented on CASSANDRA-685:
------------------------------------------

Following the line of reasoning from my comment on Jan 22, I think the best thing to do is
to keep what we're doing now -- allowing TimedoutExceptions to serve as flow control -- but
handle overload situations better, so we don't have the current potential for a vicious cycle
of getting farther and farther behind while the RMS/RRS executors waste time processing
requests that the coordinator node has long since stopped waiting for:

- uncap the RMS and RRS executors.  Instead,
- MessageDeserializer (MD) will check recent RMS/RRS throughput and simply discard requests
that won't make it through the task queue within RPCTimeout (preventing memory pressure from
a huge task queue backlog; I have seen upwards of 1.5M pending tasks on MD)
- MD will tag requests with a timestamp as they arrive, and RMS/RRS will again discard requests
that have spent longer than RPCTimeout in the task queue
- log replay will have to self-throttle, since the RMS queue won't be doing it for it (it would
be nice to deal with this by adjusting the queue size, but concurrent queue sizes are fixed
once created)
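
A rough sketch of the two discard checks described in the list above, for illustration only.
None of these names (TimestampedTask, DroppingStage, offer, runNext, updateThroughput) exist
in Cassandra; they are hypothetical, and the throughput figure is assumed to be measured and
supplied by the caller. The admission check estimates queue wait as pending / throughput and
rejects a request that would probably exceed the RPC timeout; the executor side drops any
task whose time in the queue already exceeded it.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical: a unit of work tagged with its arrival time. */
final class TimestampedTask {
    final Runnable work;
    final long enqueuedAtNanos;

    TimestampedTask(Runnable work, long enqueuedAtNanos) {
        this.work = work;
        this.enqueuedAtNanos = enqueuedAtNanos;
    }
}

/** Hypothetical: an uncapped stage queue that sheds load instead of blocking. */
final class DroppingStage {
    enum Outcome { EMPTY, DROPPED, RAN }

    private final long rpcTimeoutNanos;
    private final Queue<TimestampedTask> queue = new ConcurrentLinkedQueue<>();
    // Tracked separately because ConcurrentLinkedQueue.size() walks the whole queue.
    private final AtomicInteger pending = new AtomicInteger();
    private volatile double recentTasksPerSecond;

    DroppingStage(long rpcTimeoutMillis, double initialTasksPerSecond) {
        this.rpcTimeoutNanos = TimeUnit.MILLISECONDS.toNanos(rpcTimeoutMillis);
        updateThroughput(initialTasksPerSecond);
    }

    /** Assumption: the caller measures recent stage throughput and reports it here. */
    void updateThroughput(double tasksPerSecond) {
        if (tasksPerSecond <= 0) {
            throw new IllegalArgumentException("throughput must be positive");
        }
        this.recentTasksPerSecond = tasksPerSecond;
    }

    /** Deserializer side: discard if the estimated queue wait exceeds the timeout. */
    boolean offer(Runnable work, long nowNanos) {
        double estimatedWaitNanos = pending.get() / recentTasksPerSecond * 1e9;
        if (estimatedWaitNanos > rpcTimeoutNanos) {
            return false; // the coordinator would give up before this ran
        }
        queue.add(new TimestampedTask(work, nowNanos));
        pending.incrementAndGet();
        return true;
    }

    /** Executor side: take one task and run it only if it has not expired. */
    Outcome runNext(long nowNanos) {
        TimestampedTask task = queue.poll();
        if (task == null) {
            return Outcome.EMPTY;
        }
        pending.decrementAndGet();
        if (nowNanos - task.enqueuedAtNanos > rpcTimeoutNanos) {
            return Outcome.DROPPED; // spent longer than the timeout in the queue
        }
        task.work.run();
        return Outcome.RAN;
    }
}
```

In real use the timestamps would presumably come from System.nanoTime(); they are passed in
as parameters here only so the behaviour can be exercised deterministically.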

> add backpressure to StorageProxy
> --------------------------------
>
>                 Key: CASSANDRA-685
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-685
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>            Priority: Minor
>             Fix For: 0.7
>
>         Attachments: 0001-impose-stage-queue-limit-of-2048-operations-which-shou.txt,
0002-make-TcpConnection.write-throw-WriteEnqueueException-i.txt
>
>
> Now that we have CASSANDRA-401 and CASSANDRA-488 there is one last piece: we need to
stop the target node from pulling mutations out of MessagingService as fast as it can only
to take up space in the mutation queue and eventually fill up memory.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

