cassandra-commits mailing list archives

From "Ariel Weisberg (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-8789) OutboundTcpConnectionPool should route messages to sockets by size not type
Date Thu, 23 Apr 2015 20:31:39 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-8789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14509731#comment-14509731 ]

Ariel Weisberg commented on CASSANDRA-8789:
-------------------------------------------

[~mkjellman] I tried this with the socket change reverted, and initially I thought the change
mattered, but I think I was swapping when it passed with the change reverted.

I tried it three times and all three runs did the same thing: the first node OOMs and the
heap dump blames tasks sitting in SEPExecutor.

I also ran with flight recorder and checked the node serving client traffic and one of the
other nodes. There is some significant blocking on the coordinating node, but the longest
pause was 300 milliseconds and the total pause duration was 2 seconds over a 1-minute period
(200 pauses). If I chased those down I bet they would correlate with GC pauses.

I was able to get 2.1.2 to write hints, but not to fail the same way that trunk does with
the SEPExecutor OOM. Still digging into why trunk fares worse.

> OutboundTcpConnectionPool should route messages to sockets by size not type
> ---------------------------------------------------------------------------
>
>                 Key: CASSANDRA-8789
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8789
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Ariel Weisberg
>            Assignee: Ariel Weisberg
>             Fix For: 3.0
>
>         Attachments: 8789.diff
>
>
> I was looking at this trying to understand what messages flow over which connection.
> For reads the request goes out over the command connection and the response comes back
> over the ack connection.
> For writes the request goes out over the command connection and the response comes back
> over the command connection.
> Reads get a dedicated socket for responses. Mutation commands and responses both travel
> over the same socket along with read requests.
> Sockets are used unidirectionally, so there are actually four sockets in play and four
> threads at each node (2 inbound, 2 outbound).
> CASSANDRA-488 doesn't leave a record of what the impact of this change was. If someone
> remembers what situations were made better it would be good to know.
> I am not clear on when/how this is helpful. The consumer side shouldn't be blocking, so
> the only head-of-line blocking issue is the time it takes to transfer data over the wire.
> If message size is the cause of blocking issues then the current design mixes small messages
> and large messages on the same connection, retaining the head-of-line blocking.
> Read requests share the same connection as write requests (which are large), and write
> acknowledgments (which are small) share the same connection as write requests. The only
> winner is read acknowledgements.
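
The difference between routing by type and routing by size, as laid out in the
description above, can be sketched roughly as follows. This is an illustration only, not
the code in 8789.diff or in OutboundTcpConnectionPool: the class, the enums, the 64 KiB
threshold and the message sizes are all assumptions made up for the example.

```java
public class SizeRoutingSketch {
    // Hypothetical message kinds, named after the four flows in the description.
    public enum Kind { READ_REQUEST, READ_RESPONSE, WRITE_REQUEST, WRITE_RESPONSE }

    // COMMAND/ACK model routing by type; SMALL/LARGE model routing by size.
    public enum Socket { COMMAND, ACK, SMALL, LARGE }

    // Assumed cutoff for a "large" message; not taken from the ticket or the patch.
    public static final int LARGE_THRESHOLD_BYTES = 64 * 1024;

    // Routing by type as the description lays it out: only read responses
    // get the ack socket, everything else shares the command socket.
    public static Socket routeByType(Kind kind) {
        return kind == Kind.READ_RESPONSE ? Socket.ACK : Socket.COMMAND;
    }

    // Routing by size: small messages never queue behind large ones,
    // whatever their type.
    public static Socket routeBySize(int sizeBytes) {
        return sizeBytes >= LARGE_THRESHOLD_BYTES ? Socket.LARGE : Socket.SMALL;
    }

    public static void main(String[] args) {
        Kind[] kinds = Kind.values();
        // Made-up sizes: small requests and acks, large payloads.
        int[] sizes = { 200, 1 << 20, 1 << 20, 50 };
        for (int i = 0; i < kinds.length; i++) {
            System.out.printf("%-14s %8d bytes  by type: %-7s  by size: %s%n",
                    kinds[i], sizes[i], routeByType(kinds[i]), routeBySize(sizes[i]));
        }
    }
}
```

With routing by type, the small write acknowledgement and the small read request both land
on the command socket behind the large write request. With routing by size they share a
socket only with other small messages.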



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
