cassandra-commits mailing list archives

From "Ananthkumar K S (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (CASSANDRA-6772) Cassandra inter data center communication broken
Date Tue, 18 Mar 2014 14:51:48 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-6772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13939305#comment-13939305
] 

Ananthkumar K S edited comment on CASSANDRA-6772 at 3/18/14 2:50 PM:
---------------------------------------------------------------------

No [~jbellis]. The application layer was dropping the connection. I have already explained
very clearly that network-level communication was still happening (at the TCP level). I request
that you reconsider the bug for the case of a private link failure at the ISP level.


was (Author: crack2drop):
No [~jbellis]. The application layer was dropping the connection. I have already explained
very clearly that network level communication was still happening at TCP level. Request you
to reconsider the bug for a case of private link failure at ISP level.

> Cassandra inter data center communication broken
> ------------------------------------------------
>
>                 Key: CASSANDRA-6772
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6772
>             Project: Cassandra
>          Issue Type: Bug
>         Environment: CentOS 6.0
>            Reporter: Ananthkumar K S
>            Priority: Blocker
>
> I have two data centers, DC1 and DC2. Both communicate via a private link. Yesterday, we
had a problem with the private link for 10 minutes. Since the problem was resolved, nodes
in both data centers have not been able to communicate with each other. When I run nodetool status
on a node in DC1, the nodes in DC2 are reported as down. When I run it in DC2, the nodes in DC1 are
shown as down.
> But in the Cassandra logs, we can clearly see that handshaking is failing every 5 seconds
for communication between data centers. At the TCP level, Cassandra generates too many connections
in FIN_WAIT1, which is still a puzzle. The number of CLOSE_WAIT transitions caused by this is very high.
Because of this kind of problem with TCP listen drops, we moved from 2.0.1 to 2.0.3. In 2.0.1, it
occurred within a data center itself. But here it's between data centers. In case it has anything to do
with the snitch configuration, I am using GossipingPropertyFileSnitch.
> This clearly started happening after the private link failure. Any idea on this?
> Cassandra version used is 2.0.3



--
This message was sent by Atlassian JIRA
(v6.2#6252)
