cassandra-commits mailing list archives

From "Paulo Motta (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (CASSANDRA-10992) Hanging streaming sessions
Date Tue, 12 Jan 2016 21:32:39 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-10992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15094989#comment-15094989 ]

Paulo Motta edited comment on CASSANDRA-10992 at 1/12/16 9:31 PM:
------------------------------------------------------------------

I don't know exactly what's happening, but the {{AsynchronousCloseException}} suggests that
the interrupt workaround for CASSANDRA-10012 is closing the channel after a genuine timeout,
which prevents a retry. This was fixed in CASSANDRA-10961, so to test that hypothesis, could you
replace the Cassandra jar with the one I attached (which contains the 2.1 revert for
CASSANDRA-10012) on all nodes involved in a repair of a specific subrange? A rolling restart
will be needed. If this does not solve the issue, please attach the corresponding trace logs
as instructed before (make sure to enable trace logging in the logback configuration after
replacing the jars and before triggering the faulty repair operation).


was (Author: pauloricardomg):
I don't know exactly what's happening, but the {{AsynchronousCloseException}} suggests that
the interrupt workaround for CASSANDRA-10012 is closing the channel after a genuine timeout,
which prevents a retry. This was fixed in CASSANDRA-10961, so to test that hypothesis, could you
replace the Cassandra jar with the one I attached (which contains the 2.1 revert for
CASSANDRA-10012) on a subset of the nodes involved in the repair? A rolling restart will be
needed. If this does not solve the issue, please attach the corresponding trace logs as
instructed before (make sure to enable trace logging in the logback configuration after
replacing the jars and before triggering the faulty repair operation).

> Hanging streaming sessions
> --------------------------
>
>                 Key: CASSANDRA-10992
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10992
>             Project: Cassandra
>          Issue Type: Bug
>         Environment: C* 2.1.12, Debian Wheezy
>            Reporter: mlowicki
>            Assignee: Paulo Motta
>             Fix For: 2.1.12
>
>         Attachments: apache-cassandra-2.1.12-SNAPSHOT.jar
>
>
> I recently started running repairs using [Cassandra Reaper|https://github.com/spotify/cassandra-reaper]
> (the built-in {{nodetool repair}} doesn't work for me, see CASSANDRA-9935). It behaves fine, but
> I've noticed hanging streaming sessions:
> {code}
> root@db1:~# date
> Sat Jan  9 16:43:00 UTC 2016
> root@db1:~# nt netstats -H | grep total
>         Receiving 5 files, 46.59 MB total. Already received 1 files, 11.32 MB total
>         Sending 7 files, 46.28 MB total. Already sent 7 files, 46.28 MB total
>         Receiving 6 files, 64.15 MB total. Already received 1 files, 12.14 MB total
>         Sending 5 files, 61.15 MB total. Already sent 5 files, 61.15 MB total
>         Receiving 4 files, 7.75 MB total. Already received 3 files, 7.58 MB total
>         Sending 4 files, 4.29 MB total. Already sent 4 files, 4.29 MB total
>         Receiving 12 files, 13.79 MB total. Already received 11 files, 7.66 MB total
>         Sending 5 files, 15.32 MB total. Already sent 5 files, 15.32 MB total
>         Receiving 8 files, 20.35 MB total. Already received 1 files, 13.63 MB total
>         Sending 38 files, 125.34 MB total. Already sent 38 files, 125.34 MB total
> root@db1:~# date
> Sat Jan  9 17:45:42 UTC 2016
> root@db1:~# nt netstats -H | grep total
>         Receiving 5 files, 46.59 MB total. Already received 1 files, 11.32 MB total
>         Sending 7 files, 46.28 MB total. Already sent 7 files, 46.28 MB total
>         Receiving 6 files, 64.15 MB total. Already received 1 files, 12.14 MB total
>         Sending 5 files, 61.15 MB total. Already sent 5 files, 61.15 MB total
>         Receiving 4 files, 7.75 MB total. Already received 3 files, 7.58 MB total
>         Sending 4 files, 4.29 MB total. Already sent 4 files, 4.29 MB total
>         Receiving 12 files, 13.79 MB total. Already received 11 files, 7.66 MB total
>         Sending 5 files, 15.32 MB total. Already sent 5 files, 15.32 MB total
>         Receiving 8 files, 20.35 MB total. Already received 1 files, 13.63 MB total
>         Sending 38 files, 125.34 MB total. Already sent 38 files, 125.34 MB total
> {code}
> Such sessions remain even long after the repair job has finished (confirmed by checking
> Reaper's and Cassandra's logs). {{streaming_socket_timeout_in_ms}} in cassandra.yaml is set
> to the default value (3600000).
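
The comparison of the two snapshots above can be automated. The following is a minimal Python sketch, not part of the original report: it assumes the summary-line format shown in the {{nt netstats -H | grep total}} output, and the helper names ({{incomplete_receives}}, {{stalled}}) are hypothetical. Because the grepped output drops peer addresses, sessions are matched simply by having identical counters in both snapshots, which is a simplification.

```python
import re
from datetime import datetime

# Matches the "Receiving ..." summary lines as they appear in the report above.
RECEIVING = re.compile(
    r"Receiving (\d+) files, ([\d.]+ \w+) total\. "
    r"Already received (\d+) files, ([\d.]+ \w+) total"
)


def incomplete_receives(netstats_output):
    """Return (total_files, total_size, received_files, received_size)
    for every receive that has not yet got all of its files."""
    sessions = []
    for match in RECEIVING.finditer(netstats_output):
        total_files, total_size, done_files, done_size = match.groups()
        if int(done_files) < int(total_files):
            sessions.append((int(total_files), total_size, int(done_files), done_size))
    return sessions


def stalled(before, after):
    """Incomplete receives whose counters are identical in both snapshots.
    Simplification: sessions are matched by value, not by peer address."""
    earlier = incomplete_receives(before)
    return [s for s in incomplete_receives(after) if s in earlier]


# Summary lines copied from the report; both snapshots were identical.
SNAPSHOT = """\
Receiving 5 files, 46.59 MB total. Already received 1 files, 11.32 MB total
Sending 7 files, 46.28 MB total. Already sent 7 files, 46.28 MB total
Receiving 6 files, 64.15 MB total. Already received 1 files, 12.14 MB total
Sending 5 files, 61.15 MB total. Already sent 5 files, 61.15 MB total
Receiving 4 files, 7.75 MB total. Already received 3 files, 7.58 MB total
Sending 4 files, 4.29 MB total. Already sent 4 files, 4.29 MB total
Receiving 12 files, 13.79 MB total. Already received 11 files, 7.66 MB total
Sending 5 files, 15.32 MB total. Already sent 5 files, 15.32 MB total
Receiving 8 files, 20.35 MB total. Already received 1 files, 13.63 MB total
Sending 38 files, 125.34 MB total. Already sent 38 files, 125.34 MB total
"""

# Value of streaming_socket_timeout_in_ms quoted in the report.
STREAMING_SOCKET_TIMEOUT_MS = 3600000

# Time between the two `date` outputs in the report.
elapsed = datetime(2016, 1, 9, 17, 45, 42) - datetime(2016, 1, 9, 16, 43, 0)
elapsed_seconds = int(elapsed.total_seconds())

for total_files, total_size, done_files, done_size in stalled(SNAPSHOT, SNAPSHOT):
    print(f"stalled: {done_files}/{total_files} files, {done_size} of {total_size}")

print("elapsed exceeds socket timeout:",
      elapsed_seconds * 1000 > STREAMING_SOCKET_TIMEOUT_MS)
```

Run against the reported data, this flags all five receiving sessions as unchanged. The two snapshots are 3762 seconds apart, slightly longer than the 3600000 ms socket timeout quoted in the report.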



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
