cassandra-commits mailing list archives

From "B. Todd Burruss (JIRA)" <j...@apache.org>
Subject [jira] Commented: (CASSANDRA-1019) "java.net.ConnectException: Connection timed out" in MESSAGE-STREAMING-POOL:1
Date Mon, 26 Apr 2010 16:15:33 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-1019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12860975#action_12860975 ]

B. Todd Burruss commented on CASSANDRA-1019:
--------------------------------------------

some more info that may help.  we have just purchased new machines for our test cluster
and we are having lots of trouble with the NICs going down.  when a NIC goes down, connections
to that node take an extremely long time to time out, which could have been the catalyst for
this problem.

this situation does cause the cluster to behave very poorly because the connection takes several
minutes to time out.  it makes me want the ability to manually take a node out of the cluster
and prevent the other nodes from gossiping to it.  is this something that has been
talked about?
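
to illustrate the timeout point: a connect that relies on the OS default can block for minutes when the peer's NIC is down, whereas a connect with an explicit timeout fails fast.  the sketch below is only an illustration using the standard java.net.Socket API, not Cassandra's streaming code.  the class name BoundedConnect, the method canConnect, the 5 second timeout and the default port 7000 are all assumptions made up for this example.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper for illustration only; not part of Cassandra.
public class BoundedConnect {

    /**
     * Tries to open a TCP connection to host:port, giving up after timeoutMillis.
     * Returns true if the connection was established, false if it was refused,
     * timed out, or otherwise failed.
     */
    public static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            // Socket.connect(SocketAddress, int) throws SocketTimeoutException
            // once timeoutMillis elapses, instead of waiting for the OS default.
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers SocketTimeoutException, ConnectException, NoRouteToHostException.
            return false;
        }
    }

    public static void main(String[] args) {
        // Host and port are placeholders; pass real values on the command line.
        String host = args.length > 0 ? args[0] : "127.0.0.1";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 7000;
        boolean ok = canConnect(host, port, 5000);
        System.out.println(host + ":" + port + " reachable within 5s: " + ok);
    }
}
```

a check like this could be run against a suspect node before starting a repair, to spot an unreachable peer in seconds rather than minutes.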

> "java.net.ConnectException: Connection timed out" in MESSAGE-STREAMING-POOL:1
> -----------------------------------------------------------------------------
>
>                 Key: CASSANDRA-1019
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1019
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.6
>            Reporter: B. Todd Burruss
>            Assignee: Stu Hood
>             Fix For: 0.6.2
>
>
> after doing a nodetool repair on a node in my cluster, i see the following exception
> on 4 out of the 7 nodes.  replication factor is 3.  no compactions happening.  no client traffic
> to the cluster.  nodetool streams (on one of the nodes not repaired) shows the following which
> is not ever increasing:
> Mode: Normal
> Streaming to: /192.168.132.117
>    /data/cassandra-data/data/UdsProfiles/stream/UdsProfiles-43-Data.db 0/523088443
> Not receiving any streams.
> in addition, those same four nodes all show AE-SERVICE-STAGE with pending
> work, and have been showing this for several hours now. each node in the
> cluster has less than 2gb of data, so it should be finished by now.
> here is the exception:
> 2010-04-23 10:08:43,416 ERROR [MESSAGE-STREAMING-POOL:1]
> [DebuggableThreadPoolExecutor.java:101] Error in ThreadPoolExecutor
> java.lang.RuntimeException: java.net.ConnectException: Connection timed out
> at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:619)
> Caused by: java.net.ConnectException: Connection timed out
> at sun.nio.ch.Net.connect(Native Method)
> at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:507)
> at org.apache.cassandra.net.FileStreamTask.runMayThrow(FileStreamTask.java:60)
> at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
> ... 3 more
> 2010-04-23 10:08:43,417 ERROR [MESSAGE-STREAMING-POOL:1]
> [CassandraDaemon.java:78] Fatal exception in thread
> Thread[MESSAGE-STREAMING-POOL:1,5,main]
> java.lang.RuntimeException: java.net.ConnectException: Connection timed out
> at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:619)
> Caused by: java.net.ConnectException: Connection timed out
> at sun.nio.ch.Net.connect(Native Method)
> at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:507)
> at org.apache.cassandra.net.FileStreamTask.runMayThrow(FileStreamTask.java:60)
> at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
> ... 3 more

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

