cassandra-user mailing list archives

From George Sigletos <sigle...@textkernel.nl>
Subject Re: Error while rebuilding a node: Stream failed
Date Sat, 28 May 2016 17:05:34 GMT
No luck unfortunately. It seems that the connection to the destination node
was lost.

However there was progress compared to the previous times. A lot more data
was streamed.

(From source node)
INFO  [GossipTasks:1] 2016-05-28 17:53:57,155 Gossiper.java:1008 -
InetAddress /54.172.235.227 is now DOWN
INFO  [HANDSHAKE-/54.172.235.227] 2016-05-28 17:53:58,238
OutboundTcpConnection.java:487 - Handshaking version with /54.172.235.227
ERROR [STREAM-IN-/54.172.235.227] 2016-05-28 17:54:08,938
StreamSession.java:505 - [Stream #d25a05c0-241f-11e6-bb50-1b05ac77baf9]
Streaming error occurred
java.io.IOException: Connection timed out
        at sun.nio.ch.FileDispatcherImpl.read0(Native Method) ~[na:1.7.0_79]
        at sun.nio.ch.SocketDispatcher.read(Unknown Source) ~[na:1.7.0_79]
        at sun.nio.ch.IOUtil.readIntoNativeBuffer(Unknown Source)
~[na:1.7.0_79]
        at sun.nio.ch.IOUtil.read(Unknown Source) ~[na:1.7.0_79]
        at sun.nio.ch.SocketChannelImpl.read(Unknown Source) ~[na:1.7.0_79]
        at sun.nio.ch.SocketAdaptor$SocketInputStream.read(Unknown Source)
~[na:1.7.0_79]
        at sun.nio.ch.ChannelInputStream.read(Unknown Source) ~[na:1.7.0_79]
        at java.nio.channels.Channels$ReadableByteChannelImpl.read(Unknown
Source) ~[na:1.7.0_79]
        at
org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:51)
~[apache-cassandra-2.1.14.jar:2.1.14]
        at
org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:257)
~[apache-cassandra-2.1.14.jar:2.1.14]
        at java.lang.Thread.run(Unknown Source) [na:1.7.0_79]
INFO  [SharedPool-Worker-1] 2016-05-28 17:54:59,612 Gossiper.java:993 -
InetAddress /54.172.235.227 is now UP
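
For reference, the keep-alive values quoted below (tcp_keepalive_time=60, tcp_keepalive_probes=3, tcp_keepalive_intvl=10) can be turned into rough timings. This is only a sketch of the usual Linux keep-alive arithmetic. The class and function names are hypothetical and not part of Cassandra or any library, and the "default" values are the stock Linux settings, which a given distribution may override.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KeepAlive:
    """Linux TCP keep-alive sysctls (hypothetical helper, for illustration only)."""
    time: int    # net.ipv4.tcp_keepalive_time: idle seconds before the first probe
    probes: int  # net.ipv4.tcp_keepalive_probes: unanswered probes before giving up
    intvl: int   # net.ipv4.tcp_keepalive_intvl: seconds between successive probes

    def first_probe_after(self) -> int:
        """Seconds an idle connection waits before the kernel sends a probe."""
        return self.time

    def dead_peer_detected_after(self) -> int:
        """Approximate seconds of silence before the kernel drops the connection."""
        return self.time + self.probes * self.intvl


# Stock Linux values (assumed; check `sysctl net.ipv4.tcp_keepalive_time` etc.).
LINUX_DEFAULTS = KeepAlive(time=7200, probes=9, intvl=75)
# Values from the sysctl command quoted in this thread.
TUNED = KeepAlive(time=60, probes=3, intvl=10)

if __name__ == "__main__":
    for name, cfg in (("defaults", LINUX_DEFAULTS), ("tuned", TUNED)):
        print(name, cfg.first_probe_after(), cfg.dead_peer_detected_after())
```

With the tuned values, an idle streaming connection is probed after 60 seconds instead of two hours. That only helps if the firewall or router in between treats keep-alive probes as traffic, which, as noted in the quoted reply, is not guaranteed.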

On Fri, May 27, 2016 at 5:37 PM, George Sigletos <sigletos@textkernel.nl>
wrote:

> I am trying once more using more aggressive tcp settings, as recommended
> here
> <https://docs.datastax.com/en/cassandra/2.1/cassandra/troubleshooting/trblshootIdleFirewall.html>
>
> sudo sysctl -w net.ipv4.tcp_keepalive_time=60 net.ipv4.tcp_keepalive_probes=3 net.ipv4.tcp_keepalive_intvl=10
>
> (added to /etc/sysctl.conf and run sysctl -p /etc/sysctl.conf on all nodes)
>
> Let's see what happens. I don't know what else to try. I have even further
> increased streaming_socket_timeout_in_ms
>
>
>
> On Fri, May 27, 2016 at 4:56 PM, Paulo Motta <pauloricardomg@gmail.com>
> wrote:
>
>> I'm afraid raising streaming_socket_timeout_in_ms won't help much in this
>> case. The incoming connection on the source node is timing out at the
>> network layer, whereas streaming_socket_timeout_in_ms controls the socket
>> timeout at the application layer and throws SocketTimeoutException (not
>> java.io.IOException: Connection timed out). You should probably use more
>> aggressive TCP keep-alive settings (net.ipv4.tcp_keepalive_*) on both
>> hosts. Did you try tuning that? Even that might not be sufficient, as some
>> routers tend to ignore TCP keep-alives and just kill idle connections.
>>
>> As said before, this will ultimately be fixed by adding keep-alive to the
>> app layer in CASSANDRA-11841. If tuning TCP keep-alives does not help, one
>> extreme approach would be to backport this to 2.1 (unless some experienced
>> operator out there has a more creative approach).
>>
>> @eevans, I'm not sure he is using a mixed-version cluster. It seems he
>> finished the upgrade from 2.1.13 to 2.1.14 before performing the rebuild.
>>
>> 2016-05-27 11:39 GMT-03:00 Eric Evans <john.eric.evans@gmail.com>:
>>
>>> From the various stacktraces in this thread, it's obvious you are
>>> mixing versions 2.1.13 and 2.1.14.  Topology changes like this aren't
>>> supported with mixed Cassandra versions.  Sometimes it will work,
>>> sometimes it won't (and it will definitely not work in this instance).
>>>
>>> You should either upgrade your 2.1.13 nodes to 2.1.14 first, or add
>>> the new nodes using 2.1.13, and upgrade after.
>>>
>>> On Fri, May 27, 2016 at 8:41 AM, George Sigletos <sigletos@textkernel.nl>
>>> wrote:
>>>
>>> >>>> ERROR [STREAM-IN-/192.168.1.141] 2016-05-26 09:08:05,027
>>> >>>> StreamSession.java:505 - [Stream #74c57bc0-231a-11e6-a698-1b05ac77baf9]
>>> >>>> Streaming error occurred
>>> >>>> java.lang.RuntimeException: Outgoing stream handler has been closed
>>> >>>>         at
>>> >>>>
>>> org.apache.cassandra.streaming.ConnectionHandler.sendMessage(ConnectionHandler.java:138)
>>> >>>> ~[apache-cassandra-2.1.14.jar:2.1.14]
>>> >>>>         at
>>> >>>>
>>> org.apache.cassandra.streaming.StreamSession.receive(StreamSession.java:568)
>>> >>>> ~[apache-cassandra-2.1.14.jar:2.1.14]
>>> >>>>         at
>>> >>>>
>>> org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:457)
>>> >>>> ~[apache-cassandra-2.1.14.jar:2.1.14]
>>> >>>>         at
>>> >>>>
>>> org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:263)
>>> >>>> ~[apache-cassandra-2.1.14.jar:2.1.14]
>>> >>>>         at java.lang.Thread.run(Unknown Source) [na:1.7.0_79]
>>> >>>>
>>> >>>> And this is from the source node:
>>> >>>>
>>> >>>> ERROR [STREAM-OUT-/172.31.22.104] 2016-05-26 11:08:05,097
>>> >>>> StreamSession.java:505 - [Stream #74c57bc0-231a-11e6-a698-1b05ac77baf9]
>>> >>>> Streaming error occurred
>>> >>>> java.io.IOException: Broken pipe
>>> >>>>         at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
>>> >>>> ~[na:1.7.0_79]
>>> >>>>         at sun.nio.ch.FileChannelImpl.transferToDirectly(Unknown Source)
>>> >>>> ~[na:1.7.0_79]
>>> >>>>         at sun.nio.ch.FileChannelImpl.transferTo(Unknown Source)
>>> >>>> ~[na:1.7.0_79]
>>> >>>>         at
>>> >>>>
>>> org.apache.cassandra.streaming.compress.CompressedStreamWriter.write(CompressedStreamWriter.java:84)
>>> >>>> ~[apache-cassandra-2.1.14.jar:2.1.14]
>>> >>>>         at
>>> >>>>
>>> org.apache.cassandra.streaming.messages.OutgoingFileMessage.serialize(OutgoingFileMessage.java:88)
>>> >>>> ~[apache-cassandra-2.1.14.jar:2.1.14]
>>> >>>>         at
>>> >>>>
>>> org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:49)
>>> >>>> ~[apache-cassandra-2.1.14.jar:2.1.14]
>>> >>>>         at
>>> >>>>
>>> org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:41)
>>> >>>> ~[apache-cassandra-2.1.14.jar:2.1.14]
>>> >>>>         at
>>> >>>>
>>> org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:45)
>>> >>>> ~[apache-cassandra-2.1.14.jar:2.1.14]
>>> >>>>         at
>>> >>>>
>>> org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:358)
>>> >>>> [apache-cassandra-2.1.14.jar:2.1.14]
>>> >>>>         at
>>> >>>>
>>> org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:330)
>>> >>>> [apache-cassandra-2.1.14.jar:2.1.14]
>>>
>>>
>>> >>>>>>>>>>> ERROR [STREAM-IN-/192.168.1.140] 2016-05-24 22:44:57,704
>>> >>>>>>>>>>> StreamSession.java:620 - [Stream #2c290460-20d4-11e6-930f-1b05ac77baf9]
>>> >>>>>>>>>>> Remote peer 192.168.1.140 failed stream session.
>>> >>>>>>>>>>> ERROR [STREAM-OUT-/192.168.1.140] 2016-05-24 22:44:57,705
>>> >>>>>>>>>>> StreamSession.java:505 - [Stream #2c290460-20d4-11e6-930f-1b05ac77baf9]
>>> >>>>>>>>>>> Streaming error occurred
>>> >>>>>>>>>>> java.io.IOException: Connection timed out
>>> >>>>>>>>>>>         at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
>>> >>>>>>>>>>> ~[na:1.7.0_79]
>>> >>>>>>>>>>>         at sun.nio.ch.SocketDispatcher.write(Unknown Source)
>>> >>>>>>>>>>> ~[na:1.7.0_79]
>>> >>>>>>>>>>>         at sun.nio.ch.IOUtil.writeFromNativeBuffer(Unknown
>>> >>>>>>>>>>> Source) ~[na:1.7.0_79]
>>> >>>>>>>>>>>         at sun.nio.ch.IOUtil.write(Unknown Source)
>>> >>>>>>>>>>> ~[na:1.7.0_79]
>>> >>>>>>>>>>>         at sun.nio.ch.SocketChannelImpl.write(Unknown Source)
>>> >>>>>>>>>>> ~[na:1.7.0_79]
>>> >>>>>>>>>>>         at
>>> >>>>>>>>>>>
>>> org.apache.cassandra.io.util.DataOutputStreamAndChannel.write(DataOutputStreamAndChannel.java:48)
>>> >>>>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>> >>>>>>>>>>>         at
>>> >>>>>>>>>>>
>>> org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:44)
>>> >>>>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>> >>>>>>>>>>>         at
>>> >>>>>>>>>>>
>>> org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:351)
>>> >>>>>>>>>>> [apache-cassandra-2.1.13.jar:2.1.13]
>>> >>>>>>>>>>>         at
>>> >>>>>>>>>>>
>>> org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:323)
>>> >>>>>>>>>>> [apache-cassandra-2.1.13.jar:2.1.13]
>>> >>>>>>>>>>>         at java.lang.Thread.run(Unknown Source) [na:1.7.0_79]
>>> >>>>>>>>>>> INFO  [STREAM-IN-/192.168.1.140] 2016-05-24 22:44:58,625
>>> >>>>>>>>>>> StreamResultFuture.java:180 - [Stream #2c290460-20d4-11e6-930f-1b05ac77baf9]
>>> >>>>>>>>>>> Session with /192.168.1.140 is complete
>>> >>>>>>>>>>> WARN  [STREAM-IN-/192.168.1.140] 2016-05-24 22:44:58,627
>>> >>>>>>>>>>> StreamResultFuture.java:207 - [Stream #2c290460-20d4-11e6-930f-1b05ac77baf9]
>>> >>>>>>>>>>> Stream failed
>>> >>>>>>>>>>> ERROR [RMI TCP Connection(24)-127.0.0.1] 2016-05-24 22:44:58,628
>>> >>>>>>>>>>> StorageService.java:1075 - Error while rebuilding node
>>> >>>>>>>>>>> org.apache.cassandra.streaming.StreamException: Stream failed
>>> >>>>>>>>>>>         at
>>> >>>>>>>>>>>
>>> org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:85)
>>> >>>>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>> >>>>>>>>>>>         at
>>> >>>>>>>>>>>
>>> com.google.common.util.concurrent.Futures$4.run(Futures.java:1172)
>>> >>>>>>>>>>> ~[guava-16.0.jar:na]
>>> >>>>>>>>>>>         at
>>> >>>>>>>>>>>
>>> com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297)
>>> >>>>>>>>>>> ~[guava-16.0.jar:na]
>>> >>>>>>>>>>>         at
>>> >>>>>>>>>>>
>>> com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156)
>>> >>>>>>>>>>> ~[guava-16.0.jar:na]
>>> >>>>>>>>>>>         at
>>> >>>>>>>>>>>
>>> com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145)
>>> >>>>>>>>>>> ~[guava-16.0.jar:na]
>>> >>>>>>>>>>>         at
>>> >>>>>>>>>>>
>>> com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:202)
>>> >>>>>>>>>>> ~[guava-16.0.jar:na]
>>> >>>>>>>>>>>         at
>>> >>>>>>>>>>>
>>> org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:208)
>>> >>>>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>> >>>>>>>>>>>         at
>>> >>>>>>>>>>>
>>> org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:184)
>>> >>>>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>> >>>>>>>>>>>         at
>>> >>>>>>>>>>>
>>> org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:415)
>>> >>>>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>> >>>>>>>>>>>         at
>>> >>>>>>>>>>>
>>> org.apache.cassandra.streaming.StreamSession.sessionFailed(StreamSession.java:621)
>>> >>>>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>> >>>>>>>>>>>         at
>>> >>>>>>>>>>>
>>> org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:475)
>>> >>>>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>> >>>>>>>>>>>         at
>>> >>>>>>>>>>>
>>> org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:256)
>>> >>>>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>> >>>>>>>>>>>         at java.lang.Thread.run(Unknown Source)
>>> >>>>>>>>>>> ~[na:1.7.0_79]
>>> >>>>>>>>>>> ERROR [STREAM-OUT-/192.168.1.140] 2016-05-24 22:44:58,629
>>> >>>>>>>>>>> StreamSession.java:505 - [Stream #2c290460-20d4-11e6-930f-1b05ac77baf9]
>>> >>>>>>>>>>> Streaming error occurred
>>> >>>>>>>>>>> java.io.IOException: Broken pipe
>>> >>>>>>>>>>>         at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
>>> >>>>>>>>>>> ~[na:1.7.0_79]
>>> >>>>>>>>>>>         at sun.nio.ch.SocketDispatcher.write(Unknown Source)
>>> >>>>>>>>>>> ~[na:1.7.0_79]
>>> >>>>>>>>>>>         at sun.nio.ch.IOUtil.writeFromNativeBuffer(Unknown
>>> >>>>>>>>>>> Source) ~[na:1.7.0_79]
>>> >>>>>>>>>>>         at sun.nio.ch.IOUtil.write(Unknown Source)
>>> >>>>>>>>>>> ~[na:1.7.0_79]
>>> >>>>>>>>>>>         at sun.nio.ch.SocketChannelImpl.write(Unknown Source)
>>> >>>>>>>>>>> ~[na:1.7.0_79]
>>> >>>>>>>>>>>         at
>>> >>>>>>>>>>>
>>> org.apache.cassandra.io.util.DataOutputStreamAndChannel.write(DataOutputStreamAndChannel.java:48)
>>> >>>>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>> >>>>>>>>>>>         at
>>> >>>>>>>>>>>
>>> org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:44)
>>> >>>>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>>> >>>>>>>>>>>         at
>>> >>>>>>>>>>>
>>> org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:351)
>>> >>>>>>>>>>> [apache-cassandra-2.1.13.jar:2.1.13]
>>> >>>>>>>>>>>         at
>>> >>>>>>>>>>>
>>> org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:331)
>>> >>>>>>>>>>> [apache-cassandra-2.1.13.jar:2.1.13]
>>> >>>>>>>>>>>         at java.lang.Thread.run(Unknown Source) [na:1.7.0_79]
>>>
>>>
>>>
>>> --
>>> Eric Evans
>>> john.eric.evans@gmail.com
>>>
>>
>>
>
