From: "techpyaasa ."
Date: Wed, 28 Sep 2016 18:48:39 +0530
Subject: Re: nodetool rebuild streaming exception
To: Alain RODRIGUEZ
Cc: user@cassandra.apache.org, laxmikanth524@gmail.com

@Alain

That was one of my teammates; very sorry for the multiple threads.

> It looks like streams are failing right away when trying to rebuild.

No. Streaming fails with the same exception stack trace only after part of the data has been streamed (around 150 GB; we have around 600 GB of data on each node).

> It should be run from DC3 servers, after altering keyspace to add
> keyspaces to the new datacenter. Is this the way you're doing it?

Yes, I'm running it from DC3 using the "nodetool rebuild 'DC1'" command, after altering the keyspace to RF DC1:3, DC2:3, DC3:3, and we are using NetworkTopologyStrategy.

Yes, all nodes are running the same version, c*-2.0.17.

As I said, 'streaming_socket_timeout_in_ms: 86400000' is set (24 hours).

As suggested by @Paul and in some blogs, we are going to retry with the following changes on the new nodes in DC3:

net.ipv4.tcp_keepalive_time=60
net.ipv4.tcp_keepalive_probes=3
net.ipv4.tcp_keepalive_intvl=10

I hope these settings are enough on the new nodes, from where we initiate the rebuild/streaming, and are NOT required on the existing nodes that the data is streamed from. Am I right?

We will have to see whether it works :( By the way, could you please throw some light on this if you have faced such an exception in the past?

As I mentioned in my last mail, this is the exception we are getting AFTER STREAMING some data.

java.io.IOException: Connection timed out
        at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
        at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
        at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
        at sun.nio.ch.IOUtil.write(IOUtil.java:65)
        at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:487)
        at org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:44)
        at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:339)
        at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:311)
        at java.lang.Thread.run(Thread.java:745)
 INFO [STREAM-OUT-/xxx.xxx.198.191] 2016-09-27 00:28:10,347 StreamResultFuture.java (line 186) [Stream #30852870-8472-11e6-b043-3f260c696828] Session with /xxx.xxx.198.191 is complete
ERROR [STREAM-OUT-/xxx.xxx.198.191] 2016-09-27 00:28:10,347 StreamSession.java (line 461) [Stream #30852870-8472-11e6-b043-3f260c696828] Streaming error occurred
java.io.IOException: Broken pipe
        at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
        at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
        at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
        at sun.nio.ch.IOUtil.write(IOUtil.java:65)
        at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:487)
        at org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:44)
        at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:339)
        at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:319)
        at java.lang.Thread.run(Thread.java:745)
ERROR [STREAM-IN-/xxx.xxx.198.191] 2016-09-27 00:28:10,461 StreamSession.java (line 461) [Stream #30852870-8472-11e6-b043-3f260c696828] Streaming error occurred
java.lang.RuntimeException: Outgoing stream handler has been closed
        at org.apache.cassandra.streaming.ConnectionHandler.sendMessage(ConnectionHandler.java:126)
        at org.apache.cassandra.streaming.StreamSession.receive(StreamSession.java:524)
        at org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:413)
        at org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:245)
        at java.lang.Thread.run(Thread.java:745)

Thanks in advance,
techpyaasa

On Wed, Sep 28, 2016 at 6:09 PM, Alain RODRIGUEZ wrote:

> Just saw a very similar question from Laxmikanth (laxmikanth524@gmail.com)
> on another thread, with the same logs.
>
> Would you mind not splitting this across multiple threads, and gathering
> the information in one place so we can better help you from this mailing list?
>
> C*heers,
>
> 2016-09-28 14:28 GMT+02:00 Alain RODRIGUEZ :
>
>> Hi,
>>
>> It looks like streams are failing right away when trying to rebuild.
>>
>> - Could you please share with us the command you used?
>>
>> It should be run from DC3 servers, after altering keyspace to add
>> keyspaces to the new datacenter. Is this the way you're doing it?
>>
>> - Are all the nodes using the same version ('nodetool version')?
>> - What does 'nodetool status keyspace_name1' output?
>> - Are you sure you are using Network Topology Strategy on 'keyspace_name1'?
>>   Have you modified this schema to add replication on DC3?
>>
>> My guess is something could be wrong with the configuration.
>>
>>> I checked with our network operations team; they have confirmed the
>>> network is stable, with no network hiccups.
>>> I have set 'streaming_socket_timeout_in_ms: 86400000' (24 hours) as
>>> suggested in the DataStax blog - https://support.datastax.com/hc/en-us/articles/206502913-FAQ-How-to-reduce-the-impact-of-streaming-errors-or-failures
>>> and ran 'nodetool rebuild' one node at a time, but it was of NO USE.
>>> We are still getting the above exception.
>>
>> This looks correct to me, good you added this information, thanks.
>>
>> Another thought: I believe you need all the nodes to be up in the origin
>> DC you use for your 'nodetool rebuild <origin_dc>' command for those
>> streams to work.
>>
>> This looks a bit weird, good luck.
>>
>> C*heers,
>> -----------------------
>> Alain Rodriguez - @arodream - alain@thelastpickle.com
>> France
>>
>> The Last Pickle - Apache Cassandra Consulting
>> http://www.thelastpickle.com
>>
>> 2016-09-27 18:54 GMT+02:00 techpyaasa . :
>>
>>> Hi,
>>>
>>> I'm trying to add a new data center, DC3, to an existing c*-2.0.17 cluster
>>> with 2 data centers, DC1 and DC2, with replication DC1:3, DC2:3, DC3:3.
>>>
>>> I'm getting the following exception repeatedly on the new nodes after I run
>>> 'nodetool rebuild'.
>>>
>>> DEBUG [ScheduledTasks:1] 2016-09-27 04:24:00,416 GCInspector.java (line 118) GC for ParNew: 20 ms for 1 collections, 9837479688 used; max is 16760438784
>>> DEBUG [ScheduledTasks:1] 2016-09-27 04:24:03,417 GCInspector.java (line 118) GC for ParNew: 20 ms for 1 collections, 9871193904 used; max is 16760438784
>>> DEBUG [ScheduledTasks:1] 2016-09-27 04:24:06,418 GCInspector.java (line 118) GC for ParNew: 20 ms for 1 collections, 9950298136 used; max is 16760438784
>>> DEBUG [ScheduledTasks:1] 2016-09-27 04:24:09,419 GCInspector.java (line 118) GC for ParNew: 19 ms for 1 collections, 9941119568 used; max is 16760438784
>>> DEBUG [ScheduledTasks:1] 2016-09-27 04:24:12,421 GCInspector.java (line 118) GC for ParNew: 20 ms for 1 collections, 9864185024 used; max is 16760438784
>>> DEBUG [ScheduledTasks:1] 2016-09-27 04:24:15,422 GCInspector.java (line 118) GC for ParNew: 60 ms for 2 collections, 9730374352 used; max is 16760438784
>>> DEBUG [ScheduledTasks:1] 2016-09-27 04:24:18,423 GCInspector.java (line 118) GC for ParNew: 18 ms for 1 collections, 9775448168 used; max is 16760438784
>>> DEBUG [ScheduledTasks:1] 2016-09-27 04:24:21,424 GCInspector.java (line 118) GC for ParNew: 22 ms for 1 collections, 9850794272 used; max is 16760438784
>>> DEBUG [ScheduledTasks:1] 2016-09-27 04:24:24,425 GCInspector.java (line 118) GC for ParNew: 20 ms for 1 collections, 9729992448 used; max is 16760438784
>>> DEBUG [ScheduledTasks:1] 2016-09-27 04:24:27,426 GCInspector.java (line 118) GC for ParNew: 22 ms for 1 collections, 9699783920 used; max is 16760438784
>>> DEBUG [ScheduledTasks:1] 2016-09-27 04:24:30,427 GCInspector.java (line 118) GC for ParNew: 21 ms for 1 collections, 9696523920 used; max is 16760438784
>>> DEBUG [ScheduledTasks:1] 2016-09-27 04:24:33,429 GCInspector.java (line 118) GC for ParNew: 20 ms for 1 collections, 9560497904 used; max is 16760438784
>>> DEBUG [ScheduledTasks:1] 2016-09-27 04:24:36,430 GCInspector.java (line 118) GC for ParNew: 19 ms for 1 collections, 9568718352 used; max is 16760438784
>>> DEBUG [ScheduledTasks:1] 2016-09-27 04:24:39,431 GCInspector.java (line 118) GC for ParNew: 22 ms for 1 collections, 9496991384 used; max is 16760438784
>>> DEBUG [ScheduledTasks:1] 2016-09-27 04:24:42,432 GCInspector.java (line 118) GC for ParNew: 19 ms for 1 collections, 9486433840 used; max is 16760438784
>>> DEBUG [ScheduledTasks:1] 2016-09-27 04:24:45,434 GCInspector.java (line 118) GC for ParNew: 19 ms for 1 collections, 9442642688 used; max is 16760438784
>>> DEBUG [ScheduledTasks:1] 2016-09-27 04:24:48,435 GCInspector.java (line 118) GC for ParNew: 20 ms for 1 collections, 9548532008 used; max is 16760438784
>>> DEBUG [STREAM-IN-/xxx.xxx.98.168] 2016-09-27 04:24:49,756 ConnectionHandler.java (line 244) [Stream #5e1b7f40-8496-11e6-8847-1b88665e430d] Received File (Header (cfId: bf446a90-71c5-3552-a2e5-b1b94dbf86e3, #0, version: jb, estimated keys: 252928, transfer size: 5496759656, compressed?: true), file: /home/cassandra/data_directories/data/keyspace_name1/columnfamily_1/keyspace_name1-columnfamily_1-tmp-jb-54-Data.db)
>>> DEBUG [STREAM-OUT-/xxx.xxx.98.168] 2016-09-27 04:24:49,757 ConnectionHandler.java (line 310) [Stream #5e1b7f40-8496-11e6-8847-1b88665e430d] Sending Received (bf446a90-71c5-3552-a2e5-b1b94dbf86e3, #0)
>>> ERROR [STREAM-OUT-/xxx.xxx.98.168] 2016-09-27 04:24:49,759 StreamSession.java (line 461) [Stream #5e1b7f40-8496-11e6-8847-1b88665e430d] Streaming error occurred
>>> java.io.IOException: Connection timed out
>>>     at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
>>>     at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
>>>     at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
>>>     at sun.nio.ch.IOUtil.write(IOUtil.java:65)
>>>     at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:487)
>>>     at org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:44)
>>>     at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:339)
>>>     at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:311)
>>>     at java.lang.Thread.run(Thread.java:745)
>>> DEBUG [STREAM-OUT-/xxx.xxx.98.168] 2016-09-27 04:24:49,764 ConnectionHandler.java (line 104) [Stream #5e1b7f40-8496-11e6-8847-1b88665e430d] Closing stream connection handler on /xxx.xxx.98.168
>>>  INFO [STREAM-OUT-/xxx.xxx.98.168] 2016-09-27 04:24:49,764 StreamResultFuture.java (line 186) [Stream #5e1b7f40-8496-11e6-8847-1b88665e430d] Session with /xxx.xxx.98.168 is complete
>>> ERROR [STREAM-OUT-/xxx.xxx.98.168] 2016-09-27 04:24:49,764 StreamSession.java (line 461) [Stream #5e1b7f40-8496-11e6-8847-1b88665e430d] Streaming error occurred
>>> java.io.IOException: Broken pipe
>>>     at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
>>>     at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
>>>     at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
>>>     at sun.nio.ch.IOUtil.write(IOUtil.java:65)
>>>     at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:487)
>>>     at org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:44)
>>>     at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:339)
>>>     at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:319)
>>>     at java.lang.Thread.run(Thread.java:745)
>>> DEBUG [STREAM-IN-/xxx.xxx.98.168] 2016-09-27 04:24:49,909 ConnectionHandler.java (line 244) [Stream #5e1b7f40-8496-11e6-8847-1b88665e430d] Received File (Header (cfId: 68af9ee0-96f8-3b1d-a418-e5ae844f2cc2, #3, version: jb, estimated keys: 4736, transfer size: 2306880, compressed?: true), file: /home/cassandra/data_directories/data/keyspace_name1/archiving_metadata/keyspace_name1-archiving_metadata-tmp-jb-27-Data.db)
>>> ERROR [STREAM-IN-/xxx.xxx.98.168] 2016-09-27 04:24:49,909 StreamSession.java (line 461) [Stream #5e1b7f40-8496-11e6-8847-1b88665e430d] Streaming error occurred
>>> java.lang.RuntimeException: Outgoing stream handler has been closed
>>>     at org.apache.cassandra.streaming.ConnectionHandler.sendMessage(ConnectionHandler.java:126)
>>>     at org.apache.cassandra.streaming.StreamSession.receive(StreamSession.java:524)
>>>     at org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:413)
>>>     at org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:245)
>>>     at java.lang.Thread.run(Thread.java:745)
>>>
>>> I checked with our network operations team; they have confirmed the
>>> network is stable, with no network hiccups.
>>> I have set 'streaming_socket_timeout_in_ms: 86400000' (24 hours) as
>>> suggested in the DataStax blog - https://support.datastax.com/hc/en-us/articles/206502913-FAQ-How-to-reduce-the-impact-of-streaming-errors-or-failures
>>> and ran 'nodetool rebuild' one node at a time, but it was of NO USE.
>>> We are still getting the above exception.
>>>
>>> Can someone please help me in debugging and fixing this?
>>>
>>> Thanks,
>>> techpyaasa
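
For reference, here is a rough sketch of what the keepalive values proposed in this thread would mean in practice. It assumes the usual Linux semantics: the first probe is sent after tcp_keepalive_time seconds of idle, further probes follow every tcp_keepalive_intvl seconds, and the connection is dropped after tcp_keepalive_probes unanswered probes. The stock Linux defaults (7200 / 9 / 75) are included only for comparison, and the function name is made up for illustration. Whether these values resolve the streaming failure is not something the thread confirms.

```python
def keepalive_detection_seconds(time_s: int, probes: int, intvl_s: int) -> int:
    """Approximate seconds until an idle, dead TCP peer is dropped.

    Hypothetical helper: idle time before the first probe, plus one
    interval for each probe that goes unanswered.
    """
    return time_s + probes * intvl_s


# Values proposed in the thread for the new DC3 nodes.
proposed = keepalive_detection_seconds(time_s=60, probes=3, intvl_s=10)

# Stock Linux defaults, shown only for comparison.
linux_default = keepalive_detection_seconds(time_s=7200, probes=9, intvl_s=75)

print(f"proposed settings: first probe after 60 s, dead peer dropped after ~{proposed} s")
print(f"Linux defaults: first probe after 7200 s, dead peer dropped after ~{linux_default} s")
# proposed is 90, linux_default is 7875
```

With the proposed values an idle streaming connection would be probed after one minute instead of two hours, which would matter if a device between the data centers silently drops idle connections.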