From: piali
To: user@ignite.apache.org
Date: Tue, 29 Nov 2016 21:27:21 -0700 (MST)
Subject: Unable to create cluster of Apache Ignite Server Containers running on individual VMs

Hi,


I am trying to create a cluster of Apache Ignite server containers but am unable to bring it up.


Setup:

To start, I have created two VMs on two separate host machines and am trying to launch one Apache Ignite server container (Docker) on each VM.

The VMs are accessible using floating IPs (e.g., VM1-172.26.116.67, VM2-172.26.116.150) and the containers are using host networking.

The containers can also ping each other.


Testing:

I am using $IGNITE_HOME/bin/ignite.sh, but have changed the default configuration to enable discovery.

    <bean id="grid.cfg" class="org.apache.ignite.configuration.IgniteConfiguration">
      <property name="cacheConfiguration">
        <bean class="org.apache.ignite.configuration.CacheConfiguration">
          <property name="offHeapMaxMemory" value="0"/>
        </bean>
      </property>

      <!-- Explicitly configure TCP discovery SPI to provide list of initial nodes. -->
        <property name="discoverySpi">
            <bean class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">
                <property name="localPort" value="47500"/>
                <property name="networkTimeout" value="20000" />
                <property name="ipFinder">
                    <!--bean class="org.apache.ignite.spi.discovery.tcp.ipfinder.vm.TcpDiscoveryVmIpFinder" -->
                    <bean class="org.apache.ignite.spi.discovery.tcp.ipfinder.cloud.TcpDiscoveryCloudIpFinder"/>
                        <property name="addresses">
                            <list>
                                <!-- In distributed environment, replace with actual host IP address. -->
                                <value>127.0.0.1:47100..47509</value>
                                <value>172.26.116.67:47100..47509</value>
                            </list>
                        </property>
                    </bean>
                </property>
                <property name="ackTimeout" value="50"/>
                <property name="socketTimeout" value="200"/>
                <property name="heartbeatFrequency" value="100"/>
            </bean>
        </property>
        <property name="communicationSpi">
          <bean class="org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi">
             <!-- Override local port. -->
             <property name="localPort" value="47100"/>
             <property name="sharedMemoryPort" value="-1"/>
          </bean>
      </property>
    </bean>
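
The `addresses` entries above use the `host:startPort..endPort` range form. As a rough sanity check, and not something Ignite itself provides, the Python sketch below expands such a range and reports which of the configured local ports fall inside it. The helper name `ports_in_range` is made up for illustration, and the sketch assumes the range is inclusive at both ends.

```python
def ports_in_range(address: str) -> range:
    """Expand 'host:start..end' (or 'host:port') into an inclusive port range."""
    # Hypothetical helper: only the part after the last ':' is inspected.
    _, _, port_spec = address.rpartition(":")
    start, sep, end = port_spec.partition("..")
    first = int(start)
    last = int(end) if sep else first
    return range(first, last + 1)


# Values copied from the configuration above.
addresses = ["127.0.0.1:47100..47509", "172.26.116.67:47100..47509"]
local_ports = {"discoverySpi": 47500, "communicationSpi": 47100}

# For each address entry, list the SPIs whose local port lies inside the range.
overlaps = {
    addr: sorted(
        name for name, port in local_ports.items() if port in ports_in_range(addr)
    )
    for addr in addresses
}

for addr, names in overlaps.items():
    print(addr, "covers local ports of:", ", ".join(names))
```

With the values as posted, both ranges include the communication SPI port (47100) as well as the discovery SPI port (47500). Whether that overlap is related to the errors in the logs is not established by this check.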

Issue:

When I start the first Apache Ignite server container on VM1, I see warnings about long GC pauses on a remote node, even though I tuned off-heap memory (<property name="offHeapMaxMemory" value="0"/>) and no remote node is running (verified through Visor).


[22:55:24,132][INFO][main][TcpCommunicationSpi] Successfully bound to TCP port [port=47100, locHost=0.0.0.0/0.0.0.0]
[22:55:24,854][INFO][main][TcpDiscoverySpi] Successfully bound to TCP port [port=47500, localHost=0.0.0.0/0.0.0.0]
[22:55:26,379][WARNING][main][TcpDiscoverySpi] Timed out waiting for message to be read (most probably, the reason is in long GC pauses on remote node) [curTimeout=50]
[22:55:26,482][WARNING][main][TcpDiscoverySpi] Timed out waiting for message to be read (most probably, the reason is in long GC pauses on remote node) [curTimeout=100]
[22:55:26,684][WARNING][main][TcpDiscoverySpi] Timed out waiting for message to be read (most probably, the reason is in long GC pauses on remote node) [curTimeout=200]
[22:55:27,086][WARNING][main][TcpDiscoverySpi] Timed out waiting for message to be read (most probably, the reason is in long GC pauses on remote node) [curTimeout=400]
[22:55:27,888][WARNING][main][TcpDiscoverySpi] Timed out waiting for message to be read (most probably, the reason is in long GC pauses on remote node) [curTimeout=800]
[22:55:29,491][WARNING][main][TcpDiscoverySpi] Timed out waiting for message to be read (most probably, the reason is in long GC pauses on remote node) [curTimeout=1600]
[22:55:32,696][WARNING][main][TcpDiscoverySpi] Timed out waiting for message to be read (most probably, the reason is in long GC pauses on remote node) [curTimeout=3200]
[22:55:39,098][WARNING][main][TcpDiscoverySpi] Timed out waiting for message to be read (most probably, the reason is in long GC pauses on remote node) [curTimeout=6400]
[22:55:51,904][WARNING][main][TcpDiscoverySpi] Timed out waiting for message to be read (most probably, the reason is in long GC pauses on remote node) [curTimeout=12800]
[22:56:17,523][WARNING][main][TcpDiscoverySpi] Timed out waiting for message to be read (most probably, the reason is in long GC pauses on remote node) [curTimeout=25600]
[22:56:18,033][WARNING][main][GridCacheProcessor] Eviction policy not enabled with ONHEAP_TIERED mode for cache (entries will not be moved to off-heap store): default
[22:56:18,120][SEVERE][grid-nio-worker-1-#38%null%][GridDirectParser] Failed to read message [msg=null, buf=java.nio.DirectByteBuffer[pos=5 lim=420 cap=32768], reader=null, ses=GridSelectorNioSessionImpl [selectorIdx=1, queueSize=1, writeBuf=java.nio.DirectByteBuffer[pos=0 lim=32768 cap=32768], readBuf=java.nio.DirectByteBuffer[pos=5 lim=420 cap=32768], recovery=null, super=GridNioSessionImpl [locAddr=/127.0.0.1:47100, rmtAddr=/127.0.0.1:48819, createTime=1480460126370, closeTime=0, bytesSent=0, bytesRcvd=420, sndSchedTime=1480460178019, lastSndTime=1480460178114, lastRcvTime=1480460178114, readsPaused=false, filterChain=FilterChain[filters=[GridNioCodecFilter [parser=o.a.i.i.util.nio.GridDirectParser@330f5ec2, directMode=true], GridConnectionBytesVerifyFilter], accepted=true]]]
……………
[22:56:18,127][WARNING][grid-nio-worker-1-#38%null%][TcpCommunicationSpi] Failed to process selector key (will close): GridSelectorNioSessionImpl [selectorIdx=1, queueSize=1, writeBuf=java.nio.DirectByteBuffer[pos=0 lim=32768 cap=32768], readBuf=java.nio.DirectByteBuffer[pos=5 lim=420 cap=32768], recovery=null, super=GridNioSessionImpl [locAddr=/127.0.0.1:47100, rmtAddr=/127.0.0.1:48819, createTime=1480460126370, closeTime=0, bytesSent=0, bytesRcvd=420, sndSchedTime=1480460178019, lastSndTime=1480460178114, lastRcvTime=1480460178114, readsPaused=false, filterChain=FilterChain[filters=[GridNioCodecFilter [parser=o.a.i.i.util.nio.GridDirectParser@330f5ec2, directMode=true], GridConnectionBytesVerifyFilter], accepted=true]]
[22:56:18,127][SEVERE][grid-nio-worker-1-#38%null%][TcpCommunicationSpi] Closing NIO session because of unhandled exception.
……………..

After a few seconds this node starts with the logs below:

[22:56:18,573][INFO][exchange-worker-#47%null%][GridCachePartitionExchangeManager] Skipping rebalancing (nothing scheduled) [top=AffinityTopologyVersion [topVer=1, minorTopVer=0], evt=NODE_JOINED, node=6a09c54d-9cbd-42f8-bdb9-522dc438ce1c]
………
[22:56:18,633][INFO][main][GridDiscoveryManager] Topology snapshot [ver=1, servers=1, clients=0, CPUs=2, heap=1.0GB]
[22:56:28,053][INFO][ignite-update-notifier-timer][GridUpdateNotifier] Your version is up to date.
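
The `curTimeout` values in the VM1 warnings above double from 50 ms to 25600 ms. A small Python sketch, assuming each warning corresponds to one read attempt that waited for the full `curTimeout`, adds them up and compares the result with the gap between the first and last warning timestamps. The variable names are illustrative only.

```python
from datetime import datetime, timedelta

# curTimeout values (ms) from the ten TcpDiscoverySpi warnings on VM1.
timeouts_ms = [50 * 2**i for i in range(10)]  # 50, 100, ..., 25600

# Timestamps of the first and last warning, copied from the log.
fmt = "%H:%M:%S,%f"
first = datetime.strptime("22:55:26,379", fmt)
last = datetime.strptime("22:56:17,523", fmt)

total_wait_ms = sum(timeouts_ms)

# The first warning is logged after its own 50 ms wait has already elapsed,
# so the gap is compared against the sum of the remaining nine timeouts.
wait_after_first_ms = total_wait_ms - timeouts_ms[0]
elapsed_after_first_ms = (last - first) // timedelta(milliseconds=1)

print("sum of all timeouts:", total_wait_ms, "ms")
print("sum excluding the first:", wait_after_first_ms, "ms")
print("first-to-last warning gap:", elapsed_after_first_ms, "ms")
```

The timeouts after the first add up to 51100 ms, against a gap of 51144 ms between the first and last warning, so nearly all of that interval is spent waiting on timed-out reads.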

As soon as I start the second Apache Ignite server container on VM2, I get the logs below, although I have increased the default network timeout (<property name="networkTimeout" value="20000" />):

[23:18:10,313][INFO][main][TcpCommunicationSpi] Successfully bound to TCP port [port=47100, locHost=0.0.0.0/0.0.0.0]
[23:18:10,946][INFO][main][TcpDiscoverySpi] Successfully bound to TCP port [port=47500, localHost=0.0.0.0/0.0.0.0]
[23:18:12,410][WARNING][main][TcpDiscoverySpi] Timed out waiting for message to be read (most probably, the reason is in long GC pauses on remote node) [curTimeout=50]
[23:18:12,512][WARNING][main][TcpDiscoverySpi] Timed out waiting for message to be read (most probably, the reason is in long GC pauses on remote node) [curTimeout=100]
[23:18:12,715][WARNING][main][TcpDiscoverySpi] Timed out waiting for message to be read (most probably, the reason is in long GC pauses on remote node) [curTimeout=200]
[23:18:13,117][WARNING][main][TcpDiscoverySpi] Timed out waiting for message to be read (most probably, the reason is in long GC pauses on remote node) [curTimeout=400]
[23:18:13,919][WARNING][main][TcpDiscoverySpi] Timed out waiting for message to be read (most probably, the reason is in long GC pauses on remote node) [curTimeout=800]
[23:18:15,523][WARNING][main][TcpDiscoverySpi] Timed out waiting for message to be read (most probably, the reason is in long GC pauses on remote node) [curTimeout=1600]
[23:18:18,728][WARNING][main][TcpDiscoverySpi] Timed out waiting for message to be read (most probably, the reason is in long GC pauses on remote node) [curTimeout=3200]
[23:18:25,136][WARNING][main][TcpDiscoverySpi] Timed out waiting for message to be read (most probably, the reason is in long GC pauses on remote node) [curTimeout=6400]
[23:18:37,942][WARNING][main][TcpDiscoverySpi] Timed out waiting for message to be read (most probably, the reason is in long GC pauses on remote node) [curTimeout=12800]
[23:19:03,569][WARNING][main][TcpDiscoverySpi] Timed out waiting for message to be read (most probably, the reason is in long GC pauses on remote node) [curTimeout=25600]
…………….
[23:19:55,021][SEVERE][grid-nio-worker-1-#38%null%][GridDirectParser] Failed to read message [msg=null, buf=java.nio.DirectByteBuffer[pos=5 lim=420 cap=32768], reader=null, ses=GridSelectorNioSessionImpl [selectorIdx=1, queueSize=1, writeBuf=java.nio.DirectByteBuffer[pos=0 lim=32768 cap=32768], readBuf=java.nio.DirectByteBuffer[pos=5 lim=420 cap=32768], recovery=null, super=GridNioSessionImpl [locAddr=/172.20.29.33:47100, rmtAddr=/172.26.116.67:41952, createTime=1480461492404, closeTime=0, bytesSent=0, bytesRcvd=420, sndSchedTime=1480461594989, lastSndTime=1480461595001, lastRcvTime=1480461595001, readsPaused=false, filterChain=FilterChain[filters=[GridNioCodecFilter [parser=o.a.i.i.util.nio.GridDirectParser@9df8cc3, directMode=true], GridConnectionBytesVerifyFilter], accepted=true]]]
class org.apache.ignite.IgniteException: Invalid message type: -84
……………….
[23:19:55,058][WARNING][grid-nio-worker-0-#37%null%][TcpCommunicationSpi] Failed to process selector key (will close): GridSelectorNioSessionImpl [selectorIdx=0, queueSize=1, writeBuf=java.nio.DirectByteBuffer[pos=0 lim=32768 cap=32768], readBuf=java.nio.DirectByteBuffer[pos=5 lim=420 cap=32768], recovery=null, super=GridNioSessionImpl [locAddr=/172.20.29.33:47100, rmtAddr=/172.26.116.67:54660, createTime=1480461492362, closeTime=0, bytesSent=0, bytesRcvd=420, sndSchedTime=1480461594989, lastSndTime=1480461595011, lastRcvTime=1480461595011, readsPaused=false, filterChain=FilterChain[filters=[GridNioCodecFilter [parser=o.a.i.i.util.nio.GridDirectParser@9df8cc3, directMode=true], GridConnectionBytesVerifyFilter], accepted=true]]
[23:19:55,058][SEVERE][grid-nio-worker-0-#37%null%][TcpCommunicationSpi] Closing NIO session because of unhandled exception.
………………
[23:19:55,610][INFO][exchange-worker-#47%null%][GridCachePartitionExchangeManager] Skipping rebalancing (nothing scheduled) [top=AffinityTopologyVersion [topVer=1, minorTopVer=0], evt=NODE_JOINED, node=1af31843-483e-42b0-90e3-0afc7f772d61]
………………
[23:19:55,351][WARNING][main][GridCacheProcessor] Eviction policy not enabled with ONHEAP_TIERED mode for cache (entries will not be moved to off-heap store): default
………………….
[23:19:55,810][INFO][main][GridDiscoveryManager] Topology snapshot [ver=1, servers=1, clients=0, CPUs=2, heap=1.0GB]
[23:20:05,051][INFO][ignite-update-notifier-timer][GridUpdateNotifier] Your version is up to date.
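
The `Invalid message type: -84` error in the VM2 log above reports the value as a signed byte. As a small worked example in Python, the sketch below converts it to its unsigned form and compares it with the first byte of the Java object serialization stream header (`0xACED`). That the rejected bytes were Java-serialized data is only one possible reading, not something the log confirms.

```python
import struct

# STREAM_MAGIC from java.io.ObjectStreamConstants.
JAVA_SERIALIZATION_MAGIC = 0xACED

signed_type = -84                   # value reported in the IgniteException
unsigned_type = signed_type & 0xFF  # reinterpret the signed byte as unsigned

# Cross-check with struct: pack as a signed byte, unpack as an unsigned byte.
(roundtrip,) = struct.unpack("B", struct.pack("b", signed_type))

# High-order byte of the two-byte magic number.
magic_first_byte = JAVA_SERIALIZATION_MAGIC >> 8

print("unsigned value:", hex(unsigned_type))
print("matches first byte of 0xACED:", unsigned_type == magic_first_byte)
```

The signed value -84 corresponds to the unsigned byte `0xAC`, which is the first byte of that stream header.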

The servers are not joining. Can you please help?

I am attaching both log files for reference. Let me know if you need any further information.

Thanks & Regards,
Piali Mazumder Nath

VM1-ignite-6a09c54d.0.log (169K)
VM2-ignite-1af31843.0.log (152K)

View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Unable-to-create-cluster-of-Apache-Ignite-Server-Containers-running-on-individual-VMs-tp9287.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.