I have a three-node Ignite 2.6 cluster set up with the following config:
node1:49500
node2:49500
node3:49500
I used this command to start the Ignite service on all three nodes:
./ignite.sh -J-Xmx32000m -J-Xms32000m -J-XX:+UseG1GC
-J-XX:+ScavengeBeforeFullGC -J-XX:+DisableExplicitGC -J-XX:+AlwaysPreTouch
-J-XX:+PrintGCDetails -J-XX:+PrintGCTimeStamps -J-XX:+PrintGCDateStamps
-J-XX:+PrintAdaptiveSizePolicy -XX:+PrintGCApplicationStoppedTime
-XX:+PrintGCApplicationConcurrentTime
-J-Xloggc:/spare/ignite/log/ignitegc-$(date +%Y_%m_%d-%H_%M).log
config/persistent-config.xml
When I use the Spark DataFrame API to ingest data into this cluster, the
cluster freezes after some time and no new data can be ingested into Ignite.
Both the client (Spark executor) and the server show the "Unable to await
partitions release latch within timeout: ServerLatch" exception. On the
server it starts at line 51834 of the full log and looks like this:
[2018-07-25T09:45:42,177][WARN ][exchange-worker-#162][GridDhtPartitionsExchangeFuture]
Unable to await partitions release latch within timeout: ServerLatch [permits=2,
pendingAcks=[429edc2b-eb14-414f-a978-9bfe35443c8c,
6783732c-9a13-466f-800a-ad4c8d9be3bf], super=CompletableLatch
[id=exchange, topVer=AffinityTopologyVersion [topVer=239, minorTopVer=0]]]
Here's the full log from the server node that shows the exception:
07-25.zip