ignite-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Aaron" <aa...@tophold.com>
Subject Re:Re: TcpCommunicationSpi - Caught unhandled exception in NIO worker thread (restart the node) java.lang.OutOfMemoryError: Java heap space
Date Fri, 21 Jul 2017 14:55:51 GMT

Thansk Andrey! thanks for your quick response!!
if set this as false, will this meaning total cache data be stored in JVM? so we can not leverage
big off heap cache adventage?  also If i disable this seem the evictionpolicy should not
be set neither:
org.apache.ignite.IgniteCheckedException: Onheap cache must be enabled if eviction policy
is configured 
Will this mean the cache will increase infinite?  also will this conflict with the memory
policy?  what's they relationship ? 
in the DEBUG log found a lot Timeout message, is this normal?
[DEBUG] 2017-07-21 06:46:12.682 [grid-timeout-worker-#23%null%] [ig] GridTimeoutProcessor
- Timeout has occurred: GridCommunicationMessageSet [nodeId=58be665c-d5f8-4328-897e-962e183ed6b8,
endTime=1500619572679, timeoutId=19f75e36d51-9528b3e9-0216-47d4-a59e-aa5c34542a36, topic=T4
[topic=TOPIC_CACHE, id1=5dc785b4-dd3d-3c3b-b270-c5fe2d7ed9a2, id2=0a29b294-b9ab-4f7a-9999-e1c483c37fa0,
id3=1], plc=2, msgs=[GridTuple3 [val1=GridIoMessage [plc=2, topic=T4 [topic=TOPIC_CACHE, id1=5dc785b4-dd3d-3c3b-b270-c5fe2d7ed9a2,
id2=0a29b294-b9ab-4f7a-9999-e1c483c37fa0, id3=1], topicOrd=-1, ordered=true, timeout=0, skipOnTimeout=true,
msg=GridContinuousMessage [type=MSG_EVT_NOTIFICATION, routineId=940e6318-db67-44a4-9967-d7876a33d5f5,
data=null, futId=null]], val2=1500619572679, val3=GridNioMessageTracker [ses=GridSelectorNioSessionImpl
[worker=DirectNioClientWorker [super=AbstractNioClientWorker [selector=sun.nio.ch.EPollSelectorImpl@1458ed9c,
idx=2, bytesRcvd=672, bytesSent=28, bytesRcvd0=672, bytesSent0=28, select=true, super=GridWorker
[name=grid-nio-worker-tcp-comm-2, igniteInstanceName=null, finished=false, hashCode=1599483950,
interrupted=false, runner=grid-nio-worker-tcp-comm-2-#27%null%]]], writeBuf=java.nio.DirectByteBuffer[pos=0
lim=32768 cap=32768], readBuf=java.nio.DirectByteBuffer[pos=0 lim=32768 cap=32768], inRecovery=GridNioRecoveryDescriptor
[acked=0, resendCnt=0, rcvCnt=1, sentCnt=0, reserved=true, lastAck=0, nodeLeft=false, node=TcpDiscoveryNode
[id=58be665c-d5f8-4328-897e-962e183ed6b8, addrs=[10.31.23.18], sockAddrs=[iZuf62zdiq684kn72aatgjZ/10.31.23.18:47507],
discPort=47507, order=8, intOrder=8, lastExchangeTime=1500619572257, loc=false, ver=2.0.0#20170430-sha1:d4eef3c6,
isClient=false], connected=true, connectCnt=0, queueLimit=128000, reserveCnt=1, pairedConnections=false],
outRecovery=GridNioRecoveryDescriptor [acked=0, resendCnt=0, rcvCnt=1, sentCnt=0, reserved=true,
lastAck=0, nodeLeft=false, node=TcpDiscoveryNode [id=58be665c-d5f8-4328-897e-962e183ed6b8,
addrs=[10.31.23.18], sockAddrs=[iZuf62zdiq684kn72aatgjZ/10.31.23.18:47507], discPort=47507,
order=8, intOrder=8, lastExchangeTime=1500619572257, loc=false, ver=2.0.0#20170430-sha1:d4eef3c6,
isClient=false], connected=true, connectCnt=0, queueLimit=128000, reserveCnt=1, pairedConnections=false],
super=GridNioSessionImpl [locAddr=/10.31.23.18:47111, rmtAddr=/10.31.23.18:52540, createTime=1500619572668,
closeTime=0, bytesSent=28, bytesRcvd=672, bytesSent0=28, bytesRcvd0=672, sndSchedTime=1500619572668,
lastSndTime=1500619572668, lastRcvTime=1500619572668, readsPaused=false, filterChain=FilterChain[filters=[GridNioCodecFilter
[parser=org.apache.ignite.internal.util.nio.GridDirectParser@141e5bef, directMode=true], GridConnectionBytesVerifyFilter],
accepted=true]], msgQueueLimit=1000, msgCnt=0, paused=false, hash=686703677]]], reserved=false,
timeout=5000, skipOnTimeout=true, lastTs=1500619572679]

Also if i use the JDBC IP finder, seem when the Ignite exit the JDBC data some time not be
updated, or need I add a hook manuall to do the clean up? otherwise this will left some stale
data there, is this normal? 
Thanks again for your time! have a good weekend!

RegardsAaron


:Re: Re: TcpCommunicationSpi - Caught unhandled exception in NIO worker thread (restart
the node) java.lang.OutOfMemoryError: Java heap space
Hi Aaron,
1. You have  <property name="onheapCacheEnabled" value="true"/> for caches that force
Ignite to have a copy of data in OnHeap.2. You have DEBUG log level, so you see a lot of
messages.3. I see no OOM issues in logs. Try to get a heap dump when OOM occurs next time
to investigate who helds waste data. 
On Fri, Jul 21, 2017 at 1:26 PM, Andrey Mashenkov <andrey.mashenkov@gmail.com> wrote:
Hi Aaron,
I'll take a look at logs.
Yes, every node should have own work directory as there can be races on Marshaller work files
creation\deletion.

On Fri, Jul 21, 2017 at 10:58 AM, aaron@tophold.com <aaron@tophold.com> wrote:
Hi Andrey
Thanks!  I set the IPV4 = true when JVM bring up.  Other logs and configuration as attached. 
conf.zip: all notes's configuration log.zip include the log for some node,   price: which
feed the price to the grid, frequent may hundreds or thousands/seconds. Price cache with  node
filter only nodes care price, like the transaction node. But now in fact no transaction loaded.
so no much data in the grid only hundreds. the web is totally a client node, nothing configured
as default. top.json,   topology of the entire gridIP finder is by JDBC way. 
The log may reach 10+G several mins... 
From log GridDhtLocalPartition  & GridTimeoutProcessor occupy a lot message changes
SPI:
TcpDiscoverySpi spi = new TcpDiscoverySpi();
spi.setJoinTimeout(60_1000l * 5);
spi.setNetworkTimeout(3000);

MemoryEventStorageSpi spi = new MemoryEventStorageSpi();
spi.setExpireAgeMs(60 * 60_1000l); //1hrs


TcpCommunicationSpi spi = new TcpCommunicationSpi();
if (!StringUtils.isEmpty(localAddress)) {
	System.out.print("Local address overwrite to: " + localAddress);
	spi.setLocalAddress(localAddress);
}

spi.setConnectTimeout(3000);
spi.setMessageQueueLimit(1000);
spi.setSlowClientQueueLimit(500);
cfg.setCommunicationSpi(spi);

JVM
java -Xms4g -Xmx4g -server -XX:+AggressiveOpts -XX:MaxMetaspaceSize=256m -XX:+UseG1GC -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 -XX:+DisableExplicitGC -Xloggc:/var/tmp/qa/query/1500619508_6525/query-gc.log -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps  -DLOCAL_ADDRESS=10.31.23.18  -Djava.awt.headless=true -DFIND_STRATEGY=jdbc -DIGNITE_THREAD_DUMP_ON_EXCHANGE_TIMEOUT=true -Duser.timezone=UTC -Djava.net.preferIPv4Stack=true -Dapp.instance.cnt=1500619508_6525 -Dapp.instance=query -Dapp.env=qa -Dapp.user=th_qa -Dtemp.dir=/var/tmp/qa/query/1500619508_6525 -DMEM_POLICY=xl
When run in local I find a lot those exception, will need to specific the work directory
cfg.setWorkDirectory(); for each node if they are kick off on same path?
MarshallerMappingFileStore - Failed to write class name to file [platformId=0id=-556541136, clsName=com.tophold.trade.ignite.service.CommandRemoteService, file=D:\workspace\code\trade_engine\work\marshaller\-556541136.classname0]
java.io.FileNotFoundException: D:\workspace\code\trade_engine\work\marshaller\-556541136.classname0 
	at java.io.FileOutputStream.open0(Native Method) ~[?:1.8.0_102]
	at java.io.FileOutputStream.open(FileOutputStream.java:270) ~[?:1.8.0_102]
	at java.io.FileOutputStream.<init>(FileOutputStream.java:213) ~[?:1.8.0_102]
	at java.io.FileOutputStream.<init>(FileOutputStream.java:162) ~[?:1.8.0_102]
	at org.apache.ignite.internal.MarshallerMappingFileStore.writeMapping(MarshallerMappingFileStore.java:79) [ignite-core-2.0.0.jar:2.0.0]
	at org.apache.ignite.internal.MappingStoreTask.run(MappingStoreTask.java:57) [ignite-core-2.0.0.jar:2.0.0]
	at org.apache.ignite.internal.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:6677) [ignite-core-2.0.0.jar:2.0.0]
	at org.apache.ignite.internal.processors.closure.GridClosureProcessor$1.body(GridClosureProcessor.java:812) [ignite-core-2.0.0.jar:2.0.0]
	at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110) [ignite-core-2.0.0.jar:2.0.0]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_102]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_102]
	at java.lang.Thread.run(Thread.java:745) [?:1.8.0_102]

Thanks again for your time very appreciated!!

ThanksAaronaaron@tophold.com From: Andrey MashenkovDate: 2017-07-20 22:36To: userSubject: Re:
Re: TcpCommunicationSpi - Caught unhandled exception in NIO worker thread (restart the node)
java.lang.OutOfMemoryError: Java heap spaceHi,
Is it possible nodes can't see each other via communication ports or use different IP protocol
(IPv4 and IPv6)?Can you share grid configuration and logs?
On Thu, Jul 20, 2017 at 7:19 AM, aaron@tophold.com <aaron@tophold.com> wrote:
Thanks Andrew, But really Ignite have so many communication message between nodes. I know
default the includeEventTypes is empty, but if I open the debug model, still saw a lot information
changes between nodes.  there are so many even I can see a lot Timeout even we just keep a
empty grid running...
[DEBUG] 2017-07-20 02:05:16.894 [grid-timeout-worker-#23%WebIgnite%] [ig] GridTimeoutProcessor - Timeout has occurred: GridCommunicationMessageSet [nodeId=eedf67fb-6a01-4eb3-b4ac-e23446181ca9, endTime=1500516316887, timeoutId=24a304d5d51-49983eab-7747-47a8-912d-d06b45404385, topic=T4 [topic=TOPIC_CACHE, id1=5dc785b4-dd3d-3c3b-b270-c5fe2d7ed9a2, id2=e90fb911-c52d-440a-a651-8f624ac7ece5, id3=1], plc=2, msgs=[], reserved=false, timeout=5000, skipOnTimeout=true, lastTs=1500508078116]


RegardsAaronaaron@tophold.com From: Andrey MashenkovDate: 2017-07-20 00:12To: userSubject: Re:
TcpCommunicationSpi - Caught unhandled exception in NIO worker thread (restart the node) java.lang.OutOfMemoryError:
Java heap spaceHi,
From ignite-2.0, TcpCommunicationSpi has messageQueueLimit=0.That means outbound message
queue is unlimited. You should see a warning in logs on node startup.
Try to set messageQueueLimit to smth >0.It was 1024 by default in previous versions of
ignite.
On Wed, Jul 19, 2017 at 12:56 PM, aaron@tophold.com <aaron@tophold.com> wrote:
Hi All, We have instance , in fact no operation and  data exist yet, but the memory usage
keep going on until to 6G~
From the heap dump most objects are GridNioRecoveryDescriptor take 4G; is this expected?  
my configuration almost same as the default configuration xml + a cache with CacheJdbcPojoStore;while
in fact there are only hundreds of data in DB now.  we keep get OOM exception as below,
then the node just quite from the cluster.  thanks for your time!!!







[ERROR] 2017-07-19 09:50:51.171 [grid-nio-worker-tcp-comm-4-#29%PortfolioEventIgnite%] [ig]
TcpCommunicationSpi - Caught unhandled exception in NIO worker thread (restart the node).java.lang.OutOfMemoryError:
Java heap space        at java.util.ArrayDeque.allocateElements(ArrayDeque.java:142) ~[?:1.8.0_131] 
      at java.util.ArrayDeque.<init>(ArrayDeque.java:198) ~[?:1.8.0_131]     
  at org.apache.ignite.internal.util.nio.GridNioRecoveryDescriptor.<init>(GridNioRecoveryDescriptor.java:99)
~[ignite-core-2.0.0.jar!/:2.0.0]        at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.recoveryDescriptor(TcpCommunicationSpi.java:3534)
~[ignite-core-2.0.0.jar!/:2.0.0]        at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.inRecoveryDescriptor(TcpCommunicationSpi.java:3493)
~[ignite-core-2.0.0.jar!/:2.0.0]        at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.access$1100(TcpCommunicationSpi.java:241)
~[ignite-core-2.0.0.jar!/:2.0.0]        at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$2.onFirstMessage(TcpCommunicationSpi.java:537)
~[ignite-core-2.0.0.jar!/:2.0.0]        at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$2.onMessage(TcpCommunicationSpi.java:626)
~[ignite-core-2.0.0.jar!/:2.0.0]        at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$2.onMessage(TcpCommunicationSpi.java:342)
~[ignite-core-2.0.0.jar!/:2.0.0]        at org.apache.ignite.internal.util.nio.GridNioFilterChain$TailFilter.onMessageReceived(GridNioFilterChain.java:279)
~[ignite-core-2.0.0.jar!/:2.0.0]        at org.apache.ignite.internal.util.nio.GridNioFilterAdapter.proceedMessageReceived(GridNioFilterAdapter.java:109)
~[ignite-core-2.0.0.jar!/:2.0.0]        at org.apache.ignite.internal.util.nio.GridNioCodecFilter.onMessageReceived(GridNioCodecFilter.java:117)
~[ignite-core-2.0.0.jar!/:2.0.0]        at org.apache.ignite.internal.util.nio.GridNioFilterAdapter.proceedMessageReceived(GridNioFilterAdapter.java:109)
~[ignite-core-2.0.0.jar!/:2.0.0]        at org.apache.ignite.internal.util.nio.GridConnectionBytesVerifyFilter.onMessageReceived(GridConnectionBytesVerifyFilter.java:133)
~[ignite-core-2.0.0.jar!/:2.0.0]        at org.apache.ignite.internal.util.nio.GridNioFilterAdapter.proceedMessageReceived(GridNioFilterAdapter.java:109)
~[ignite-core-2.0.0.jar!/:2.0.0]        at org.apache.ignite.internal.util.nio.GridNioServer$HeadFilter.onMessageReceived(GridNioServer.java:3062)
~[ignite-core-2.0.0.jar!/:2.0.0]        at org.apache.ignite.internal.util.nio.GridNioFilterChain.onMessageReceived(GridNioFilterChain.java:175)
~[ignite-core-2.0.0.jar!/:2.0.0]        at org.apache.ignite.internal.util.nio.GridNioServer$DirectNioClientWorker.processRead(GridNioServer.java:1121)
~[ignite-core-2.0.0.jar!/:2.0.0]        at org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.processSelectedKeysOptimized(GridNioServer.java:2031)
~[ignite-core-2.0.0.jar!/:2.0.0]        at org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.bodyInternal(GridNioServer.java:1868)
~[ignite-core-2.0.0.jar!/:2.0.0]        at org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.body(GridNioServer.java:1573)
[ignite-core-2.0.0.jar!/:2.0.0]        at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
[ignite-core-2.0.0.jar!/:2.0.0]        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_131][ERROR]
2017-07-19 09:50:51.171 [grid-nio-worker-tcp-comm-4-#29%PortfolioEventIgnite%] [ig] TcpCommunicationSpi
- Runtime error caught during grid runnable execution: DirectNioClientWorker [super=AbstractNioClientWorker
[selector=sun.nio.ch.EPollSelectorImpl@11c850c6, idx=4, bytesRcvd=42, bytesSent=18, bytesRcvd0=42,
bytesSent0=18, select=false, super=GridWorker [name=grid-nio-worker-tcp-comm-4, igniteInstanceName=PortfolioEventIgnite,
finished=false, hashCode=1422574651, interrupted=false, runner=grid-nio-worker-tcp-comm-4-#29%PortfolioEventIgnite%]]]java.lang.OutOfMemoryError:
Java heap space        at java.util.ArrayDeque.allocateElements(ArrayDeque.java:142) ~[?:1.8.0_131] 
      at java.util.ArrayDeque.<init>(ArrayDeque.java:198) ~[?:1.8.0_131]     
  at org.apache.ignite.internal.util.nio.GridNioRecoveryDescriptor.<init>(GridNioRecoveryDescriptor.java:99)
~[ignite-core-2.0.0.jar!/:2.0.0]        at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.recoveryDescriptor(TcpCommunicationSpi.java:3534)
~[ignite-core-2.0.0.jar!/:2.0.0]        at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.inRecoveryDescriptor(TcpCommunicationSpi.java:3493)
~[ignite-core-2.0.0.jar!/:2.0.0]        at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.access$1100(TcpCommunicationSpi.java:241)
~[ignite-core-2.0.0.jar!/:2.0.0]        at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$2.onFirstMessage(TcpCommunicationSpi.java:537)
~[ignite-core-2.0.0.jar!/:2.0.0]        at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$2.onMessage(TcpCommunicationSpi.java:626)
~[ignite-core-2.0.0.jar!/:2.0.0]        at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$2.onMessage(TcpCommunicationSpi.java:342)
~[ignite-core-2.0.0.jar!/:2.0.0]        at org.apache.ignite.internal.util.nio.GridNioFilterChain$TailFilter.onMessageReceived(GridNioFilterChain.java:279)
~[ignite-core-2.0.0.jar!/:2.0.0]        at org.apache.ignite.internal.util.nio.GridNioFilterAdapter.proceedMessageReceived(GridNioFilterAdapter.java:109)
~[ignite-core-2.0.0.jar!/:2.0.0]        at org.apache.ignite.internal.util.nio.GridNioCodecFilter.onMessageReceived(GridNioCodecFilter.java:117)
~[ignite-core-2.0.0.jar!/:2.0.0]        at org.apache.ignite.internal.util.nio.GridNioFilterAdapter.proceedMessageReceived(GridNioFilterAdapter.java:109)
~[ignite-core-2.0.0.jar!/:2.0.0]        at org.apache.ignite.internal.util.nio.GridConnectionBytesVerifyFilter.onMessageReceived(GridConnectionBytesVerifyFilter.java:133)
~[ignite-core-2.0.0.jar!/:2.0.0]        at org.apache.ignite.internal.util.nio.GridNioFilterAdapter.proceedMessageReceived(GridNioFilterAdapter.java:109)
~[ignite-core-2.0.0.jar!/:2.0.0]        at org.apache.ignite.internal.util.nio.GridNioServer$HeadFilter.onMessageReceived(GridNioServer.java:3062)
~[ignite-core-2.0.0.jar!/:2.0.0]        at org.apache.ignite.internal.util.nio.GridNioFilterChain.onMessageReceived(GridNioFilterChain.java:175)
~[ignite-core-2.0.0.jar!/:2.0.0]        at org.apache.ignite.internal.util.nio.GridNioServer$DirectNioClientWorker.processRead(GridNioServer.java:1121)
~[ignite-core-2.0.0.jar!/:2.0.0]        at org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.processSelectedKeysOptimized(GridNioServer.java:2031)
~[ignite-core-2.0.0.jar!/:2.0.0]        at org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.bodyInternal(GridNioServer.java:1868)
~[ignite-core-2.0.0.jar!/:2.0.0]        at org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.body(GridNioServer.java:1573)
~[ignite-core-2.0.0.jar!/:2.0.0]        at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
[ignite-core-2.0.0.jar!/:2.0.0]        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_131]aaron@tophold.com


-- 
Best regards,
Andrey V. Mashenkov



-- 
Best regards,
Andrey V. Mashenkov



-- 
Best regards,
Andrey V. Mashenkov



-- 
Best regards,
Andrey V. Mashenkov

Mime
View raw message