ignite-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "aaron@tophold.com" <aa...@tophold.com>
Subject Re: Re: TcpCommunicationSpi - Caught unhandled exception in NIO worker thread (restart the node) java.lang.OutOfMemoryError: Java heap space
Date Fri, 21 Jul 2017 07:58:27 GMT
Hi Andrey

Thanks!  I set the IPV4 = true when JVM bring up.  Other logs and configuration as attached.


conf.zip: all notes's configuration 
log.zip include the log for some node,   price: which feed the price to the grid, frequent
may hundreds or thousands/seconds. Price cache with  node filter only nodes care price, like
the transaction node. 
But now in fact no transaction loaded. so no much data in the grid only hundreds. 
the web is totally a client node, nothing configured as default. 
top.json,   topology of the entire grid
IP finder is by JDBC way. 

The log may reach 10+G several mins... 

From log GridDhtLocalPartition  & GridTimeoutProcessor occupy a lot message changes

SPI:

TcpDiscoverySpi spi = new TcpDiscoverySpi();
spi.setJoinTimeout(60_1000l * 5);
spi.setNetworkTimeout(3000);

MemoryEventStorageSpi spi = new MemoryEventStorageSpi();
spi.setExpireAgeMs(60 * 60_1000l); //1hrs


TcpCommunicationSpi spi = new TcpCommunicationSpi();
if (!StringUtils.isEmpty(localAddress)) {
System.out.print("Local address overwrite to: " + localAddress);
spi.setLocalAddress(localAddress);
}

spi.setConnectTimeout(3000);
spi.setMessageQueueLimit(1000);
spi.setSlowClientQueueLimit(500);
cfg.setCommunicationSpi(spi);


JVM

java -Xms4g -Xmx4g -server -XX:+AggressiveOpts -XX:MaxMetaspaceSize=256m -XX:+UseG1GC -XX:MaxGCPauseMillis=20
-XX:InitiatingHeapOccupancyPercent=35 -XX:+DisableExplicitGC -Xloggc:/var/tmp/qa/query/1500619508_6525/query-gc.log
-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps  -DLOCAL_ADDRESS=10.31.23.18
 -Djava.awt.headless=true -DFIND_STRATEGY=jdbc -DIGNITE_THREAD_DUMP_ON_EXCHANGE_TIMEOUT=true
-Duser.timezone=UTC -Djava.net.preferIPv4Stack=true -Dapp.instance.cnt=1500619508_6525 -Dapp.instance=query
-Dapp.env=qa -Dapp.user=th_qa -Dtemp.dir=/var/tmp/qa/query/1500619508_6525 -DMEM_POLICY=xl

When run in local I find a lot those exception, will need to specific the work directory cfg.setWorkDirectory();
for each node if they are kick off on same path?

MarshallerMappingFileStore - Failed to write class name to file [platformId=0id=-556541136,
clsName=com.tophold.trade.ignite.service.CommandRemoteService, file=D:\workspace\code\trade_engine\work\marshaller\-556541136.classname0]
java.io.FileNotFoundException: D:\workspace\code\trade_engine\work\marshaller\-556541136.classname0

at java.io.FileOutputStream.open0(Native Method) ~[?:1.8.0_102]
at java.io.FileOutputStream.open(FileOutputStream.java:270) ~[?:1.8.0_102]
at java.io.FileOutputStream.<init>(FileOutputStream.java:213) ~[?:1.8.0_102]
at java.io.FileOutputStream.<init>(FileOutputStream.java:162) ~[?:1.8.0_102]
at org.apache.ignite.internal.MarshallerMappingFileStore.writeMapping(MarshallerMappingFileStore.java:79)
[ignite-core-2.0.0.jar:2.0.0]
at org.apache.ignite.internal.MappingStoreTask.run(MappingStoreTask.java:57) [ignite-core-2.0.0.jar:2.0.0]
at org.apache.ignite.internal.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:6677) [ignite-core-2.0.0.jar:2.0.0]
at org.apache.ignite.internal.processors.closure.GridClosureProcessor$1.body(GridClosureProcessor.java:812)
[ignite-core-2.0.0.jar:2.0.0]
at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110) [ignite-core-2.0.0.jar:2.0.0]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_102]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_102]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_102]


Thanks again for your time very appreciated!!


Thanks
Aaron


aaron@tophold.com
 
From: Andrey Mashenkov
Date: 2017-07-20 22:36
To: user
Subject: Re: Re: TcpCommunicationSpi - Caught unhandled exception in NIO worker thread (restart
the node) java.lang.OutOfMemoryError: Java heap space
Hi,

Is it possible nodes can't see each other via communication ports or use different IP protocol
(IPv4 and IPv6)?
Can you share grid configuration and logs?

On Thu, Jul 20, 2017 at 7:19 AM, aaron@tophold.com <aaron@tophold.com> wrote:
Thanks Andrew, But really Ignite have so many communication message between nodes. 
I know default the includeEventTypes is empty, but if I open the debug model, still saw a
lot information changes between nodes. there are so many even I can see a lot Timeout even
we just keep a empty grid running...

[DEBUG] 2017-07-20 02:05:16.894 [grid-timeout-worker-#23%WebIgnite%] [ig] GridTimeoutProcessor
- Timeout has occurred: GridCommunicationMessageSet [nodeId=eedf67fb-6a01-4eb3-b4ac-e23446181ca9,
endTime=1500516316887, timeoutId=24a304d5d51-49983eab-7747-47a8-912d-d06b45404385, topic=T4
[topic=TOPIC_CACHE, id1=5dc785b4-dd3d-3c3b-b270-c5fe2d7ed9a2, id2=e90fb911-c52d-440a-a651-8f624ac7ece5,
id3=1], plc=2, msgs=[], reserved=false, timeout=5000, skipOnTimeout=true, lastTs=1500508078116]



Regards
Aaron


aaron@tophold.com
 
From: Andrey Mashenkov
Date: 2017-07-20 00:12
To: user
Subject: Re: TcpCommunicationSpi - Caught unhandled exception in NIO worker thread (restart
the node) java.lang.OutOfMemoryError: Java heap space
Hi,

From ignite-2.0, TcpCommunicationSpi has messageQueueLimit=0.
That means outbound message queue is unlimited. You should see a warning in logs on node startup.

Try to set messageQueueLimit to smth >0.
It was 1024 by default in previous versions of ignite.

On Wed, Jul 19, 2017 at 12:56 PM, aaron@tophold.com <aaron@tophold.com> wrote:
Hi All, 


We have instance , in fact no operation and  data exist yet, but the memory usage keep going
on until to 6G~

From the heap dump most objects are GridNioRecoveryDescriptor take 4G; is this expected? 
 my configuration almost same as the default configuration xml + a cache with CacheJdbcPojoStore;

while in fact there are only hundreds of data in DB now.  we keep get OOM exception as below,
then the node just quite from the cluster.  thanks for your time!!!








[ERROR] 2017-07-19 09:50:51.171 [grid-nio-worker-tcp-comm-4-#29%PortfolioEventIgnite%] [ig]
TcpCommunicationSpi - Caught unhandled exception in NIO worker thread (restart the node).
java.lang.OutOfMemoryError: Java heap space
        at java.util.ArrayDeque.allocateElements(ArrayDeque.java:142) ~[?:1.8.0_131]
        at java.util.ArrayDeque.<init>(ArrayDeque.java:198) ~[?:1.8.0_131]
        at org.apache.ignite.internal.util.nio.GridNioRecoveryDescriptor.<init>(GridNioRecoveryDescriptor.java:99)
~[ignite-core-2.0.0.jar!/:2.0.0]
        at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.recoveryDescriptor(TcpCommunicationSpi.java:3534)
~[ignite-core-2.0.0.jar!/:2.0.0]
        at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.inRecoveryDescriptor(TcpCommunicationSpi.java:3493)
~[ignite-core-2.0.0.jar!/:2.0.0]
        at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.access$1100(TcpCommunicationSpi.java:241)
~[ignite-core-2.0.0.jar!/:2.0.0]
        at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$2.onFirstMessage(TcpCommunicationSpi.java:537)
~[ignite-core-2.0.0.jar!/:2.0.0]
        at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$2.onMessage(TcpCommunicationSpi.java:626)
~[ignite-core-2.0.0.jar!/:2.0.0]
        at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$2.onMessage(TcpCommunicationSpi.java:342)
~[ignite-core-2.0.0.jar!/:2.0.0]
        at org.apache.ignite.internal.util.nio.GridNioFilterChain$TailFilter.onMessageReceived(GridNioFilterChain.java:279)
~[ignite-core-2.0.0.jar!/:2.0.0]
        at org.apache.ignite.internal.util.nio.GridNioFilterAdapter.proceedMessageReceived(GridNioFilterAdapter.java:109)
~[ignite-core-2.0.0.jar!/:2.0.0]
        at org.apache.ignite.internal.util.nio.GridNioCodecFilter.onMessageReceived(GridNioCodecFilter.java:117)
~[ignite-core-2.0.0.jar!/:2.0.0]
        at org.apache.ignite.internal.util.nio.GridNioFilterAdapter.proceedMessageReceived(GridNioFilterAdapter.java:109)
~[ignite-core-2.0.0.jar!/:2.0.0]
        at org.apache.ignite.internal.util.nio.GridConnectionBytesVerifyFilter.onMessageReceived(GridConnectionBytesVerifyFilter.java:133)
~[ignite-core-2.0.0.jar!/:2.0.0]
        at org.apache.ignite.internal.util.nio.GridNioFilterAdapter.proceedMessageReceived(GridNioFilterAdapter.java:109)
~[ignite-core-2.0.0.jar!/:2.0.0]
        at org.apache.ignite.internal.util.nio.GridNioServer$HeadFilter.onMessageReceived(GridNioServer.java:3062)
~[ignite-core-2.0.0.jar!/:2.0.0]
        at org.apache.ignite.internal.util.nio.GridNioFilterChain.onMessageReceived(GridNioFilterChain.java:175)
~[ignite-core-2.0.0.jar!/:2.0.0]
        at org.apache.ignite.internal.util.nio.GridNioServer$DirectNioClientWorker.processRead(GridNioServer.java:1121)
~[ignite-core-2.0.0.jar!/:2.0.0]
        at org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.processSelectedKeysOptimized(GridNioServer.java:2031)
~[ignite-core-2.0.0.jar!/:2.0.0]
        at org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.bodyInternal(GridNioServer.java:1868)
~[ignite-core-2.0.0.jar!/:2.0.0]
        at org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.body(GridNioServer.java:1573)
[ignite-core-2.0.0.jar!/:2.0.0]
        at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110) [ignite-core-2.0.0.jar!/:2.0.0]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_131]
[ERROR] 2017-07-19 09:50:51.171 [grid-nio-worker-tcp-comm-4-#29%PortfolioEventIgnite%] [ig]
TcpCommunicationSpi - Runtime error caught during grid runnable execution: DirectNioClientWorker
[super=Abstr
actNioClientWorker [selector=sun.nio.ch.EPollSelectorImpl@11c850c6, idx=4, bytesRcvd=42, bytesSent=18,
bytesRcvd0=42, bytesSent0=18, select=false, super=GridWorker [name=grid-nio-worker-tcp-comm-4,
ign
iteInstanceName=PortfolioEventIgnite, finished=false, hashCode=1422574651, interrupted=false,
runner=grid-nio-worker-tcp-comm-4-#29%PortfolioEventIgnite%]]]
java.lang.OutOfMemoryError: Java heap space
        at java.util.ArrayDeque.allocateElements(ArrayDeque.java:142) ~[?:1.8.0_131]
        at java.util.ArrayDeque.<init>(ArrayDeque.java:198) ~[?:1.8.0_131]
        at org.apache.ignite.internal.util.nio.GridNioRecoveryDescriptor.<init>(GridNioRecoveryDescriptor.java:99)
~[ignite-core-2.0.0.jar!/:2.0.0]
        at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.recoveryDescriptor(TcpCommunicationSpi.java:3534)
~[ignite-core-2.0.0.jar!/:2.0.0]
        at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.inRecoveryDescriptor(TcpCommunicationSpi.java:3493)
~[ignite-core-2.0.0.jar!/:2.0.0]
        at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.access$1100(TcpCommunicationSpi.java:241)
~[ignite-core-2.0.0.jar!/:2.0.0]
        at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$2.onFirstMessage(TcpCommunicationSpi.java:537)
~[ignite-core-2.0.0.jar!/:2.0.0]
        at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$2.onMessage(TcpCommunicationSpi.java:626)
~[ignite-core-2.0.0.jar!/:2.0.0]
        at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$2.onMessage(TcpCommunicationSpi.java:342)
~[ignite-core-2.0.0.jar!/:2.0.0]
        at org.apache.ignite.internal.util.nio.GridNioFilterChain$TailFilter.onMessageReceived(GridNioFilterChain.java:279)
~[ignite-core-2.0.0.jar!/:2.0.0]
        at org.apache.ignite.internal.util.nio.GridNioFilterAdapter.proceedMessageReceived(GridNioFilterAdapter.java:109)
~[ignite-core-2.0.0.jar!/:2.0.0]
        at org.apache.ignite.internal.util.nio.GridNioCodecFilter.onMessageReceived(GridNioCodecFilter.java:117)
~[ignite-core-2.0.0.jar!/:2.0.0]
        at org.apache.ignite.internal.util.nio.GridNioFilterAdapter.proceedMessageReceived(GridNioFilterAdapter.java:109)
~[ignite-core-2.0.0.jar!/:2.0.0]
        at org.apache.ignite.internal.util.nio.GridConnectionBytesVerifyFilter.onMessageReceived(GridConnectionBytesVerifyFilter.java:133)
~[ignite-core-2.0.0.jar!/:2.0.0]
        at org.apache.ignite.internal.util.nio.GridNioFilterAdapter.proceedMessageReceived(GridNioFilterAdapter.java:109)
~[ignite-core-2.0.0.jar!/:2.0.0]
        at org.apache.ignite.internal.util.nio.GridNioServer$HeadFilter.onMessageReceived(GridNioServer.java:3062)
~[ignite-core-2.0.0.jar!/:2.0.0]
        at org.apache.ignite.internal.util.nio.GridNioFilterChain.onMessageReceived(GridNioFilterChain.java:175)
~[ignite-core-2.0.0.jar!/:2.0.0]
        at org.apache.ignite.internal.util.nio.GridNioServer$DirectNioClientWorker.processRead(GridNioServer.java:1121)
~[ignite-core-2.0.0.jar!/:2.0.0]
        at org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.processSelectedKeysOptimized(GridNioServer.java:2031)
~[ignite-core-2.0.0.jar!/:2.0.0]
        at org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.bodyInternal(GridNioServer.java:1868)
~[ignite-core-2.0.0.jar!/:2.0.0]
        at org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.body(GridNioServer.java:1573)
~[ignite-core-2.0.0.jar!/:2.0.0]
        at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110) [ignite-core-2.0.0.jar!/:2.0.0]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_131]


aaron@tophold.com



-- 
Best regards,
Andrey V. Mashenkov



-- 
Best regards,
Andrey V. Mashenkov
Mime
View raw message