ignite-user mailing list archives

From <zhangshuai.u...@gmail.com>
Subject OOM when using Ignite as HDFS Cache
Date Wed, 12 Apr 2017 09:28:34 GMT
Hi there,

I’d like to use Ignite as an HDFS cache in my cluster, but it fails with an OOM error. Could you review my configuration and help me avoid this?

 

I’m using DUAL_ASYNC mode. The Ignite nodes can find each other and form the cluster.
There are very few changes to default-config.xml, but it is attached for your review. The JVM heap
size is limited to 1GB. Ignite hits an OOM exception when I run the Hadoop benchmark
TestDFSIO writing 4*4GB files. I assume writing a 4GB file to HDFS is a streaming workload, so Ignite
should be able to handle it. It’s acceptable to slow down write performance while Ignite flushes
cached data to HDFS, but not acceptable for it to crash or lose data.
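For context, the IGFS-related part of my setup boils down to something like the sketch below (simplified; the IGFS name and the HDFS URI here are illustrative placeholders, not my exact values):

```xml
<bean class="org.apache.ignite.configuration.IgniteConfiguration">
    <property name="fileSystemConfiguration">
        <list>
            <bean class="org.apache.ignite.configuration.FileSystemConfiguration">
                <!-- IGFS instance name (illustrative). -->
                <property name="name" value="igfs"/>
                <!-- Write-through caching with asynchronous propagation to HDFS. -->
                <property name="defaultMode" value="DUAL_ASYNC"/>
                <!-- Secondary file system pointing at the real HDFS cluster
                     (the URI is a placeholder, not my actual namenode address). -->
                <property name="secondaryFileSystem">
                    <bean class="org.apache.ignite.hadoop.fs.IgniteHadoopIgfsSecondaryFileSystem">
                        <property name="fileSystemFactory">
                            <bean class="org.apache.ignite.hadoop.fs.CachingHadoopFileSystemFactory">
                                <property name="uri" value="hdfs://namenode-host:9000/"/>
                            </bean>
                        </property>
                    </bean>
                </property>
            </bean>
        </list>
    </property>
</bean>
```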

 

The Ignite log is attached as ignite_log.zip; here are some key messages:

 

17/04/12 00:49:17 INFO [grid-timeout-worker-#19%null%] internal.IgniteKernal: 

Metrics for local node (to disable set 'metricsLogFrequency' to 0)

    ^-- Node [id=9b5dcc35, name=null, uptime=00:26:00:254]

    ^-- H/N/C [hosts=173, nodes=173, CPUs=2276]

    ^-- CPU [cur=0.13%, avg=0.82%, GC=0%]

    ^-- Heap [used=555MB, free=43.3%, comm=979MB]

    ^-- Non heap [used=61MB, free=95.95%, comm=62MB]

    ^-- Public thread pool [active=0, idle=0, qSize=0]

    ^-- System thread pool [active=0, idle=6, qSize=0]

    ^-- Outbound messages queue [size=0]

17/04/12 00:50:06 INFO [disco-event-worker-#35%null%] discovery.GridDiscoveryManager: Added
new node to topology: TcpDiscoveryNode [id=553b5c1a-da0b-43cb-b691-b842352b3105, addrs=[0:0:0:0:0:0:0:1,
10.152.133.46, 10.55.68.223, 127.0.0.1, 192.168.1.1], sockAddrs=[BN1APS0A98852E/10.152.133.46:47500,
bn1sch010095221.phx.gbl/10.55.68.223:47500, /0:0:0:0:0:0:0:1:47500, /192.168.1.1:47500, /127.0.0.1:47500],
discPort=47500, order=176, intOrder=175, lastExchangeTime=1491983403106, loc=false, ver=2.0.0#20170405-sha1:2c830b0d,
isClient=false]

[00:50:06] Topology snapshot [ver=176, servers=174, clients=0, CPUs=2288, heap=180.0GB]

...

Exception in thread "igfs-client-worker-2-#585%null%" java.lang.OutOfMemoryError: GC overhead
limit exceeded

  at java.util.Arrays.copyOf(Arrays.java:3332)

  at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:124)

  at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:448)

  at java.lang.StringBuffer.append(StringBuffer.java:270)

  at java.io.StringWriter.write(StringWriter.java:112)

  at java.io.PrintWriter.write(PrintWriter.java:456)

  at java.io.PrintWriter.write(PrintWriter.java:473)

  at java.io.PrintWriter.print(PrintWriter.java:603)

  at java.io.PrintWriter.println(PrintWriter.java:756)

  at java.lang.Throwable$WrappedPrintWriter.println(Throwable.java:764)

  at java.lang.Throwable.printStackTrace(Throwable.java:658)

  at java.lang.Throwable.printStackTrace(Throwable.java:721)

  at org.apache.log4j.DefaultThrowableRenderer.render(DefaultThrowableRenderer.java:60)

  at org.apache.log4j.spi.ThrowableInformation.getThrowableStrRep(ThrowableInformation.java:87)

  at org.apache.log4j.spi.LoggingEvent.getThrowableStrRep(LoggingEvent.java:413)

  at org.apache.log4j.AsyncAppender.append(AsyncAppender.java:162)

  at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:251)

  at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66)

  at org.apache.log4j.Category.callAppenders(Category.java:206)

  at org.apache.log4j.Category.forcedLog(Category.java:391)

  at org.apache.log4j.Category.error(Category.java:322)

  at org.apache.ignite.logger.log4j.Log4JLogger.error(Log4JLogger.java:495)

  at org.apache.ignite.internal.GridLoggerProxy.error(GridLoggerProxy.java:148)

  at org.apache.ignite.internal.util.IgniteUtils.error(IgniteUtils.java:4281)

  at org.apache.ignite.internal.util.IgniteUtils.error(IgniteUtils.java:4306)

  at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:126)

  at java.lang.Thread.run(Thread.java:745)

Exception in thread "LeaseRenewer:hadoop@namenode-vip.yarn3-dev-bn2.bn2.ap.gbl" java.lang.OutOfMemoryError:
GC overhead limit exceeded

Exception in thread "igfs-delete-worker%igfs%9b5dcc35-3a4c-4a90-ac9e-89fdd65302a7%" java.lang.OutOfMemoryError:
GC overhead limit exceeded

Exception in thread "exchange-worker-#39%null%" java.lang.OutOfMemoryError: GC overhead limit
exceeded

…

17/04/12 01:40:10 WARN [disco-event-worker-#35%null%] discovery.GridDiscoveryManager: Stopping
local node according to configured segmentation policy.

 

Looking forward to your help.

Regards,

Shuai Zhang

