giraph-dev mailing list archives

From Darshan Mallenahalli Shankaralingappa <dshankaralinga...@ntent.com>
Subject Re: Problems with running page rank using OutOfCore setting
Date Mon, 17 Jul 2017 10:44:17 GMT
Hi,

I added the -Dgiraph.waitForPerWorkerRequests=true parameter and got the following error:


Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
        at java.util.ArrayList$SubList.listIterator(ArrayList.java:1095)
        at java.util.AbstractList.listIterator(AbstractList.java:299)
        at java.util.ArrayList$SubList.iterator(ArrayList.java:1087)
        at java.util.AbstractCollection.toArray(AbstractCollection.java:180)
        at java.util.regex.Pattern.split(Pattern.java:1241)
        at java.util.regex.Pattern.split(Pattern.java:1273)
        at org.apache.giraph.examples.LongDoubleNullTextInputFormat$LongDoubleNullDoubleVertexReader.getCurrentVertex(LongDoubleNullTextInputFormat.java:86)
        at org.apache.giraph.io.internal.WrappedVertexReader.getCurrentVertex(WrappedVertexReader.java:90)
        at org.apache.giraph.worker.VertexInputSplitsCallable.readInputSplit(VertexInputSplitsCallable.java:182)
        at org.apache.giraph.worker.InputSplitsCallable.loadInputSplit(InputSplitsCallable.java:275)
        at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:227)
        at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:60)
        at org.apache.giraph.utils.LogStacktraceCallable.call(LogStacktraceCallable.java:67)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
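The top frames show the vertex reader regex-splitting every input line, which churns through many short-lived objects during input loading and can drive the GC-overhead limit. A rough Python analogue of that per-line work (the whitespace-separated id-plus-neighbors layout is an assumption inferred from the class name, not taken from the Giraph source):

```python
import re

# Rough sketch of the per-line parsing done by the vertex reader.
# Assumed layout: a long vertex id followed by neighbor ids, separated
# by tabs or spaces; LongDoubleNullTextInputFormat's real layout may differ.
SEPARATOR = re.compile(r"[\t ]")

def parse_line(line):
    tokens = SEPARATOR.split(line.strip())
    vertex_id = int(tokens[0])
    neighbors = [int(t) for t in tokens[1:]]
    return vertex_id, neighbors

vid, nbrs = parse_line("42\t7 13 99")
print(vid, nbrs)  # 42 [7, 13, 99]
```

Each line produces a token array plus one object per neighbor, so with billions of edges the input phase alone generates enormous allocation pressure.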



So, does this mean that there is no other solution but to increase the physical memory?
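For context on why input loading can exhaust the heap even with out-of-core enabled, a back-of-envelope estimate is useful. The ~100 bytes-per-vertex figure below is purely an assumption for illustration, not a measured Giraph number; real per-vertex cost depends on the vertex, value, and edge types used.

```python
# Back-of-envelope heap estimate for holding a large graph in memory.
# bytes_per_vertex=100 is an illustrative assumption, not a Giraph figure.
def estimated_heap_gb(num_vertices, bytes_per_vertex=100):
    return num_vertices * bytes_per_vertex / (1024 ** 3)

# 3.5 billion vertices at the assumed 100 bytes each:
need = estimated_heap_gb(3_500_000_000)
print(round(need))  # roughly 326 GB, above the cluster's 225 GB total
```

Under that assumption the naive in-memory footprint already exceeds the cluster's total RAM, which is exactly the situation the out-of-core mechanism is meant to handle by spilling partitions to disk.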


Cheers,

Darshan


On 16 Jul 2017, at 23:04, Hassan Eslami <hsn.eslami@gmail.com> wrote:

Hi,

giraph.useOutOfCoreMessages is no longer in use.

The main problem here is that you are using the default flow control mechanism
(NoOpFlowControl), which allows a large number of outstanding/received messages.
As a consequence, memory fills up very quickly and the job fails for various
reasons. Please use the following options instead:

-Dgiraph.isStaticGraph=false -Dgiraph.useOutOfCoreGraph=true
-Dgiraph.waitForPerWorkerRequests=true

Note: the static-graph option has a known bug when combined with the out-of-core mechanism, which is why it must be disabled here.

Hope it helps,
Hassan
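Concretely, applying these flags to the invocation quoted below would look like the following (a sketch: heap size, formats, and paths are unchanged from the original command; the trailing arguments are elided):

```shell
# Sketch: the original PageRank invocation with the deprecated
# giraph.useOutOfCoreMessages dropped and the recommended options
# swapped in (everything else as in the original command).
yarn jar giraph-examples-1.2.0-for-hadoop-2.6.0-jar-with-dependencies.jar \
  org.apache.giraph.GiraphRunner \
  -Dgiraph.yarn.task.heap.mb=58880 \
  -Dgiraph.isStaticGraph=false \
  -Dgiraph.useOutOfCoreGraph=true \
  -Dgiraph.waitForPerWorkerRequests=true \
  org.apache.giraph.examples.PageRankComputation \
  ...  # remaining -vif/-vip/-vof/-op/-w/-mc/-wc/-ca arguments as before
```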

On Sun, Jul 16, 2017 at 1:54 PM, Darshan Mallenahalli Shankaralingappa <dshankaralingappa@ntent.com> wrote:

Hi,

I am trying to run the PageRank algorithm with Giraph on a 3.5-billion-node
web graph on a relatively small Hadoop cluster (6 nodes, 225 GB RAM in total).
I set giraph.useOutOfCoreGraph and giraph.useOutOfCoreMessages to true,
and the application was killed after some time.

I am running the giraph job using this command:

yarn jar giraph-examples-1.2.0-for-hadoop-2.6.0-jar-with-dependencies.jar \
  org.apache.giraph.GiraphRunner \
  -Dgiraph.yarn.task.heap.mb=58880 \
  -Dgiraph.isStaticGraph=true \
  -Dgiraph.useOutOfCoreGraph=true \
  -Dgiraph.useOutOfCoreMessages=true \
  org.apache.giraph.examples.PageRankComputation \
  -vif org.apache.giraph.examples.LongDoubleNullTextInputFormat \
  -vip /user/darshan/AdjList/ \
  -vof org.apache.giraph.io.formats.IdWithValueTextOutputFormat \
  -op /user/darshan/giraph_3.5B_ooc/ \
  -w 8 \
  -mc org.apache.giraph.examples.RandomWalkVertexMasterCompute \
  -wc org.apache.giraph.examples.RandomWalkWorkerContext \
  -ca org.apache.giraph.examples.RandomWalkVertex.teleportationProbability=0.15f \
  -ca org.apache.giraph.examples.RandomWalkVertex.maxSupersteps=21

Here is a log from the zookeeper:

2017-07-12 08:08:35,026 WARN [netty-client-worker-1]
org.apache.giraph.comm.netty.handler.ResponseClientHandler:
exceptionCaught: Channel failed with remote address
hdpbcn-01.lv.ntent.com/10.100.21.118:30006

java.lang.ArrayIndexOutOfBoundsException: 1075052547
       at org.apache.giraph.comm.flow_control.NoOpFlowControl.getAckSignalFlag(NoOpFlowControl.java:52)
       at org.apache.giraph.comm.netty.NettyClient.messageReceived(NettyClient.java:796)
       at org.apache.giraph.comm.netty.handler.ResponseClientHandler.channelRead(ResponseClientHandler.java:87)
       at io.netty.channel.DefaultChannelHandlerContext.invokeChannelRead(DefaultChannelHandlerContext.java:338)
       at io.netty.channel.DefaultChannelHandlerContext.fireChannelRead(DefaultChannelHandlerContext.java:324)
       at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:153)
       at io.netty.channel.DefaultChannelHandlerContext.invokeChannelRead(DefaultChannelHandlerContext.java:338)
       at io.netty.channel.DefaultChannelHandlerContext.fireChannelRead(DefaultChannelHandlerContext.java:324)
       at org.apache.giraph.comm.netty.InboundByteCounter.channelRead(InboundByteCounter.java:74)
       at io.netty.channel.DefaultChannelHandlerContext.invokeChannelRead(DefaultChannelHandlerContext.java:338)
       at io.netty.channel.DefaultChannelHandlerContext.fireChannelRead(DefaultChannelHandlerContext.java:324)
       at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:785)
       at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:126)
       at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:485)
       at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:452)
       at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:346)
       at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:101)
       at java.lang.Thread.run(Thread.java:745)


I think this issue is related to the messaging stack rather than the
algorithm itself. Either way, could someone please help me with this, or at
least point me in the right direction?

Cheers,
Darshan

