giraph-user mailing list archives

From Hai Lan <lanhai1...@gmail.com>
Subject Re: OutOfMemoryError: Java heap space during Large graph running
Date Sun, 23 Oct 2016 12:36:27 GMT
Hi Agrta,

Thanks for your response. How exactly can I increase the min and max
RAM (in which conf file, or with which command or arguments)? My
giraph-site.xml is empty by default.

From what I have read online about increasing the heap size (I am not sure
this is the same thing as the min and max RAM size you mentioned), many
people suggest increasing:
mapred.child.java.opts OR HADOOP_DATANODE_OPTS

But neither of them helps. My problem happens during
"VertexInputSplitsCallable: readVertexInputSplit:", so I tried increasing
mapreduce.map.memory.mb and decreasing the number of containers/workers.
Currently I'm using 248 workers with mapreduce.map.memory.mb=12000 and
ratio=0.7. This helps, but I now face new problems:

1. Superstep -1 is extremely slow; it takes 7-8 hours to load a 150G
graph, e.g.:
org.apache.giraph.master.BspServiceMaster: barrierOnWorkerList: 106 out of
248 workers finished on superstep -1 on path
/_hadoopBsp/job_1477020594559_0012/_vertexInputSplitDoneDir

I see log entries like:
INFO [main] org.apache.giraph.comm.netty.NettyClient:
logInfoAboutOpenRequests: Waiting interval of 15000 msecs, 2499 open
requests, waiting for it to be <= 0, MBytes/sec received = 0.0001,
MBytesReceived = 0.0058, ave received req MBytes = 0, secs waited = 92.12
MBytes/sec sent = 10.4373, MBytesSent = 961.4983, ave sent req MBytes =
0.3244, secs waited = 92.12

Finishing those 2499 open requests takes a very long time. *I'm not sure
whether this is normal.*

2. I tried the out-of-core graph option, but I'm not sure I'm using it
correctly. I added -Dgiraph.useOutOfCoreGraph=true -ca
isStaticGraph=true,giraph.maxPartitionsInMemory=10.
But how do I know whether it is working?
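
As a rough sanity check on the numbers in item 1, the quoted figures can be turned into rates. This is only a back-of-the-envelope sketch under stated assumptions: "150G" is read as 150 GiB of input, 7.5 hours is taken as the midpoint of the 7-8 hour range, the load is assumed to be spread evenly over the 248 workers, and each of the 2499 open requests is assumed to be about the logged average size (0.3244 MB). The function names are hypothetical and are not part of Giraph.

```python
def aggregate_load_rate_mib_s(graph_gib: float, hours: float) -> float:
    """Average input rate of the whole job, in MiB per second."""
    return graph_gib * 1024 / (hours * 3600)


def drain_time_s(open_requests: int, avg_request_mb: float,
                 send_rate_mb_s: float) -> float:
    """Seconds needed to send the outstanding requests at a steady rate."""
    return open_requests * avg_request_mb / send_rate_mb_s


# Figures taken from the message: 150G in 7-8 hours on 248 workers.
total_rate = aggregate_load_rate_mib_s(150, 7.5)
per_worker_rate = total_rate / 248

# Figures taken from the NettyClient log line.
drain = drain_time_s(2499, 0.3244, 10.4373)

print(f"aggregate: {total_rate:.2f} MiB/s")
print(f"per worker: {per_worker_rate:.4f} MiB/s")
print(f"open requests drained in about {drain:.0f} s at the logged send rate")
```

If these assumptions hold, the logged send rate of one worker is much higher than the per-worker average implied by a 7-8 hour load. That is an inference from the arithmetic, not something the log states directly.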

I suspect the problem will get worse when I try the 15T graph. What should
I do?
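
For reference, the heap implied by the settings mentioned above can be worked out by hand. This is a minimal sketch, assuming that "ratio" refers to Hadoop's mapreduce.job.heap.memory-mb.ratio and that, when the task Java options contain no -Xmx, Hadoop sets the heap to the container size times that ratio, rounded up. The helper name is hypothetical. The 3584 MB container and the 0.8 ratio in the second call are guesses, chosen only because they reproduce the -Xmx2868m line in the quoted log below; they are not values stated in this thread.

```python
import math


def derived_heap_mb(container_mb: int, heap_ratio: float) -> int:
    """Heap size in MB derived from the container size, rounded up."""
    return math.ceil(container_mb * heap_ratio)


# Settings from this message: mapreduce.map.memory.mb=12000, ratio=0.7.
current_heap = derived_heap_mb(12000, 0.7)

# Assumed values (not from the thread) that match the -Xmx2868m log line.
earlier_heap = derived_heap_mb(3584, 0.8)

print(f"-Xmx{current_heap}m")
print(f"-Xmx{earlier_heap}m")
```

Under these assumptions the current settings correspond to a heap of 8400 MB per map task, and the earlier -Xmx2868m is consistent with a 3584 MB container at a ratio of 0.8.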

Thanks for your help.

Best,
Hai


On Sun, Oct 23, 2016 at 7:11 AM, Agrta Rawat <agrta.rawat@gmail.com> wrote:

> Hi Hai,
>
> Please check your giraph configurations. Try increasing min and max RAM
> size in your configurations.
> This should help.
>
> Regards,
> Agrta Rawat
>
>
> On Sat, Oct 22, 2016 at 7:46 PM, Hai Lan <lanhai1988@gmail.com> wrote:
>
>> Can anyone help with this?
>>
>> Thanks a lot!
>>
>>
>> On Thu, Oct 20, 2016 at 9:48 PM, Hai Lan <lanhai1988@gmail.com> wrote:
>>
>>> Dear all,
>>>
>>> I'm facing a problem when I run a large graph job (currently 1.6T, later
>>> 16T): it always fails with java.lang.OutOfMemoryError: Java heap
>>> space after loading a certain number of vertices (around 59000000). I
>>> tried adding options like:
>>> -Dgiraph.useOutOfCoreGraph=true
>>>  -Dmapred.child.java.opts="-XX:-UseGCOverheadLimit" OR
>>> -Dmapred.child.java.opts="-Xmx16384"
>>>  -Dgiraph.yarn.task.heap.mb=36570
>>>
>>> but the problem remains, even though I can see those values shown in
>>> the Metadata.
>>>
>>> I'm not sure whether the max memory value in this
>>> VertexInputSplitsCallable info is related to the Java heap size.
>>> INFO [load-0] org.apache.giraph.worker.VertexInputSplitsCallable:
>>> readVertexInputSplit: Loaded 46975802 vertices at 68977.49310291892
>>> vertices/sec 0 edges at 0.0 edges/sec Memory (free/total/max) = 475.08M /
>>> 2759.00M / 2759.00M
>>>
>>> But I noticed that the main log *always* shows:
>>> INFO [AsyncDispatcher event handler] org.apache.hadoop.mapred.JobConf:
>>> Task java-opts do not specify heap size. Setting task attempt jvm max heap
>>> size to -Xmx2868m
>>> *no matter what arguments I add*, even when I run normal Hadoop jobs.
>>>
>>> Any ideas about this? The log follows.
>>>
>>> 2016-10-20 21:25:49,008 ERROR [netty-client-worker-2] org.apache.giraph.comm.netty.NettyClient: Request failed
>>> java.lang.OutOfMemoryError: Java heap space
>>> at io.netty.buffer.UnpooledHeapByteBuf.<init>(UnpooledHeapByteBuf.java:45)
>>> at io.netty.buffer.UnpooledByteBufAllocator.newHeapBuffer(UnpooledByteBufAllocator.java:43)
>>> at io.netty.buffer.AbstractByteBufAllocator.heapBuffer(AbstractByteBufAllocator.java:136)
>>> at io.netty.buffer.AbstractByteBufAllocator.heapBuffer(AbstractByteBufAllocator.java:127)
>>> at io.netty.buffer.AbstractByteBufAllocator.buffer(AbstractByteBufAllocator.java:85)
>>> at org.apache.giraph.comm.netty.handler.RequestEncoder.write(RequestEncoder.java:81)
>>> at io.netty.channel.DefaultChannelHandlerContext.invokeWrite(DefaultChannelHandlerContext.java:645)
>>> at io.netty.channel.DefaultChannelHandlerContext.access$2000(DefaultChannelHandlerContext.java:29)
>>> at io.netty.channel.DefaultChannelHandlerContext$WriteTask.run(DefaultChannelHandlerContext.java:906)
>>> at io.netty.util.concurrent.DefaultEventExecutor.run(DefaultEventExecutor.java:36)
>>> at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:101)
>>> at java.lang.Thread.run(Thread.java:745)
>>> 2016-10-20 21:25:55,299 ERROR [netty-client-worker-1] org.apache.giraph.comm.netty.NettyClient: Request failed
>>> java.lang.OutOfMemoryError: Java heap space
>>> at io.netty.buffer.UnpooledHeapByteBuf.<init>(UnpooledHeapByteBuf.java:45)
>>> at io.netty.buffer.UnpooledByteBufAllocator.newHeapBuffer(UnpooledByteBufAllocator.java:43)
>>> at io.netty.buffer.AbstractByteBufAllocator.heapBuffer(AbstractByteBufAllocator.java:136)
>>> at io.netty.buffer.AbstractByteBufAllocator.heapBuffer(AbstractByteBufAllocator.java:127)
>>> at io.netty.buffer.AbstractByteBufAllocator.buffer(AbstractByteBufAllocator.java:85)
>>> at org.apache.giraph.comm.netty.handler.RequestEncoder.write(RequestEncoder.java:81)
>>> at io.netty.channel.DefaultChannelHandlerContext.invokeWrite(DefaultChannelHandlerContext.java:645)
>>> at io.netty.channel.DefaultChannelHandlerContext.access$2000(DefaultChannelHandlerContext.java:29)
>>> at io.netty.channel.DefaultChannelHandlerContext$WriteTask.run(DefaultChannelHandlerContext.java:906)
>>> at io.netty.util.concurrent.DefaultEventExecutor.run(DefaultEventExecutor.java:36)
>>> at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:101)
>>> at java.lang.Thread.run(Thread.java:745)
>>> 2016-10-20 21:26:06,731 ERROR [main] org.apache.giraph.graph.GraphMapper: Caught an unrecoverable exception waitFor: ExecutionException occurred while waiting for org.apache.giraph.utils.ProgressableUtils$FutureWaitable@6737a445
>>> java.lang.IllegalStateException: waitFor: ExecutionException occurred while waiting for org.apache.giraph.utils.ProgressableUtils$FutureWaitable@6737a445
>>> at org.apache.giraph.utils.ProgressableUtils.waitFor(ProgressableUtils.java:193)
>>> at org.apache.giraph.utils.ProgressableUtils.waitForever(ProgressableUtils.java:151)
>>> at org.apache.giraph.utils.ProgressableUtils.waitForever(ProgressableUtils.java:136)
>>> at org.apache.giraph.utils.ProgressableUtils.getFutureResult(ProgressableUtils.java:99)
>>> at org.apache.giraph.utils.ProgressableUtils.getResultsWithNCallables(ProgressableUtils.java:233)
>>> at org.apache.giraph.worker.BspServiceWorker.loadInputSplits(BspServiceWorker.java:316)
>>> at org.apache.giraph.worker.BspServiceWorker.loadVertices(BspServiceWorker.java:409)
>>> at org.apache.giraph.worker.BspServiceWorker.setup(BspServiceWorker.java:629)
>>> at org.apache.giraph.graph.GraphTaskManager.execute(GraphTaskManager.java:284)
>>> at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:93)
>>> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
>>> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>>> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
>>> at java.security.AccessController.doPrivileged(Native Method)
>>> at javax.security.auth.Subject.doAs(Subject.java:415)
>>> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
>>> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
>>> Caused by: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: Java heap space
>>> at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>>> at java.util.concurrent.FutureTask.get(FutureTask.java:202)
>>> at org.apache.giraph.utils.ProgressableUtils$FutureWaitable.waitFor(ProgressableUtils.java:312)
>>> at org.apache.giraph.utils.ProgressableUtils.waitFor(ProgressableUtils.java:185)
>>> ... 16 more
>>> Caused by: java.lang.OutOfMemoryError: Java heap space
>>> at org.apache.giraph.utils.UnsafeByteArrayOutputStream.<init>(UnsafeByteArrayOutputStream.java:81)
>>> at org.apache.giraph.conf.ImmutableClassesGiraphConfiguration.createExtendedDataOutput(ImmutableClassesGiraphConfiguration.java:1161)
>>> at org.apache.giraph.comm.SendPartitionCache.addVertex(SendPartitionCache.java:77)
>>> at org.apache.giraph.comm.netty.NettyWorkerClientRequestProcessor.sendVertexRequest(NettyWorkerClientRequestProcessor.java:248)
>>> at org.apache.giraph.worker.VertexInputSplitsCallable.readInputSplit(VertexInputSplitsCallable.java:231)
>>> at org.apache.giraph.worker.InputSplitsCallable.loadInputSplit(InputSplitsCallable.java:267)
>>> at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:211)
>>> at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:60)
>>> at org.apache.giraph.utils.LogStacktraceCallable.call(LogStacktraceCallable.java:51)
>>> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>>> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>> at java.lang.Thread.run(Thread.java:745)
>>> 2016-10-20 21:26:06,737 ERROR [main] org.apache.giraph.worker.BspServiceWorker: unregisterHealth: Got failure, unregistering health on /_hadoopBsp/job_1476386340018_0175/_applicationAttemptsDir/0/_superstepDir/-1/_workerHealthyDir/hadoop18.umd.com_23 on superstep -1
>>> 2016-10-20 21:26:06,746 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.lang.IllegalStateException: run: Caught an unrecoverable exception waitFor: ExecutionException occurred while waiting for org.apache.giraph.utils.ProgressableUtils$FutureWaitable@6737a445
>>> at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:104)
>>> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
>>> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>>> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
>>> at java.security.AccessController.doPrivileged(Native Method)
>>> at javax.security.auth.Subject.doAs(Subject.java:415)
>>> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
>>> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
>>> Caused by: java.lang.IllegalStateException: waitFor: ExecutionException occurred while waiting for org.apache.giraph.utils.ProgressableUtils$FutureWaitable@6737a445
>>> at org.apache.giraph.utils.ProgressableUtils.waitFor(ProgressableUtils.java:193)
>>> at org.apache.giraph.utils.ProgressableUtils.waitForever(ProgressableUtils.java:151)
>>> at org.apache.giraph.utils.ProgressableUtils.waitForever(ProgressableUtils.java:136)
>>> at org.apache.giraph.utils.ProgressableUtils.getFutureResult(ProgressableUtils.java:99)
>>> at org.apache.giraph.utils.ProgressableUtils.getResultsWithNCallables(ProgressableUtils.java:233)
>>> at org.apache.giraph.worker.BspServiceWorker.loadInputSplits(BspServiceWorker.java:316)
>>> at org.apache.giraph.worker.BspServiceWorker.loadVertices(BspServiceWorker.java:409)
>>> at org.apache.giraph.worker.BspServiceWorker.setup(BspServiceWorker.java:629)
>>> at org.apache.giraph.graph.GraphTaskManager.execute(GraphTaskManager.java:284)
>>> at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:93)
>>> ... 7 more
>>> Caused by: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: Java heap space
>>> at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>>> at java.util.concurrent.FutureTask.get(FutureTask.java:202)
>>> at org.apache.giraph.utils.ProgressableUtils$FutureWaitable.waitFor(ProgressableUtils.java:312)
>>> at org.apache.giraph.utils.ProgressableUtils.waitFor(ProgressableUtils.java:185)
>>> ... 16 more
>>> Caused by: java.lang.OutOfMemoryError: Java heap space
>>> at org.apache.giraph.utils.UnsafeByteArrayOutputStream.<init>(UnsafeByteArrayOutputStream.java:81)
>>> at org.apache.giraph.conf.ImmutableClassesGiraphConfiguration.createExtendedDataOutput(ImmutableClassesGiraphConfiguration.java:1161)
>>> at org.apache.giraph.comm.SendPartitionCache.addVertex(SendPartitionCache.java:77)
>>> at org.apache.giraph.comm.netty.NettyWorkerClientRequestProcessor.sendVertexRequest(NettyWorkerClientRequestProcessor.java:248)
>>> at org.apache.giraph.worker.VertexInputSplitsCallable.readInputSplit(VertexInputSplitsCallable.java:231)
>>> at org.apache.giraph.worker.InputSplitsCallable.loadInputSplit(InputSplitsCallable.java:267)
>>> at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:211)
>>> at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:60)
>>> at org.apache.giraph.utils.LogStacktraceCallable.call(LogStacktraceCallable.java:51)
>>> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>>> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>> at java.lang.Thread.run(Thread.java:745)
>>>
>>>
>>> Thank you so much!
>>>
>>> Best,
>>>
>>> Hai
>>>
>>
>>
>
