giraph-user mailing list archives

From Xenia Demetriou <xenia...@gmail.com>
Subject Re: OutOfMemoryError: Java heap space during Large graph running
Date Fri, 04 Nov 2016 21:11:52 GMT
Hi,
I have the same problem. I added the following to
mapred-site.xml and hadoop-env.sh, but the problem persists.
I tried various values for the settings below, but nothing increases the memory.

mapred-site.xml:
<property>
    <name>mapred.child.java.opts</name>
    <value>-Xms256m </value>
    <value>-Xmx4096m</value>
</property>

hadoop-env.sh:
export HADOOP_HEAPSIZE=3072
export HADOOP_OPTS="-Xmx4096m"
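
A note on the mapred-site.xml fragment above: the property carries two <value> elements. As far as I know, a Hadoop property is expected to have a single <value>, with all JVM flags space-separated inside it (for example "-Xms256m -Xmx4096m"), and I would not rely on which of two values wins. The sketch below is a minimal check for that pattern, not a Hadoop tool; the function name is hypothetical and the fragment is wrapped in <configuration> only so that it parses.

```python
import xml.etree.ElementTree as ET


def properties_with_multiple_values(xml_text):
    """Return names of <property> elements that carry more than one <value>."""
    root = ET.fromstring(xml_text)
    flagged = []
    for prop in root.iter("property"):
        values = prop.findall("value")
        if len(values) > 1:
            flagged.append(prop.findtext("name", default="").strip())
    return flagged


# The fragment from this message, wrapped in <configuration> so it parses.
snippet = """
<configuration>
  <property>
    <name>mapred.child.java.opts</name>
    <value>-Xms256m </value>
    <value>-Xmx4096m</value>
  </property>
</configuration>
"""

print(properties_with_multiple_values(snippet))
```

If the property is flagged, merging the flags into one <value> is the first thing to try before changing the numbers.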




2016-11-04 17:57 GMT+02:00 Agrta Rawat <agrta.rawat@gmail.com>:

> Hi,
> Did you try running your code on a smaller data set? Did it work? Also,
> you have to increase the Xms and Xmx options in the Hadoop configuration file. I
> do not remember the exact file name, but you will probably find
> such an entry in mapred-site.xml.
>
> :)
>
> Thanks,
> Agrta Rawat
>
> On Sun, Oct 23, 2016 at 11:17 PM, Hai Lan <lanhai1988@gmail.com> wrote:
>
>> More info:
>>
>> If I add -Dgiraph.useOutOfCoreGraph=true, the job runs successfully, but
>> superstep -1 is extremely slow. If I do not add -Dgiraph.useOutOfCoreGraph=true,
>> it loads much faster but fails while waiting for roughly the last 10
>> workers to finish superstep -1. The error is:
>>
>> org.apache.giraph.master.BspServiceMaster: *barrierOnWorkerList: Missing
>> chosen workers* [Worker(hostname=trantor17.umiacs.umd.edu, MRtaskID=124,
>> port=30124), Worker(hostname=trantor17.umiacs.umd.edu, MRtaskID=126,
>> port=30126), Worker(hostname=trantor17.umiacs.umd.edu, MRtaskID=128,
>> port=30128), Worker(hostname=trantor17.umiacs.umd.edu, MRtaskID=130,
>> port=30130)] on superstep -1
>> 2016-10-23 10:40:16,358 ERROR [org.apache.giraph.master.MasterThread]
>> org.apache.giraph.master.MasterThread: masterThread: Master algorithm
>> failed with IllegalStateException
>> java.lang.IllegalStateException: coordinateVertexInputSplits: Worker
>> failed during input split (currently not supported)
>>
>> This error looks just like https://issues.apache.org/jira/browse/GIRAPH-904,
>> but there are no upper-case letters in my hostnames.
>>
>> Any ideas about this?
>>
>> Many Thanks,
>>
>> Hai
>>
>>
>>
>>
>> On Sun, Oct 23, 2016 at 8:36 AM, Hai Lan <lanhai1988@gmail.com> wrote:
>>
>>> Thanks Agrta,
>>>
>>> Thanks for your response. How exactly can I increase the min and max
>>> RAM? (In which conf file, or with which command/arguments? My
>>> giraph-site.xml is empty by default.)
>>>
>>> From what I saw online about increasing the heap size (I'm not sure it is the
>>> same thing as the min and max RAM size you mentioned), many people suggest increasing
>>> mapred.child.java.opts OR HADOOP_DATANODE_OPTS.
>>>
>>> But those do not help. My problem happens during "VertexInputSplitsCallable:
>>> readVertexInputSplit:", so I tried increasing mapreduce.map.memory.mb
>>> and decreasing the number of containers/workers. Currently I'm using 248 workers and
>>> mapreduce.map.memory.mb=12000, ratio=0.7. This helps, but I face new
>>> problems:
>>>
>>> 1. Superstep -1 is extremely slow; it takes 7-8 hours to load a
>>> 150G graph,
>>> e.g.:
>>> org.apache.giraph.master.BspServiceMaster: barrierOnWorkerList: 106 out
>>> of 248 workers finished on superstep -1 on path
>>> /_hadoopBsp/job_1477020594559_0012/_vertexInputSplitDoneDir
>>>
>>> I see log entries like:
>>> INFO [main] org.apache.giraph.comm.netty.NettyClient:
>>> logInfoAboutOpenRequests: Waiting interval of 15000 msecs, 2499 open
>>> requests, waiting for it to be <= 0, MBytes/sec received = 0.0001,
>>> MBytesReceived = 0.0058, ave received req MBytes = 0, secs waited = 92.12
>>> MBytes/sec sent = 10.4373, MBytesSent = 961.4983, ave sent req MBytes =
>>> 0.3244, secs waited = 92.12
>>>
>>> Finishing those 2499 open requests will take a very long time. *Is this
>>> normal?*
>>>
>>> 2. I tried the out-of-core graph option, but I'm not sure I'm using it
>>> correctly. I added -Dgiraph.useOutOfCoreGraph=true
>>> -ca isStaticGraph=true,giraph.maxPartitionsInMemory=10. How can I tell
>>> whether it is working?
>>>
>>> I suspect that when I try a 15T graph, the problem will be worse. What should I
>>> do?
>>>
>>> Thanks for your help.
>>>
>>> Best,
>>> Hai
>>>
>>>
>>> On Sun, Oct 23, 2016 at 7:11 AM, Agrta Rawat <agrta.rawat@gmail.com>
>>> wrote:
>>>
>>>> Hi Hai,
>>>>
>>>> Please check your giraph configurations. Try increasing min and max RAM
>>>> size in your configurations.
>>>> This should help.
>>>>
>>>> Regards,
>>>> Agrta Rawat
>>>>
>>>>
>>>> On Sat, Oct 22, 2016 at 7:46 PM, Hai Lan <lanhai1988@gmail.com> wrote:
>>>>
>>>>> Can anyone help with this?
>>>>>
>>>>> Thanks a lot!
>>>>>
>>>>>
>>>>> On Thu, Oct 20, 2016 at 9:48 PM, Hai Lan <lanhai1988@gmail.com> wrote:
>>>>>
>>>>>> Dear all,
>>>>>>
>>>>>> I'm facing a problem when I run a large graph job (currently 1.6T; it will
>>>>>> be 16T later). It always fails with java.lang.OutOfMemoryError: Java heap
>>>>>> space after loading a certain number of vertices (around 59000000). I tried
>>>>>> adding options such as:
>>>>>> -Dgiraph.useOutOfCoreGraph=true
>>>>>>  -Dmapred.child.java.opts="-XX:-UseGCOverheadLimit" OR
>>>>>> -Dmapred.child.java.opts="-Xmx16384"
>>>>>>  -Dgiraph.yarn.task.heap.mb=36570
>>>>>>
>>>>>> but the problem remains, even though I can see those values shown in
>>>>>> Metadata.
>>>>>>
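
A note on the options above: a JVM size flag without a unit suffix is read as bytes, so -Xmx16384 would ask for a 16 KiB heap rather than 16 GiB, and the -XX:-UseGCOverheadLimit variant contains no -Xmx at all. The sketch below is a small, hypothetical helper (not part of Hadoop or Giraph) that converts the -Xmx value in an options string to bytes; it handles only the k, m, and g suffixes.

```python
import re

# Multipliers for the suffixes this sketch understands; no suffix means bytes.
_UNITS = {"": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def xmx_bytes(java_opts):
    """Return the -Xmx size in bytes, or None if java_opts has no -Xmx flag."""
    match = re.search(r"-Xmx(\d+)([kKmMgG]?)(?=\s|$)", java_opts)
    if match is None:
        return None
    number, unit = match.groups()
    return int(number) * _UNITS[unit.lower()]


print(xmx_bytes("-Xmx16384"))                # 16384 (bytes)
print(xmx_bytes("-Xmx16384m"))               # 17179869184 (16 GiB)
print(xmx_bytes("-XX:-UseGCOverheadLimit"))  # None
```

Running the options through a check like this before submitting the job makes a missing suffix easy to spot.
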
>>>>>> I'm not sure whether the max memory value in this
>>>>>> VertexInputSplitsCallable message is related to the Java heap size:
>>>>>> INFO [load-0] org.apache.giraph.worker.VertexInputSplitsCallable:
>>>>>> readVertexInputSplit: Loaded 46975802 vertices at 68977.49310291892
>>>>>> vertices/sec 0 edges at 0.0 edges/sec Memory (free/total/max) = 475.08M /
>>>>>> 2759.00M / 2759.00M
>>>>>>
>>>>>> But I noticed that the main log *always* shows:
>>>>>> INFO [AsyncDispatcher event handler] org.apache.hadoop.mapred.JobConf:
>>>>>> Task java-opts do not specify heap size. Setting task attempt jvm max heap
>>>>>> size to -Xmx2868m
>>>>>> *no matter what arguments I add*, even when I run normal Hadoop
>>>>>> jobs.
>>>>>>
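
A note on the -Xmx2868m figure above: as far as I understand it, when the task java opts contain no -Xmx, Hadoop derives the heap from the container size (mapreduce.map.memory.mb) multiplied by a heap ratio (mapreduce.job.heap.memory-mb.ratio). The sketch below only illustrates that arithmetic. The 4096 MB container size is a guess that is not stated anywhere in this thread, the 0.7 ratio is the one mentioned elsewhere in the thread, and rounding up with ceil is my assumption about how the result is rounded.

```python
import math


def default_task_heap_mb(container_mb, heap_ratio):
    """Heap in MB assumed when java opts carry no -Xmx: ceil(container * ratio)."""
    return math.ceil(container_mb * heap_ratio)


# Assumed inputs: a 4096 MB container (a guess) and the 0.7 ratio from the thread.
print(default_task_heap_mb(4096, 0.7))  # 2868
```

If this reading is right, the message means the -Xmx setting never reached the task, so raising the container size or passing a well-formed -Xmx would be the values to check.
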
>>>>>> Any ideas about this? The log follows.
>>>>>>
>>>>>> 2016-10-20 21:25:49,008 ERROR [netty-client-worker-2] org.apache.giraph.comm.netty.NettyClient: Request failed
>>>>>> java.lang.OutOfMemoryError: Java heap space
>>>>>> at io.netty.buffer.UnpooledHeapByteBuf.<init>(UnpooledHeapByteBuf.java:45)
>>>>>> at io.netty.buffer.UnpooledByteBufAllocator.newHeapBuffer(UnpooledByteBufAllocator.java:43)
>>>>>> at io.netty.buffer.AbstractByteBufAllocator.heapBuffer(AbstractByteBufAllocator.java:136)
>>>>>> at io.netty.buffer.AbstractByteBufAllocator.heapBuffer(AbstractByteBufAllocator.java:127)
>>>>>> at io.netty.buffer.AbstractByteBufAllocator.buffer(AbstractByteBufAllocator.java:85)
>>>>>> at org.apache.giraph.comm.netty.handler.RequestEncoder.write(RequestEncoder.java:81)
>>>>>> at io.netty.channel.DefaultChannelHandlerContext.invokeWrite(DefaultChannelHandlerContext.java:645)
>>>>>> at io.netty.channel.DefaultChannelHandlerContext.access$2000(DefaultChannelHandlerContext.java:29)
>>>>>> at io.netty.channel.DefaultChannelHandlerContext$WriteTask.run(DefaultChannelHandlerContext.java:906)
>>>>>> at io.netty.util.concurrent.DefaultEventExecutor.run(DefaultEventExecutor.java:36)
>>>>>> at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:101)
>>>>>> at java.lang.Thread.run(Thread.java:745)
>>>>>> 2016-10-20 21:25:55,299 ERROR [netty-client-worker-1] org.apache.giraph.comm.netty.NettyClient: Request failed
>>>>>> java.lang.OutOfMemoryError: Java heap space
>>>>>> at io.netty.buffer.UnpooledHeapByteBuf.<init>(UnpooledHeapByteBuf.java:45)
>>>>>> at io.netty.buffer.UnpooledByteBufAllocator.newHeapBuffer(UnpooledByteBufAllocator.java:43)
>>>>>> at io.netty.buffer.AbstractByteBufAllocator.heapBuffer(AbstractByteBufAllocator.java:136)
>>>>>> at io.netty.buffer.AbstractByteBufAllocator.heapBuffer(AbstractByteBufAllocator.java:127)
>>>>>> at io.netty.buffer.AbstractByteBufAllocator.buffer(AbstractByteBufAllocator.java:85)
>>>>>> at org.apache.giraph.comm.netty.handler.RequestEncoder.write(RequestEncoder.java:81)
>>>>>> at io.netty.channel.DefaultChannelHandlerContext.invokeWrite(DefaultChannelHandlerContext.java:645)
>>>>>> at io.netty.channel.DefaultChannelHandlerContext.access$2000(DefaultChannelHandlerContext.java:29)
>>>>>> at io.netty.channel.DefaultChannelHandlerContext$WriteTask.run(DefaultChannelHandlerContext.java:906)
>>>>>> at io.netty.util.concurrent.DefaultEventExecutor.run(DefaultEventExecutor.java:36)
>>>>>> at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:101)
>>>>>> at java.lang.Thread.run(Thread.java:745)
>>>>>> 2016-10-20 21:26:06,731 ERROR [main] org.apache.giraph.graph.GraphMapper: Caught an unrecoverable exception waitFor: ExecutionException occurred while waiting for org.apache.giraph.utils.ProgressableUtils$FutureWaitable@6737a445
>>>>>> java.lang.IllegalStateException: waitFor: ExecutionException occurred while waiting for org.apache.giraph.utils.ProgressableUtils$FutureWaitable@6737a445
>>>>>> at org.apache.giraph.utils.ProgressableUtils.waitFor(ProgressableUtils.java:193)
>>>>>> at org.apache.giraph.utils.ProgressableUtils.waitForever(ProgressableUtils.java:151)
>>>>>> at org.apache.giraph.utils.ProgressableUtils.waitForever(ProgressableUtils.java:136)
>>>>>> at org.apache.giraph.utils.ProgressableUtils.getFutureResult(ProgressableUtils.java:99)
>>>>>> at org.apache.giraph.utils.ProgressableUtils.getResultsWithNCallables(ProgressableUtils.java:233)
>>>>>> at org.apache.giraph.worker.BspServiceWorker.loadInputSplits(BspServiceWorker.java:316)
>>>>>> at org.apache.giraph.worker.BspServiceWorker.loadVertices(BspServiceWorker.java:409)
>>>>>> at org.apache.giraph.worker.BspServiceWorker.setup(BspServiceWorker.java:629)
>>>>>> at org.apache.giraph.graph.GraphTaskManager.execute(GraphTaskManager.java:284)
>>>>>> at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:93)
>>>>>> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
>>>>>> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>>>>>> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
>>>>>> at java.security.AccessController.doPrivileged(Native Method)
>>>>>> at javax.security.auth.Subject.doAs(Subject.java:415)
>>>>>> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
>>>>>> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
>>>>>> Caused by: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: Java heap space
>>>>>> at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>>>>>> at java.util.concurrent.FutureTask.get(FutureTask.java:202)
>>>>>> at org.apache.giraph.utils.ProgressableUtils$FutureWaitable.waitFor(ProgressableUtils.java:312)
>>>>>> at org.apache.giraph.utils.ProgressableUtils.waitFor(ProgressableUtils.java:185)
>>>>>> ... 16 more
>>>>>> Caused by: java.lang.OutOfMemoryError: Java heap space
>>>>>> at org.apache.giraph.utils.UnsafeByteArrayOutputStream.<init>(UnsafeByteArrayOutputStream.java:81)
>>>>>> at org.apache.giraph.conf.ImmutableClassesGiraphConfiguration.createExtendedDataOutput(ImmutableClassesGiraphConfiguration.java:1161)
>>>>>> at org.apache.giraph.comm.SendPartitionCache.addVertex(SendPartitionCache.java:77)
>>>>>> at org.apache.giraph.comm.netty.NettyWorkerClientRequestProcessor.sendVertexRequest(NettyWorkerClientRequestProcessor.java:248)
>>>>>> at org.apache.giraph.worker.VertexInputSplitsCallable.readInputSplit(VertexInputSplitsCallable.java:231)
>>>>>> at org.apache.giraph.worker.InputSplitsCallable.loadInputSplit(InputSplitsCallable.java:267)
>>>>>> at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:211)
>>>>>> at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:60)
>>>>>> at org.apache.giraph.utils.LogStacktraceCallable.call(LogStacktraceCallable.java:51)
>>>>>> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>>>>>> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>>> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>>> at java.lang.Thread.run(Thread.java:745)
>>>>>> 2016-10-20 21:26:06,737 ERROR [main] org.apache.giraph.worker.BspServiceWorker: unregisterHealth: Got failure, unregistering health on /_hadoopBsp/job_1476386340018_0175/_applicationAttemptsDir/0/_superstepDir/-1/_workerHealthyDir/hadoop18.umd.com_23 on superstep -1
>>>>>> 2016-10-20 21:26:06,746 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.lang.IllegalStateException: run: Caught an unrecoverable exception waitFor: ExecutionException occurred while waiting for org.apache.giraph.utils.ProgressableUtils$FutureWaitable@6737a445
>>>>>> at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:104)
>>>>>> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
>>>>>> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>>>>>> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
>>>>>> at java.security.AccessController.doPrivileged(Native Method)
>>>>>> at javax.security.auth.Subject.doAs(Subject.java:415)
>>>>>> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
>>>>>> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
>>>>>> Caused by: java.lang.IllegalStateException: waitFor: ExecutionException occurred while waiting for org.apache.giraph.utils.ProgressableUtils$FutureWaitable@6737a445
>>>>>> at org.apache.giraph.utils.ProgressableUtils.waitFor(ProgressableUtils.java:193)
>>>>>> at org.apache.giraph.utils.ProgressableUtils.waitForever(ProgressableUtils.java:151)
>>>>>> at org.apache.giraph.utils.ProgressableUtils.waitForever(ProgressableUtils.java:136)
>>>>>> at org.apache.giraph.utils.ProgressableUtils.getFutureResult(ProgressableUtils.java:99)
>>>>>> at org.apache.giraph.utils.ProgressableUtils.getResultsWithNCallables(ProgressableUtils.java:233)
>>>>>> at org.apache.giraph.worker.BspServiceWorker.loadInputSplits(BspServiceWorker.java:316)
>>>>>> at org.apache.giraph.worker.BspServiceWorker.loadVertices(BspServiceWorker.java:409)
>>>>>> at org.apache.giraph.worker.BspServiceWorker.setup(BspServiceWorker.java:629)
>>>>>> at org.apache.giraph.graph.GraphTaskManager.execute(GraphTaskManager.java:284)
>>>>>> at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:93)
>>>>>> ... 7 more
>>>>>> Caused by: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: Java heap space
>>>>>> at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>>>>>> at java.util.concurrent.FutureTask.get(FutureTask.java:202)
>>>>>> at org.apache.giraph.utils.ProgressableUtils$FutureWaitable.waitFor(ProgressableUtils.java:312)
>>>>>> at org.apache.giraph.utils.ProgressableUtils.waitFor(ProgressableUtils.java:185)
>>>>>> ... 16 more
>>>>>> Caused by: java.lang.OutOfMemoryError: Java heap space
>>>>>> at org.apache.giraph.utils.UnsafeByteArrayOutputStream.<init>(UnsafeByteArrayOutputStream.java:81)
>>>>>> at org.apache.giraph.conf.ImmutableClassesGiraphConfiguration.createExtendedDataOutput(ImmutableClassesGiraphConfiguration.java:1161)
>>>>>> at org.apache.giraph.comm.SendPartitionCache.addVertex(SendPartitionCache.java:77)
>>>>>> at org.apache.giraph.comm.netty.NettyWorkerClientRequestProcessor.sendVertexRequest(NettyWorkerClientRequestProcessor.java:248)
>>>>>> at org.apache.giraph.worker.VertexInputSplitsCallable.readInputSplit(VertexInputSplitsCallable.java:231)
>>>>>> at org.apache.giraph.worker.InputSplitsCallable.loadInputSplit(InputSplitsCallable.java:267)
>>>>>> at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:211)
>>>>>> at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:60)
>>>>>> at org.apache.giraph.utils.LogStacktraceCallable.call(LogStacktraceCallable.java:51)
>>>>>> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>>>>>> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>>> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>>> at java.lang.Thread.run(Thread.java:745)
>>>>>>
>>>>>>
>>>>>> Thank you so much!
>>>>>>
>>>>>> Best,
>>>>>>
>>>>>> Hai
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>
