giraph-user mailing list archives

From Roberto Gonzalez <roberto.gonza...@neclab.eu>
Subject Problems to run org.apache.giraph.examples.SimpleShortestPathsComputation
Date Thu, 05 Nov 2015 13:54:40 GMT
Hi all again,

After compiling version 1.1, I ran into the following bug:

https://issues.apache.org/jira/browse/GIRAPH-859

I applied the patch and disabled permissions in HDFS (I would rather
not do that, but I can accept it).

However, the example still fails when I run it as:


hadoop jar giraph-ex.jar org.apache.giraph.GiraphRunner \
    org.apache.giraph.examples.SimpleShortestPathsComputation \
    -vif org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat \
    -vip tiny_graph.txt \
    -vof org.apache.giraph.io.formats.IdWithValueTextOutputFormat \
    -op shortestpaths \
    -yj giraph-ex.jar \
    -w 1


The program runs for about 10 minutes (the example graph has 5 nodes)
before failing.

The gam-stderr.log file only contains SLF4J messages, and
gam-stdout.log ends with:

Container exited with a non-zero exit code 143

2015-11-05 14:25:39,340 INFO  [AMRM Callback Handler Thread] yarn.GiraphApplicationMaster (GiraphApplicationMaster.java:onContainersCompleted(605)) - After completion of one conatiner. current status is: completedCount :1 containersToLaunch :2 successfulCount :0 failedCount :1
2015-11-05 14:26:13,414 INFO  [AMRM Callback Handler Thread] yarn.GiraphApplicationMaster (GiraphApplicationMaster.java:onContainersCompleted(580)) - Got response from RM for container ask, completedCnt=1
2015-11-05 14:26:13,414 INFO  [AMRM Callback Handler Thread] yarn.GiraphApplicationMaster (GiraphApplicationMaster.java:onContainersCompleted(583)) - Got container status for containerID=container_1446634690791_0024_01_000003, state=COMPLETE, exitStatus=2, diagnostics=Exception from container-launch: org.apache.hadoop.util.Shell$ExitCodeException:

org.apache.hadoop.util.Shell$ExitCodeException: 
	at org.apache.hadoop.util.Shell.runCommand(Shell.java:505)
	at org.apache.hadoop.util.Shell.run(Shell.java:418)
	at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:650)
	at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:300)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:81)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)


Container exited with a non-zero exit code 2

2015-11-05 14:26:13,415 INFO  [AMRM Callback Handler Thread] yarn.GiraphApplicationMaster (GiraphApplicationMaster.java:onContainersCompleted(603)) - All container compeleted. done = true
2015-11-05 14:26:13,543 INFO  [main] yarn.GiraphApplicationMaster (GiraphApplicationMaster.java:run(195)) - Done true
2015-11-05 14:26:13,543 INFO  [main] yarn.GiraphApplicationMaster (GiraphApplicationMaster.java:run(207)) - Forcefully terminating executors with done =:true
2015-11-05 14:26:13,543 INFO  [main] yarn.GiraphApplicationMaster (GiraphApplicationMaster.java:finish(221)) - Application completed. Stopping running containers
2015-11-05 14:26:13,578 INFO  [main] impl.ContainerManagementProtocolProxy (ContainerManagementProtocolProxy.java:mayBeCloseProxy(145)) - Closing proxy : computer62:59272
2015-11-05 14:26:13,579 INFO  [main] impl.ContainerManagementProtocolProxy (ContainerManagementProtocolProxy.java:mayBeCloseProxy(145)) - Closing proxy : computer66:45051
2015-11-05 14:26:13,579 INFO  [main] yarn.GiraphApplicationMaster (GiraphApplicationMaster.java:finish(226)) - Application completed. Signalling finish to RM
2015-11-05 14:26:13,586 INFO  [main] impl.AMRMClientImpl (AMRMClientImpl.java:unregisterApplicationMaster(321)) - Waiting for application to be successfully unregistered.
2015-11-05 14:26:13,688 INFO  [main] yarn.GiraphApplicationMaster (GiraphApplicationMaster.java:main(454)) - Giraph Application Master failed. exiting
2015-11-05 14:26:13,688 INFO  [AMRM Callback Handler Thread] impl.AMRMClientAsyncImpl (AMRMClientAsyncImpl.java:run(277)) - Interrupted while waiting for queue
java.lang.InterruptedException
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2014)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2048)
	at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
	at org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$CallbackHandlerThread.run(AMRMClientAsyncImpl.java:275)


Moreover, even though the exception occurs about 1 minute after the
program starts, the job takes more than 10 minutes to finish.
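
In case it helps with diagnosis, here is a minimal sketch of how the
full logs of the failed containers could be collected. The helper name
container_to_app_id is made up for illustration, and the commented-out
"yarn logs" call assumes that log aggregation is enabled on the
cluster; I have not confirmed that it is on mine.

```shell
#!/bin/sh
# Derive the YARN application ID from a container ID taken from the log above.
# The container ID in this log has the form
# container_<clusterTimestamp>_<appNumber>_<attempt>_<containerNumber>.
# container_to_app_id is a hypothetical helper, not part of Hadoop or Giraph.
container_to_app_id() {
    printf '%s\n' "$1" |
        sed -E 's/^container_([0-9]+)_([0-9]+)_.*$/application_\1_\2/'
}

app_id=$(container_to_app_id "container_1446634690791_0024_01_000003")
echo "$app_id"

# Assumption: log aggregation is enabled; otherwise the per-container
# logs stay on the local disk of each NodeManager.
# yarn logs -applicationId "$app_id"
```

The stdout and stderr of the container that exited with code 2 would
probably say more than the application master log quoted above.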

Do you have any idea what might be wrong?

Thanks.




-- 
Dr. Roberto Gonzalez 
Research Scientist, Networked Systems and Data Analytics Group
NEC Europe Ltd.
NEC Laboratories Europe
Kurfürsten-Anlage 36
 
D-69115 Heidelberg
 
phone +49 6221 4342 256
fax +49 6221 4342 155
e-mail: Roberto.Gonzalez@neclab.eu
 
NEC Europe Ltd | Registered Office: Athene, Odyssey Business Park, West End  Road, 
London, HA4 6QE, GB | Registered in England 2832014