giraph-user mailing list archives

From xeniad20 <xenia...@gmail.com>
Subject Giraph Error
Date Wed, 30 Jul 2014 20:25:57 GMT
Hi,

I am trying to run the ShortestPath example with Giraph 1.1.0 on a small cluster
(4 machines: 1 master node, 1 SecondaryNameNode, 2 DataNodes), but I get the
following error in the log file:

ERROR org.apache.giraph.master.BspServiceMaster: superstepChosenWorkerAlive: Missing chosen worker Worker(hostname=datanode2, MRtaskID=1, port=30001) on superstep 1
2014-07-14 21:50:55,106 INFO org.apache.giraph.master.MasterThread: masterThread: Coordination of superstep 1 took 0.348 seconds ended with state WORKER_FAILURE and is now on superstep 1
2014-07-14 21:50:55,110 ERROR org.apache.giraph.master.MasterThread: masterThread: Master algorithm failed with ArrayIndexOutOfBoundsException
java.lang.ArrayIndexOutOfBoundsException: -1
          at org.apache.giraph.master.BspServiceMaster.getLastGoodCheckpoint(BspServiceMaster.java:1284)
          at org.apache.giraph.master.MasterThread.run(MasterThread.java:148)
2014-07-14 21:50:55,114 FATAL org.apache.giraph.graph.GraphMapper: uncaughtException: OverrideExceptionHandler on thread org.apache.giraph.master.MasterThread, msg = java.lang.ArrayIndexOutOfBoundsException: -1, exiting...
java.lang.IllegalStateException: java.lang.ArrayIndexOutOfBoundsException: -1
          at org.apache.giraph.master.MasterThread.run(MasterThread.java:194)
Caused by: java.lang.ArrayIndexOutOfBoundsException: -1
          at org.apache.giraph.master.BspServiceMaster.getLastGoodCheckpoint(BspServiceMaster.java:1284)
          at org.apache.giraph.master.MasterThread.run(MasterThread.java:148)
2014-07-14 21:50:55,116 INFO org.apache.giraph.zk.ZooKeeperManager: run: Shutdown hook started.
2014-07-14 21:50:55,116 WARN org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Forced a shutdown hook kill of the ZooKeeper process.
2014-07-14 21:50:55,116 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: ZooKeeper process exited with 1 (note that 143 typically means killed).

As I understand it, this means that the failure originated on another node.
However, the only error I see on the other DataNode is that it cannot connect
to ZooKeeper.
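
One way to narrow this down might be to check, from the failing DataNode, whether a plain TCP connection to the ZooKeeper host and port can be opened at all. The following is only a minimal sketch: the class name `ZkReachability`, the helper `canConnect`, and the default host and port are hypothetical placeholders, and the real values would have to be taken from the job log or the cluster configuration. It tests TCP reachability only and says nothing about whether ZooKeeper itself is healthy.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class ZkReachability {

    /**
     * Returns true if a TCP connection to host:port can be opened within
     * timeoutMs milliseconds, false on timeout, refusal, or lookup failure.
     */
    static boolean canConnect(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Covers connection refused, timeouts, and unknown hosts.
            return false;
        }
    }

    public static void main(String[] args) {
        // Placeholder defaults (assumptions); pass the real host and port as arguments.
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 2181;
        System.out.println(host + ":" + port + " reachable: "
                + canConnect(host, port, 3000));
    }
}
```

Running it on the DataNode as `java ZkReachability <zookeeper-host> <zookeeper-port>` would show whether the problem is at the network level (firewall, host name resolution) or further up.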


What might be the actual cause of this error, and how can I fix it?


Thanks


