giraph-dev mailing list archives

From "Alessio Arleo (JIRA)" <j...@apache.org>
Subject [jira] [Created] (GIRAPH-970) Missing chosen workers on superstep -1
Date Mon, 15 Dec 2014 11:47:13 GMT
Alessio Arleo created GIRAPH-970:
------------------------------------

             Summary: Missing chosen workers on superstep -1
                 Key: GIRAPH-970
                 URL: https://issues.apache.org/jira/browse/GIRAPH-970
             Project: Giraph
          Issue Type: Bug
          Components: bsp
    Affects Versions: 1.1.0
         Environment: Linux version 3.13.0-37-generic (buildd@kapok) (gcc version 4.8.2 (Ubuntu
4.8.2-19ubuntu1) 64 bit
Hadoop 1.2.1
            Reporter: Alessio Arleo


I found a problem with Giraph 1.1.0 while trying to run the SimpleShortestPathsComputation example.


This is the command I ran:
$HADOOP_HOME/bin/hadoop jar ~/git/giraph_patched/giraph-examples/target/giraph-examples-1.1.0-for-hadoop-1.2.1-jar-with-dependencies.jar \
  org.apache.giraph.GiraphRunner org.apache.giraph.examples.SimpleShortestPathsComputation \
  -vif org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat -vip /users/hadoop/input/tiny_graph.txt \
  -vof org.apache.giraph.io.formats.IdWithValueTextOutputFormat -op /users/hadoop/output/shortestpath \
  -w 1

And here is the output:
#################################

Warning: $HADOOP_HOME is deprecated.

14/12/15 12:07:36 INFO utils.ConfigurationUtils: No edge input format specified. Ensure your
InputFormat does not require one.
14/12/15 12:07:36 INFO utils.ConfigurationUtils: No edge output format specified. Ensure your
OutputFormat does not require one.
14/12/15 12:07:36 INFO job.GiraphJob: run: Since checkpointing is disabled (default), do not
allow any task retries (setting mapred.map.max.attempts = 0, old value = 4)
14/12/15 12:07:38 INFO job.GiraphJob: Tracking URL: http://VirtualMINT-H023:50030/jobdetails.jsp?jobid=job_201412151205_0001
14/12/15 12:07:38 INFO job.GiraphJob: Waiting for resources... Job will start only when it
gets all 2 mappers
14/12/15 12:08:51 INFO job.HaltApplicationUtils$DefaultHaltInstructionsWriter: writeHaltInstructions:
To halt after next superstep execute: 'bin/halt-application --zkServer virtualmint-h023:22181
--zkNode /_hadoopBsp/job_201412151205_0001/_haltComputation'
14/12/15 12:08:51 INFO mapred.JobClient: Running job: job_201412151205_0001
14/12/15 12:08:52 INFO mapred.JobClient:  map 100% reduce 0%

################################

The computation hangs here until the timeout is reached. Here is what I found while reading
the first worker log.

2014-12-15 12:12:16,303 INFO org.apache.giraph.master.BspServiceMaster: createVertexInputSplits:
Starting to write input split data to zookeeper with 1 threads
2014-12-15 12:12:16,314 INFO org.apache.giraph.master.BspServiceMaster: createVertexInputSplits:
Done writing input split data to zookeeper
2014-12-15 12:12:16,332 INFO org.apache.giraph.comm.netty.NettyClient: Using Netty without
authentication.
2014-12-15 12:12:16,341 INFO org.apache.giraph.comm.netty.NettyClient: connectAllAddresses:
Successfully added 1 connections, (1 total connected) 0 failed, 0 failures total.
2014-12-15 12:12:16,344 INFO org.apache.giraph.partition.PartitionUtils: computePartitionCount:
Creating 1, default would have been 1 partitions.
2014-12-15 12:12:16,373 INFO org.apache.giraph.master.BspServiceMaster: barrierOnWorkerList:
0 out of 1 workers finished on superstep -1 on path /_hadoopBsp/job_201412151211_0001/_vertexInputSplitDoneDir
2014-12-15 12:12:16,375 INFO org.apache.giraph.master.BspServiceMaster: barrierOnWorkerList:
Waiting on [virtualmint-h023_1]
2014-12-15 12:12:16,393 INFO org.apache.giraph.comm.netty.NettyServer: start: Using Netty
without authentication.
2014-12-15 12:12:16,464 ERROR org.apache.giraph.master.BspServiceMaster: barrierOnWorkerList:
Missing chosen workers [Worker(hostname=virtualmint-h023, MRtaskID=1, port=30001)] on superstep
-1
2014-12-15 12:12:16,464 ERROR org.apache.giraph.master.MasterThread: masterThread: Master
algorithm failed with IllegalStateException
java.lang.IllegalStateException: coordinateVertexInputSplits: Worker failed during input split
(currently not supported)
	at org.apache.giraph.master.BspServiceMaster.coordinateInputSplits(BspServiceMaster.java:1489)
	at org.apache.giraph.master.BspServiceMaster.coordinateSuperstep(BspServiceMaster.java:1656)
	at org.apache.giraph.master.MasterThread.run(MasterThread.java:124)
2014-12-15 12:12:16,464 FATAL org.apache.giraph.graph.GraphTaskManager: uncaughtException:
OverrideExceptionHandler on thread org.apache.giraph.master.MasterThread, msg = java.lang.IllegalStateException:
coordinateVertexInputSplits: Worker failed during input split (currently not supported), exiting...
java.lang.IllegalStateException: java.lang.IllegalStateException: coordinateVertexInputSplits:
Worker failed during input split (currently not supported)
	at org.apache.giraph.master.MasterThread.run(MasterThread.java:194)
Caused by: java.lang.IllegalStateException: coordinateVertexInputSplits: Worker failed during
input split (currently not supported)
	at org.apache.giraph.master.BspServiceMaster.coordinateInputSplits(BspServiceMaster.java:1489)
	at org.apache.giraph.master.BspServiceMaster.coordinateSuperstep(BspServiceMaster.java:1656)
	at org.apache.giraph.master.MasterThread.run(MasterThread.java:124)
2014-12-15 12:12:16,464 WARN org.apache.giraph.zk.ZooKeeperManager: logZooKeeperOutput: Dumping
up to last 100 lines of the ZooKeeper process STDOUT and STDERR.

################################

The computation does not even reach the first superstep: Giraph cannot find the worker. The
GIRAPH-904 patch has been applied to BspServiceMaster.
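
For anyone triaging a similar hang, the missing workers can be pulled out of a log like the one above with a short script. This is only a sketch. It assumes the "Missing chosen workers [...] on superstep N" line format shown in this report, and the names find_missing_workers and SAMPLE are hypothetical helpers, not part of Giraph. Whitespace is collapsed first because logs quoted in mail are hard-wrapped.

```python
import re

# One "Missing chosen workers [...] on superstep N" record (format assumed from this report).
MISSING_RE = re.compile(r"Missing chosen workers \[(.*?)\] on superstep (-?\d+)")
# One worker entry inside the brackets, e.g. Worker(hostname=h, MRtaskID=1, port=30001).
WORKER_RE = re.compile(r"Worker\(hostname=([^,]+), MRtaskID=(\d+), port=(\d+)\)")


def find_missing_workers(log_text):
    """Return (hostname, task_id, port, superstep) for every worker reported missing."""
    # Undo hard line wraps so a record split across lines matches as one string.
    flat = " ".join(log_text.split())
    found = []
    for record in MISSING_RE.finditer(flat):
        superstep = int(record.group(2))
        for host, task_id, port in WORKER_RE.findall(record.group(1)):
            found.append((host, int(task_id), int(port), superstep))
    return found


# Excerpt copied from the log in this report, including its line wrap.
SAMPLE = """2014-12-15 12:12:16,464 ERROR org.apache.giraph.master.BspServiceMaster: barrierOnWorkerList:
Missing chosen workers [Worker(hostname=virtualmint-h023, MRtaskID=1, port=30001)] on superstep
-1"""

for host, task_id, port, superstep in find_missing_workers(SAMPLE):
    print(f"superstep {superstep}: missing {host} (task {task_id}, port {port})")
```

The script only reports which hostname, task ID, and port the master was waiting on. It does not explain why the worker was missing.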

I am running Hadoop 1.2.1 on a single machine with the configuration suggested in the
Giraph Quick Start guide. Hadoop itself works fine (tested with the wordcount example).






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
