giraph-user mailing list archives

From xeniad20 <xenia...@gmail.com>
Subject worker can not connect to Zookeeper
Date Mon, 14 Jul 2014 19:34:38 GMT
Hi,

I am trying to run the ShortestPath example with Giraph 1.1.0 on a small 
cluster (4 machines), but I get the following error in the log file:


2014-07-14 21:50:51,134 INFO org.apache.hadoop.util.NativeCodeLoader: 
Loaded the native-hadoop library
2014-07-14 21:50:51,283 WARN 
org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Source name ugi 
already exists!
2014-07-14 21:50:51,389 INFO org.apache.hadoop.util.ProcessTree: setsid 
exited with exit code 0
2014-07-14 21:50:51,396 INFO org.apache.hadoop.mapred.Task:  Using 
ResourceCalculatorPlugin : 
org.apache.hadoop.util.LinuxResourceCalculatorPlugin@3fd95bb5
2014-07-14 21:50:51,489 INFO org.apache.hadoop.mapred.MapTask: 
Processing split: 'org.apache.giraph.bsp.BspInputSplit, index=-1, num=-1
2014-07-14 21:50:51,512 INFO org.apache.giraph.graph.GraphTaskManager: 
setup: Log level remains at info
2014-07-14 21:50:51,566 INFO org.apache.giraph.zk.ZooKeeperManager: 
createCandidateStamp: Made the directory 
_bsp/_defaultZkManagerDir/job_201407142149_0001
2014-07-14 21:50:51,578 INFO org.apache.giraph.zk.ZooKeeperManager: 
createCandidateStamp: Made the directory 
_bsp/_defaultZkManagerDir/job_201407142149_0001/_zkServer
2014-07-14 21:50:51,585 INFO org.apache.giraph.zk.ZooKeeperManager: 
createCandidateStamp: Creating my filestamp 
_bsp/_defaultZkManagerDir/job_201407142149_0001/_task/datanode1 0
2014-07-14 21:50:51,689 INFO org.apache.giraph.zk.ZooKeeperManager: 
getZooKeeperServerList: Got [datanode1] 1 hosts from 1 candidates when 1 
required (polling period is 3000) on attempt 0
2014-07-14 21:50:51,689 INFO org.apache.giraph.zk.ZooKeeperManager: 
createZooKeeperServerList: Creating the final ZooKeeper file 
'_bsp/_defaultZkManagerDir/job_201407142149_0001/zkServerList_datanode1 0 '
2014-07-14 21:50:51,742 INFO org.apache.giraph.zk.ZooKeeperManager: 
getZooKeeperServerList: For task 0, got file 'zkServerList_datanode1 0 ' 
(polling period is 3000)
2014-07-14 21:50:51,742 INFO org.apache.giraph.zk.ZooKeeperManager: 
getZooKeeperServerList: Found [datanode1, 0] 2 hosts in filename 
'zkServerList_datanode1 0 '
2014-07-14 21:50:51,744 INFO org.apache.giraph.zk.ZooKeeperManager: 
onlineZooKeeperServers: Trying to delete old directory 
/tmp/hadoop-hduser/mapred/local/taskTracker/hduser/jobcache/job_201407142149_0001/work/_bspZooKeeper
2014-07-14 21:50:51,750 INFO org.apache.giraph.zk.ZooKeeperManager: 
generateZooKeeperConfigFile: Creating file 
/tmp/hadoop-hduser/mapred/local/taskTracker/hduser/jobcache/job_201407142149_0001/work/_bspZooKeeper/zoo.cfg 
in 
/tmp/hadoop-hduser/mapred/local/taskTracker/hduser/jobcache/job_201407142149_0001/work/_bspZooKeeper 
with base port 22181
2014-07-14 21:50:51,750 INFO org.apache.giraph.zk.ZooKeeperManager: 
generateZooKeeperConfigFile: Make directory of _bspZooKeeper = true
2014-07-14 21:50:51,750 INFO org.apache.giraph.zk.ZooKeeperManager: 
generateZooKeeperConfigFile: Delete of zoo.cfg = false
2014-07-14 21:50:51,751 INFO org.apache.giraph.zk.ZooKeeperManager: 
onlineZooKeeperServers: Attempting to start ZooKeeper server with 
command [/usr/lib/jvm/jdk1.7.0_25/jre/bin/java, -cp, 
/tmp/hadoop-hduser/mapred/local/taskTracker/hduser/jobcache/job_201407142149_0001/jars/classes:/tmp/hadoop-hduser/mapred/local/taskTracker/hduser/jobcache/job_201407142149_0001/jars:/tmp/hadoop-hduser/mapred/local/taskTracker/hduser/jobcache/job_201407142149_0001/attempt_201407142149_0001_m_000000_0/work:/usr/local/hadoop-1.2.1/libexec/../conf:/usr/lib/jvm/jdk1.7.0_25/lib/tools.jar:/usr/local/hadoop-1.2.1/libexec/..:/usr/local/hadoop-1.2.1/libexec/../hadoop-core-1.2.1.jar:/usr/local/hadoop-1.2.1/libexec/../lib/asm-3.2.jar:/usr/local/hadoop-1.2.1/libexec/../lib/aspectjrt-1.6.11.jar:/usr/local/hadoop-1.2.1/libexec/../lib/aspectjtools-1.6.11.jar:/usr/local/hadoop-1.2.1/libexec/../lib/commons-beanutils-1.7.0.jar:/usr/local/hadoop-1.2.1/libexec/../lib/commons-beanutils-core-1.8.0.jar:/usr/local/hadoop-1.2.1/libexec/../lib/commons-cli-1.2.jar:/usr/local/hadoop-1.2.1/libexec/../lib/commons-codec-1.4.jar:/usr/local/hadoop-1.2.1/libexec/../lib/commons-collections-3.2.1.jar:/usr/local/hadoop-1.
2.1/libexec/../lib/commons-configuration-1.6.jar:/usr/local/hadoop-1.2.1/libexec/../lib/commons-daemon-1.0.1.jar:/usr/local/hadoop-1.2.1/libexec/../lib/commons-digester-1.8.jar:/usr/local/hadoop-1.2.1/libexec/../lib/commons-el-1.0.jar:/usr/local/hadoop-1.2.1/libexec/../lib/commons-httpclient-3.0.1.jar:/usr/local/hadoop-1.2.1/libexec/../lib/commons-io-2.1.jar:/usr/local/hadoop-1.2.1/libexec/../lib/commons-lang-2.4.jar:/usr/local/hadoop-1.2.1/libexec/../lib/commons-logging-1.1.1.jar:/usr/local/hadoop-1.2.1/libexec/../lib/commons-logging-api-1.0.4.jar:/usr/local/hadoop-1.2.1/libexec/../lib/commons-math-2.1.jar:/usr/local/hadoop-1.2.1/libexec/../lib/commons-net-3.1.jar:/usr/local/hadoop-1.2.1/libexec/../lib/core-3.1.1.jar:/usr/local/hadoop-1.2.1/libexec/../lib/hadoop-capacity-scheduler-1.2.1.jar:/usr/local/hadoop-1.2.1/libexec/../lib/hadoop-fairscheduler-1.2.1.jar:/usr/local/hadoop-1.2.1/libexec/../lib/hadoop-thriftfs-1.2.1.jar:/usr/local/hadoop-1.2.1/libexec/../lib/hsqldb-1.8.0.10.jar:/u
sr/local/hadoop-1.2.1/libexec/../lib/jackson-core-asl-1.8.8.jar:/usr/local/hadoop-1.2.1/libexec/../lib/jackson-mapper-asl-1.8.8.jar:/usr/local/hadoop-1.2.1/libexec/../lib/jasper-compiler-5.5.12.jar:/usr/local/hadoop-1.2.1/libexec/../lib/jasper-runtime-5.5.12.jar:/usr/local/hadoop-1.2.1/libexec/../lib/jdeb-0.8.jar:/usr/local/hadoop-1.2.1/libexec/../lib/jersey-core-1.8.jar:/usr/local/hadoop-1.2.1/libexec/../lib/jersey-json-1.8.jar:/usr/local/hadoop-1.2.1/libexec/../lib/jersey-server-1.8.jar:/usr/local/hadoop-1.2.1/libexec/../lib/jets3t-0.6.1.jar:/usr/local/hadoop-1.2.1/libexec/../lib/jetty-6.1.26.jar:/usr/local/hadoop-1.2.1/libexec/../lib/jetty-util-6.1.26.jar:/usr/local/hadoop-1.2.1/libexec/../lib/jsch-0.1.42.jar:/usr/local/hadoop-1.2.1/libexec/../lib/junit-4.5.jar:/usr/local/hadoop-1.2.1/libexec/../lib/kfs-0.2.2.jar:/usr/local/hadoop-1.2.1/libexec/../lib/log4j-1.2.15.jar:/usr/local/hadoop-1.2.1/libexec/../lib/mockito-all-1.8.5.jar:/usr/local/hadoop-1.2.1/libexec/../lib/oro-2.0.8.jar:/
usr/local/hadoop-1.2.1/libexec/../lib/servlet-api-2.5-20081211.jar:/usr/local/hadoop-1.2.1/libexec/../lib/slf4j-api-1.4.3.jar:/usr/local/hadoop-1.2.1/libexec/../lib/slf4j-log4j12-1.4.3.jar:/usr/local/hadoop-1.2.1/libexec/../lib/xmlenc-0.52.jar:/usr/local/hadoop-1.2.1/libexec/../lib/jsp-2.1/jsp-2.1.jar:/usr/local/hadoop-1.2.1/libexec/../lib/jsp-2.1/jsp-api-2.1.jar,

-Xmx512m, -XX:ParallelGCThreads=4, -XX:+UseConcMarkSweepGC, 
-XX:CMSInitiatingOccupancyFraction=70, -XX:MaxGCPauseMillis=100, 
org.apache.zookeeper.server.quorum.QuorumPeerMain, 
/tmp/hadoop-hduser/mapred/local/taskTracker/hduser/jobcache/job_201407142149_0001/work/_bspZooKeeper/zoo.cfg] 
in directory 
/tmp/hadoop-hduser/mapred/local/taskTracker/hduser/jobcache/job_201407142149_0001/work/_bspZooKeeper
2014-07-14 21:50:51,752 INFO org.apache.giraph.zk.ZooKeeperManager: 
onlineZooKeeperServers: Shutdown hook added.
2014-07-14 21:50:51,752 INFO org.apache.giraph.zk.ZooKeeperManager: 
onlineZooKeeperServers: Connect attempt 0 of 10 max trying to connect to 
datanode1:22181 with poll msecs = 3000
2014-07-14 21:50:51,761 INFO org.apache.giraph.zk.ZooKeeperManager: 
onlineZooKeeperServers: Connected to DataNode1/10.190.12.34:22181!
2014-07-14 21:50:51,761 INFO org.apache.giraph.zk.ZooKeeperManager: 
onlineZooKeeperServers: Creating my filestamp 
_bsp/_defaultZkManagerDir/job_201407142149_0001/_zkServer/datanode1 0
2014-07-14 21:50:51,788 INFO org.apache.giraph.graph.GraphTaskManager: 
setup: Chosen to run ZooKeeper...
2014-07-14 21:50:51,788 INFO org.apache.giraph.graph.GraphTaskManager: 
setup: Starting up BspServiceMaster (master thread)...
2014-07-14 21:50:51,795 INFO org.apache.giraph.bsp.BspService: 
BspService: Path to create to halt is 
/_hadoopBsp/job_201407142149_0001/_haltComputation
2014-07-14 21:50:51,795 INFO org.apache.giraph.bsp.BspService: 
BspService: Connecting to ZooKeeper with job job_201407142149_0001, 0 on 
datanode1:22181
2014-07-14 21:50:51,799 INFO org.apache.zookeeper.ZooKeeper: Client 
environment:zookeeper.version=3.4.5-1392090, built on 09/30/2012 17:52 GMT
2014-07-14 21:50:51,799 INFO org.apache.zookeeper.ZooKeeper: Client 
environment:host.name=DataNode1
2014-07-14 21:50:51,799 INFO org.apache.zookeeper.ZooKeeper: Client 
environment:java.version=1.7.0_25
2014-07-14 21:50:51,799 INFO org.apache.zookeeper.ZooKeeper: Client 
environment:java.vendor=Oracle Corporation
2014-07-14 21:50:51,799 INFO org.apache.zookeeper.ZooKeeper: Client 
environment:java.home=/usr/lib/jvm/jdk1.7.0_25/jre
2014-07-14 21:50:51,799 INFO org.apache.zookeeper.ZooKeeper: Client 
environment:java.class.path=/tmp/hadoop-hduser/mapred/local/taskTracker/hduser/jobcache/job_201407142149_0001/jars/classes:/tmp/hadoop-hduser/mapred/local/taskTracker/hduser/jobcache/job_201407142149_0001/jars:/tmp/hadoop-hduser/mapred/local/taskTracker/hduser/jobcache/job_201407142149_0001/attempt_201407142149_0001_m_000000_0/work:/usr/local/hadoop-1.2.1/libexec/../conf:/usr/lib/jvm/jdk1.7.0_25/lib/tools.jar:/usr/local/hadoop-1.2.1/libexec/..:/usr/local/hadoop-1.2.1/libexec/../hadoop-core-1.2.1.jar:/usr/local/hadoop-1.2.1/libexec/../lib/asm-3.2.jar:/usr/local/hadoop-1.2.1/libexec/../lib/aspectjrt-1.6.11.jar:/usr/local/hadoop-1.2.1/libexec/../lib/aspectjtools-1.6.11.jar:/usr/local/hadoop-1.2.1/libexec/../lib/commons-beanutils-1.7.0.jar:/usr/local/hadoop-1.2.1/libexec/../lib/commons-beanutils-core-1.8.0.jar:/usr/local/hadoop-1.2.1/libexec/../lib/commons-cli-1.2.jar:/usr/local/hadoop-1.2.1/libexec/../lib/commons-codec-1.4.jar:/usr/local/hadoop-1.2.1/libexec/../lib/commons-collections-3.
2.1.jar:/usr/local/hadoop-1.2.1/libexec/../lib/commons-configuration-1.6.jar:/usr/local/hadoop-1.2.1/libexec/../lib/commons-daemon-1.0.1.jar:/usr/local/hadoop-1.2.1/libexec/../lib/commons-digester-1.8.jar:/usr/local/hadoop-1.2.1/libexec/../lib/commons-el-1.0.jar:/usr/local/hadoop-1.2.1/libexec/../lib/commons-httpclient-3.0.1.jar:/usr/local/hadoop-1.2.1/libexec/../lib/commons-io-2.1.jar:/usr/local/hadoop-1.2.1/libexec/../lib/commons-lang-2.4.jar:/usr/local/hadoop-1.2.1/libexec/../lib/commons-logging-1.1.1.jar:/usr/local/hadoop-1.2.1/libexec/../lib/commons-logging-api-1.0.4.jar:/usr/local/hadoop-1.2.1/libexec/../lib/commons-math-2.1.jar:/usr/local/hadoop-1.2.1/libexec/../lib/commons-net-3.1.jar:/usr/local/hadoop-1.2.1/libexec/../lib/core-3.1.1.jar:/usr/local/hadoop-1.2.1/libexec/../lib/hadoop-capacity-scheduler-1.2.1.jar:/usr/local/hadoop-1.2.1/libexec/../lib/hadoop-fairscheduler-1.2.1.jar:/usr/local/hadoop-1.2.1/libexec/../lib/hadoop-thriftfs-1.2.1.jar:/usr/local/hadoop-1.2.1/libexec/.
./lib/hsqldb-1.8.0.10.jar:/usr/local/hadoop-1.2.1/libexec/../lib/jackson-core-asl-1.8.8.jar:/usr/local/hadoop-1.2.1/libexec/../lib/jackson-mapper-asl-1.8.8.jar:/usr/local/hadoop-1.2.1/libexec/../lib/jasper-compiler-5.5.12.jar:/usr/local/hadoop-1.2.1/libexec/../lib/jasper-runtime-5.5.12.jar:/usr/local/hadoop-1.2.1/libexec/../lib/jdeb-0.8.jar:/usr/local/hadoop-1.2.1/libexec/../lib/jersey-core-1.8.jar:/usr/local/hadoop-1.2.1/libexec/../lib/jersey-json-1.8.jar:/usr/local/hadoop-1.2.1/libexec/../lib/jersey-server-1.8.jar:/usr/local/hadoop-1.2.1/libexec/../lib/jets3t-0.6.1.jar:/usr/local/hadoop-1.2.1/libexec/../lib/jetty-6.1.26.jar:/usr/local/hadoop-1.2.1/libexec/../lib/jetty-util-6.1.26.jar:/usr/local/hadoop-1.2.1/libexec/../lib/jsch-0.1.42.jar:/usr/local/hadoop-1.2.1/libexec/../lib/junit-4.5.jar:/usr/local/hadoop-1.2.1/libexec/../lib/kfs-0.2.2.jar:/usr/local/hadoop-1.2.1/libexec/../lib/log4j-1.2.15.jar:/usr/local/hadoop-1.2.1/libexec/../lib/mockito-all-1.8.5.jar:/usr/local/hadoop-1.2.1/li
bexec/../lib/oro-2.0.8.jar:/usr/local/hadoop-1.2.1/libexec/../lib/servlet-api-2.5-20081211.jar:/usr/local/hadoop-1.2.1/libexec/../lib/slf4j-api-1.4.3.jar:/usr/local/hadoop-1.2.1/libexec/../lib/slf4j-log4j12-1.4.3.jar:/usr/local/hadoop-1.2.1/libexec/../lib/xmlenc-0.52.jar:/usr/local/hadoop-1.2.1/libexec/../lib/jsp-2.1/jsp-2.1.jar:/usr/local/hadoop-1.2.1/libexec/../lib/jsp-2.1/jsp-api-2.1.jar
2014-07-14 21:50:51,799 INFO org.apache.zookeeper.ZooKeeper: Client 
environment:java.library.path=/usr/local/hadoop-1.2.1/libexec/../lib/native/Linux-amd64-64:/tmp/hadoop-hduser/mapred/local/taskTracker/hduser/jobcache/job_201407142149_0001/attempt_201407142149_0001_m_000000_0/work
2014-07-14 21:50:51,799 INFO org.apache.zookeeper.ZooKeeper: Client 
environment:java.io.tmpdir=/tmp/hadoop-hduser/mapred/local/taskTracker/hduser/jobcache/job_201407142149_0001/attempt_201407142149_0001_m_000000_0/work/tmp
2014-07-14 21:50:51,800 INFO org.apache.zookeeper.ZooKeeper: Client 
environment:java.compiler=<NA>
2014-07-14 21:50:51,800 INFO org.apache.zookeeper.ZooKeeper: Client 
environment:os.name=Linux
2014-07-14 21:50:51,800 INFO org.apache.zookeeper.ZooKeeper: Client 
environment:os.arch=amd64
2014-07-14 21:50:51,800 INFO org.apache.zookeeper.ZooKeeper: Client 
environment:os.version=3.11.0-17-generic
2014-07-14 21:50:51,800 INFO org.apache.zookeeper.ZooKeeper: Client 
environment:user.name=hduser
2014-07-14 21:50:51,800 INFO org.apache.zookeeper.ZooKeeper: Client 
environment:user.home=/home/hduser
2014-07-14 21:50:51,800 INFO org.apache.zookeeper.ZooKeeper: Client 
environment:user.dir=/tmp/hadoop-hduser/mapred/local/taskTracker/hduser/jobcache/job_201407142149_0001/attempt_201407142149_0001_m_000000_0/work
2014-07-14 21:50:51,800 INFO org.apache.zookeeper.ZooKeeper: Initiating 
client connection, connectString=datanode1:22181 sessionTimeout=60000 
watcher=org.apache.giraph.master.BspServiceMaster@255434d8
2014-07-14 21:50:51,810 INFO org.apache.zookeeper.ClientCnxn: Opening 
socket connection to server DataNode1/10.190.12.34:22181. Will not 
attempt to authenticate using SASL (unknown error)
2014-07-14 21:50:51,810 INFO org.apache.zookeeper.ClientCnxn: Socket 
connection established to DataNode1/10.190.12.34:22181, initiating session
2014-07-14 21:50:51,888 INFO org.apache.zookeeper.ClientCnxn: Session 
establishment complete on server DataNode1/10.190.12.34:22181, 
sessionid = 0x1473633b1bd0000, negotiated timeout = 40000
2014-07-14 21:50:51,889 INFO org.apache.giraph.bsp.BspService: process: 
Asynchronous connection complete.
2014-07-14 21:50:51,892 INFO org.apache.giraph.graph.GraphTaskManager: 
map: No need to do anything when not a worker
2014-07-14 21:50:51,892 INFO org.apache.giraph.graph.GraphTaskManager: 
cleanup: Starting for MASTER_ZOOKEEPER_ONLY
2014-07-14 21:50:51,997 INFO org.apache.giraph.master.BspServiceMaster: 
becomeMaster: First child is 
'/_hadoopBsp/job_201407142149_0001/_masterElectionDir/datanode1_00000000000' 
and my bid is 
'/_hadoopBsp/job_201407142149_0001/_masterElectionDir/datanode1_00000000000'
2014-07-14 21:50:52,073 INFO org.apache.giraph.comm.netty.NettyServer: 
NettyServer: Using execution group with 8 threads for requestFrameDecoder.
2014-07-14 21:50:52,110 INFO org.apache.giraph.comm.netty.NettyServer: 
start: Started server communication server: DataNode1/10.190.12.34:30000 
with up to 16 threads on bind attempt 0 with 
sendBufferSize = 32768 receiveBufferSize = 524288
2014-07-14 21:50:52,113 INFO org.apache.giraph.comm.netty.NettyClient: 
NettyClient: Using execution handler with 8 threads after request-encoder.
2014-07-14 21:50:52,115 INFO org.apache.giraph.master.BspServiceMaster: 
becomeMaster: I am now the master!
2014-07-14 21:50:52,179 INFO org.apache.giraph.bsp.BspService: process: 
applicationAttemptChanged signaled
2014-07-14 21:50:52,280 WARN org.apache.giraph.bsp.BspService: process: 
Unknown and unprocessed event 
(path=/_hadoopBsp/job_201407142149_0001/_applicationAttemptsDir/0/_superstepDir, 
type=NodeChildrenChanged, state=SyncConnected)
2014-07-14 21:50:53,930 INFO 
org.apache.giraph.io.formats.GiraphFileInputFormat: Total input paths to 
process : 1
2014-07-14 21:50:53,939 WARN 
org.apache.hadoop.io.compress.snappy.LoadSnappy: Snappy native library 
not loaded
2014-07-14 21:50:53,940 INFO org.apache.giraph.master.BspServiceMaster: 
generateVertexInputSplits: Got 1 input splits for 1 input threads
2014-07-14 21:50:53,940 INFO org.apache.giraph.master.BspServiceMaster: 
createVertexInputSplits: Starting to write input split data to zookeeper 
with 1 threads
2014-07-14 21:50:54,003 INFO org.apache.giraph.master.BspServiceMaster: 
createVertexInputSplits: Done writing input split data to zookeeper
2014-07-14 21:50:54,052 INFO org.apache.giraph.comm.netty.NettyClient: 
Using Netty without authentication.
2014-07-14 21:50:54,056 INFO org.apache.giraph.comm.netty.NettyClient: 
connectAllAddresses: Successfully added 1 connections, (1 total 
connected) 0 failed, 0 failures total.
2014-07-14 21:50:54,057 INFO org.apache.giraph.partition.PartitionUtils: 
computePartitionCount: Creating 1, default would have been 1 partitions.
2014-07-14 21:50:54,150 INFO org.apache.giraph.master.BspServiceMaster: 
barrierOnWorkerList: 0 out of 1 workers finished on superstep -1 on path 
/_hadoopBsp/job_201407142149_0001/_vertexInputSplitDoneDir
2014-07-14 21:50:54,151 INFO org.apache.giraph.master.BspServiceMaster: 
barrierOnWorkerList: Waiting on [datanode2_1]
2014-07-14 21:50:54,156 INFO org.apache.giraph.comm.netty.NettyServer: 
start: Using Netty without authentication.
2014-07-14 21:50:54,306 INFO 
org.apache.giraph.comm.netty.handler.RequestDecoder: decode: Server 
window metrics MBytes/sec received = 0, MBytesReceived = 0, ave received 
req MBytes = 0, secs waited = 1.40536384E9
2014-07-14 21:50:54,327 INFO org.apache.giraph.master.BspServiceMaster: 
barrierOnWorkerList: 1 out of 1 workers finished on superstep -1 on path 
/_hadoopBsp/job_201407142149_0001/_applicationAttemptsDir/0/_superstepDir/-1/_workerFinishedDir
2014-07-14 21:50:54,327 INFO org.apache.giraph.master.BspServiceMaster: 
barrierOnWorkerList: Waiting on []
2014-07-14 21:50:54,333 INFO org.apache.giraph.master.BspServiceMaster: 
aggregateWorkerStats: Aggregation found 
(vtx=5,finVtx=0,edges=12,msgCount=0,msgBytesCount=0,haltComputation=false) 
on superstep = -1
2014-07-14 21:50:54,348 INFO org.apache.giraph.master.MasterThread: 
masterThread: Coordination of superstep -1 took 0.332 seconds ended with 
state THIS_SUPERSTEP_DONE and is now on superstep 0
2014-07-14 21:50:54,603 INFO org.apache.giraph.comm.netty.NettyClient: 
connectAllAddresses: Successfully added 0 connections, (0 total 
connected) 0 failed, 0 failures total.
2014-07-14 21:50:54,604 INFO 
org.apache.giraph.partition.PartitionBalancer: 
balancePartitionsAcrossWorkers: Using algorithm static
2014-07-14 21:50:54,604 INFO org.apache.giraph.partition.PartitionUtils: 
analyzePartitionStats: [Worker(hostname=datanode2, MRtaskID=1, 
port=30001):(v=5, e=12),]
2014-07-14 21:50:54,605 INFO org.apache.giraph.partition.PartitionUtils: 
analyzePartitionStats: Vertices - Mean: 5, Min: 
Worker(hostname=datanode2, MRtaskID=1, port=30001) - 5, Max: 
Worker(hostname=datanode2, MRtaskID=1, port=30001) - 5
2014-07-14 21:50:54,605 INFO org.apache.giraph.partition.PartitionUtils: 
analyzePartitionStats: Edges - Mean: 12, Min: Worker(hostname=datanode2, 
MRtaskID=1, port=30001) - 12, Max: Worker(hostname=datanode2, 
MRtaskID=1, port=30001) - 12
2014-07-14 21:50:54,736 INFO org.apache.giraph.master.BspServiceMaster: 
barrierOnWorkerList: 1 out of 1 workers finished on superstep 0 on path 
/_hadoopBsp/job_201407142149_0001/_applicationAttemptsDir/0/_superstepDir/0/_workerFinishedDir
2014-07-14 21:50:54,736 INFO org.apache.giraph.master.BspServiceMaster: 
barrierOnWorkerList: Waiting on []
2014-07-14 21:50:54,740 INFO org.apache.giraph.master.BspServiceMaster: 
aggregateWorkerStats: Aggregation found 
(vtx=5,finVtx=5,edges=12,msgCount=3,msgBytesCount=73,haltComputation=false) 
on superstep = 0
2014-07-14 21:50:54,758 INFO org.apache.giraph.master.MasterThread: 
masterThread: Coordination of superstep 0 took 0.41 seconds ended with 
state THIS_SUPERSTEP_DONE and is now on superstep 1
2014-07-14 21:50:55,013 INFO org.apache.giraph.comm.netty.NettyClient: 
connectAllAddresses: Successfully added 0 connections, (0 total 
connected) 0 failed, 0 failures total.
2014-07-14 21:50:55,013 INFO 
org.apache.giraph.partition.PartitionBalancer: 
balancePartitionsAcrossWorkers: Using algorithm static
2014-07-14 21:50:55,013 INFO org.apache.giraph.partition.PartitionUtils: 
analyzePartitionStats: [Worker(hostname=datanode2, MRtaskID=1, 
port=30001):(v=5, e=12),]
2014-07-14 21:50:55,013 INFO org.apache.giraph.partition.PartitionUtils: 
analyzePartitionStats: Vertices - Mean: 5, Min: 
Worker(hostname=datanode2, MRtaskID=1, port=30001) - 5, Max: 
Worker(hostname=datanode2, MRtaskID=1, port=30001) - 5
2014-07-14 21:50:55,013 INFO org.apache.giraph.partition.PartitionUtils: 
analyzePartitionStats: Edges - Mean: 12, Min: Worker(hostname=datanode2, 
MRtaskID=1, port=30001) - 12, Max: Worker(hostname=datanode2, 
MRtaskID=1, port=30001) - 12
2014-07-14 21:50:55,106 ERROR org.apache.giraph.master.BspServiceMaster: 
superstepChosenWorkerAlive: Missing chosen worker 
Worker(hostname=datanode2, MRtaskID=1, port=30001) on superstep 1
2014-07-14 21:50:55,106 INFO org.apache.giraph.master.MasterThread: 
masterThread: Coordination of superstep 1 took 0.348 seconds ended with 
state WORKER_FAILURE and is now on superstep 1
2014-07-14 21:50:55,110 ERROR org.apache.giraph.master.MasterThread: 
masterThread: Master algorithm failed with ArrayIndexOutOfBoundsException
java.lang.ArrayIndexOutOfBoundsException: -1
         at 
org.apache.giraph.master.BspServiceMaster.getLastGoodCheckpoint(BspServiceMaster.java:1284)
         at org.apache.giraph.master.MasterThread.run(MasterThread.java:148)
2014-07-14 21:50:55,114 FATAL org.apache.giraph.graph.GraphMapper: 
uncaughtException: OverrideExceptionHandler on thread 
org.apache.giraph.master.MasterThread, msg = 
java.lang.ArrayIndexOutOfBoundsException: -1, exiting...
java.lang.IllegalStateException: 
java.lang.ArrayIndexOutOfBoundsException: -1
         at org.apache.giraph.master.MasterThread.run(MasterThread.java:194)
Caused by: java.lang.ArrayIndexOutOfBoundsException: -1
         at 
org.apache.giraph.master.BspServiceMaster.getLastGoodCheckpoint(BspServiceMaster.java:1284)
         at org.apache.giraph.master.MasterThread.run(MasterThread.java:148)
2014-07-14 21:50:55,116 INFO org.apache.giraph.zk.ZooKeeperManager: run: 
Shutdown hook started.
2014-07-14 21:50:55,116 WARN org.apache.giraph.zk.ZooKeeperManager: 
onlineZooKeeperServers: Forced a shutdown hook kill of the ZooKeeper 
process.
2014-07-14 21:50:55,116 INFO org.apache.giraph.zk.ZooKeeperManager: 
onlineZooKeeperServers: ZooKeeper process exited with 1 (note that 143 
typically means killed).
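
Since most of the log above is INFO output, here is a minimal sketch for pulling out only the WARN, ERROR and FATAL records. It assumes the log4j-style layout seen above (timestamp, then level, with long messages wrapped onto following lines); the function name `problem_records` is made up for illustration and is not part of Giraph or Hadoop.

```python
import re

# Assumption: each record starts with "YYYY-MM-DD HH:MM:SS,mmm LEVEL",
# as in the task log above; any other line continues the previous record.
RECORD_START = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) (\w+)\s*(.*)$"
)


def problem_records(lines, levels=("WARN", "ERROR", "FATAL")):
    """Rejoin wrapped log lines and return the records at the given levels."""
    records = []
    current = None
    for raw in lines:
        line = raw.rstrip("\n")
        match = RECORD_START.match(line)
        if match:
            # A new record begins; remember it so wrapped lines can be appended.
            current = {
                "time": match.group(1),
                "level": match.group(2),
                "text": match.group(3).strip(),
            }
            records.append(current)
        elif current is not None and line.strip():
            # Continuation line (wrapped message or stack trace frame).
            current["text"] = (current["text"] + " " + line.strip()).strip()
    return [r for r in records if r["level"] in levels]


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python filter_log.py syslog
    with open(sys.argv[1]) as handle:
        for record in problem_records(handle):
            print(record["time"], record["level"], record["text"])
```

Run against the log above, this should leave only a handful of records, with the first ERROR being the "Missing chosen worker" message on superstep 1.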

Also, this is the zoo.cfg file I use on the cluster machines:

# The number of milliseconds of each tick
tickTime=2000

# The number of ticks that the initial
# synchronization phase can take
initLimit=10

# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=/usr/local/zookeeper/zookeeper-3.4.5
# the port at which the clients will connect
clientPort=22181
#
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
autopurge.snapRetainCount=3
#
# Purge task interval in hours
# Set to "0" to disable auto purge feature
autopurge.purgeInterval=1
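
The log shows Giraph generating its own zoo.cfg in the job work directory with base port 22181, and the file above also sets clientPort=22181. For comparing the two files, here is a minimal sketch that reads a zoo.cfg into a dictionary. It only handles the simple `key=value` and `#` comment lines that appear in the file above; the function name `parse_zoo_cfg` is made up for illustration.

```python
def parse_zoo_cfg(text):
    """Return the key=value settings of a zoo.cfg, ignoring comments and blanks."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:  # skip lines that do not contain '='
            settings[key.strip()] = value.strip()
    return settings


if __name__ == "__main__":
    import sys

    # Usage (hypothetical paths): python zoo_cfg.py cluster/zoo.cfg job/zoo.cfg
    parsed = []
    for path in sys.argv[1:]:
        with open(path) as handle:
            parsed.append((path, parse_zoo_cfg(handle.read())))
    for path, settings in parsed:
        print(path, "clientPort =", settings.get("clientPort"))
```

All values are kept as strings, so a port comparison is a plain string comparison such as `settings.get("clientPort") == "22181"`.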




Thanks
