giraph-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "blivme84.gmail" <blivm...@gmail.com>
Subject Giraph is waiting other worker and not working after superstep
Date Wed, 22 Jan 2014 05:41:07 GMT
I try to use Giraph on fully distributed hadoop eco system.
But when I executed SimpleShortestPathsComputation example, but master is waiting infinitely
at superstep -1.

This is my logs at master.

2014-01-21 04:19:45,255 DEBUG org.apache.giraph.bsp.BspService: process: Got a new event,
path = /_hadoopBsp/job_201401131006_0043/_vertexInputSplitDoneDir, type = NodeChild
2014-01-21 04:19:45,255 DEBUG org.apache.giraph.bsp.BspService: process: vertexInputSplitsDoneStateChanged
(worker finished sending)
2014-01-21 04:19:45,255 DEBUG org.apache.giraph.zk.PredicateLock: waitMsecs: Got timed signaled
of true
2014-01-21 04:19:45,256 DEBUG org.apache.zookeeper.ClientCnxn: Reading reply sessionid:0x3434162634304d1,
packet:: clientPath:null serverPath:null finished:false header:: 69
2014-01-21 04:19:45,257 DEBUG org.apache.giraph.master.BspServiceMaster: barrierOnWorkerList:
Got finished worker list = [CCLab33_3, CCLab35_1, CCLab39_2], size = 3, worker
2014-01-21 04:19:45,257 DEBUG org.apache.giraph.zk.ZooKeeperExt: createExt: Creating path
/_hadoopBsp/job_201401131006_0043/_vertexInputSplitsAllDone
2014-01-21 04:19:45,268 DEBUG org.apache.zookeeper.ClientCnxn: Reading reply sessionid:0x3434162634304d1,
packet:: clientPath:null serverPath:null finished:false header:: 70
2014-01-21 04:19:45,268 DEBUG org.apache.giraph.zk.ZooKeeperExt: createExt: Creating path
/_hadoopBsp/job_201401131006_0043/_applicationAttemptsDir/0/_superstepDir/-1/_worke
2014-01-21 04:19:45,276 DEBUG org.apache.zookeeper.ClientCnxn: Reading reply sessionid:0x3434162634304d1,
packet:: clientPath:null serverPath:null finished:false header:: 71
2014-01-21 04:19:45,278 DEBUG org.apache.zookeeper.ClientCnxn: Reading reply sessionid:0x3434162634304d1,
packet:: clientPath:null serverPath:null finished:false header:: 72
2014-01-21 04:19:45,278 DEBUG org.apache.giraph.master.BspServiceMaster: barrierOnWorkerList:
Got finished worker list = [], size = 0, worker list = [Worker(hostname=CCLab33
2014-01-21 04:19:45,278 INFO org.apache.giraph.master.BspServiceMaster: barrierOnWorkerList:
0 out of 3 workers finished on superstep -1 on path /_hadoopBsp/job_201401131006
2014-01-21 04:19:45,278 INFO org.apache.giraph.master.BspServiceMaster: barrierOnWorkerList:
Waiting on [CCLab35_1, CCLab39_2, CCLab33_3]
2014-01-21 04:19:45,278 DEBUG org.apache.giraph.zk.PredicateLock: waitMsecs: Wait for 10000
2014-01-21 04:19:45,281 DEBUG org.apache.giraph.comm.netty.WrappedAdaptiveReceiveBufferSizePredictorFactory$WrappedAdaptiveReceiveBufferSizePredictor:
nextReceiveBufferSize:
2014-01-21 04:19:45,281 DEBUG org.apache.giraph.comm.netty.WrappedAdaptiveReceiveBufferSizePredictorFactory$WrappedAdaptiveReceiveBufferSizePredictor:
previousReceiveBufferS
2014-01-21 04:19:45,281 DEBUG org.apache.giraph.comm.netty.ByteCounter: handleUpstream: serverByteCounter
buffer size = 25, total bytes = 25
2014-01-21 04:19:45,283 INFO org.apache.giraph.comm.netty.handler.RequestDecoder: decode:
Server window metrics MBytes/sec sent = 0, MBytes/sec received = 0, MBytesSent = 0,
2014-01-21 04:19:45,283 DEBUG org.apache.giraph.comm.netty.handler.RequestDecoder: decode:
Client 2, requestId 0, SEND_AGGREGATORS_TO_MASTER_REQUEST, with size 21 took 23833
2014-01-21 04:19:45,283 TRACE org.apache.giraph.comm.netty.handler.RequestServerHandler: messageReceived:
Got class org.apache.giraph.comm.requests.SendAggregatorsToMasterRe
2014-01-21 04:19:45,284 DEBUG org.apache.giraph.master.MasterAggregatorHandler: acceptAggregatedValues:
Accepted one set with 0 aggregated values
2014-01-21 04:19:45,284 DEBUG org.apache.giraph.comm.netty.handler.RequestServerHandler: messageReceived:
Processing client 2, requestId 0, SEND_AGGREGATORS_TO_MASTER_REQUES
2014-01-21 04:19:45,286 DEBUG org.apache.giraph.comm.netty.ByteCounter: handleDownstream:
serverByteCounter buffer size = 13, total bytes = 13
2014-01-21 04:19:45,291 DEBUG org.apache.giraph.comm.netty.ByteCounter: handleUpstream: serverByteCounter
buffer size = 25, total bytes = 25
2014-01-21 04:19:45,291 DEBUG org.apache.giraph.comm.netty.handler.RequestDecoder: decode:
Client 1, requestId 0, SEND_AGGREGATORS_TO_MASTER_REQUEST, with size 21 took 53916
2014-01-21 04:19:45,291 DEBUG org.apache.giraph.comm.netty.ByteCounter: handleUpstream: serverByteCounter
buffer size = 25, total bytes = 50
2014-01-21 04:19:45,291 TRACE org.apache.giraph.comm.netty.handler.RequestServerHandler: messageReceived:
Got class org.apache.giraph.comm.requests.SendAggregatorsToMasterRe
2014-01-21 04:19:45,291 DEBUG org.apache.giraph.master.MasterAggregatorHandler: acceptAggregatedValues:
Accepted one set with 0 aggregated values
2014-01-21 04:19:45,291 DEBUG org.apache.giraph.comm.netty.handler.RequestDecoder: decode:
Client 3, requestId 0, SEND_AGGREGATORS_TO_MASTER_REQUEST, with size 21 took 52829
2014-01-21 04:19:45,292 DEBUG org.apache.giraph.comm.netty.handler.RequestServerHandler: messageReceived:
Processing client 1, requestId 0, SEND_AGGREGATORS_TO_MASTER_REQUES
2014-01-21 04:19:45,292 TRACE org.apache.giraph.comm.netty.handler.RequestServerHandler: messageReceived:
Got class org.apache.giraph.comm.requests.SendAggregatorsToMasterRe
2014-01-21 04:19:45,292 DEBUG org.apache.giraph.comm.netty.ByteCounter: handleDownstream:
serverByteCounter buffer size = 13, total bytes = 26
2014-01-21 04:19:45,292 DEBUG org.apache.giraph.master.MasterAggregatorHandler: acceptAggregatedValues:
Accepted one set with 0 aggregated values
2014-01-21 04:19:45,292 DEBUG org.apache.giraph.comm.netty.handler.RequestServerHandler: messageReceived:
Processing client 3, requestId 0, SEND_AGGREGATORS_TO_MASTER_REQUES
2014-01-21 04:19:45,292 DEBUG org.apache.giraph.comm.netty.ByteCounter: handleDownstream:
serverByteCounter buffer size = 13, total bytes = 39
2014-01-21 04:19:45,309 DEBUG org.apache.zookeeper.ClientCnxn: Got notification sessionid:0x3434162634304d1
2014-01-21 04:19:45,309 DEBUG org.apache.zookeeper.ClientCnxn: Got WatchedEvent state:SyncConnected
type:NodeChildrenChanged path:/_hadoopBsp/job_201401131006_0043/_applicat
2014-01-21 04:19:45,309 DEBUG org.apache.giraph.bsp.BspService: process: Got a new event,
path = /_hadoopBsp/job_201401131006_0043/_applicationAttemptsDir/0/_superstepDir/-1
2014-01-21 04:19:45,309 DEBUG org.apache.giraph.master.BspServiceMaster: processEvent: Worker
finished (node change) event - superstepStateChanged signaled
2014-01-21 04:19:45,310 DEBUG org.apache.giraph.zk.PredicateLock: waitMsecs: Got timed signaled
of true
2014-01-21 04:19:45,318 DEBUG org.apache.zookeeper.ClientCnxn: Reading reply sessionid:0x3434162634304d1,
packet:: clientPath:null serverPath:null finished:false header:: 73
2014-01-21 04:19:45,319 DEBUG org.apache.giraph.master.BspServiceMaster: barrierOnWorkerList:
Got finished worker list = [cclab33_3, cclab35_1, cclab39_2], size = 3, worker
2014-01-21 04:19:45,319 DEBUG org.apache.giraph.zk.PredicateLock: waitMsecs: Wait for 10000
2014-01-21 04:19:45,680 DEBUG org.apache.hadoop.mapred.Counters: Creating group FileSystemCounters
with nothing
2014-01-21 04:19:45,680 DEBUG org.apache.hadoop.mapred.Counters: Adding HDFS_BYTES_READ
2014-01-21 04:19:45,680 DEBUG org.apache.hadoop.mapred.Counters: Adding FILE_BYTES_WRITTEN
2014-01-21 04:19:45,680 DEBUG org.apache.hadoop.mapred.Counters: Adding COMMITTED_HEAP_BYTES
2014-01-21 04:19:45,714 DEBUG org.apache.hadoop.util.ProcfsBasedProcessTree: [ 18302 ]
2014-01-21 04:19:45,714 DEBUG org.apache.hadoop.mapred.Counters: Adding CPU_MILLISECONDS
2014-01-21 04:19:45,714 DEBUG org.apache.hadoop.mapred.Counters: Adding PHYSICAL_MEMORY_BYTES
2014-01-21 04:19:45,714 DEBUG org.apache.hadoop.mapred.Counters: Adding VIRTUAL_MEMORY_BYTES
2014-01-21 04:19:45,714 DEBUG org.apache.hadoop.ipc.Client: IPC Client (47) connection to
/127.0.0.1:54142 from job_201401131006_0043 sending #7
2014-01-21 04:19:45,719 DEBUG org.apache.hadoop.ipc.Client: IPC Client (47) connection to
/127.0.0.1:54142 from job_201401131006_0043 got value #7
2014-01-21 04:19:45,719 DEBUG org.apache.hadoop.ipc.RPC: Call: statusUpdate 5
2014-01-21 04:19:48,751 DEBUG org.apache.hadoop.util.ProcfsBasedProcessTree: [ 18302 ]
2014-01-21 04:19:48,751 DEBUG org.apache.hadoop.ipc.Client: IPC Client (47) connection to
/127.0.0.1:54142 from job_201401131006_0043 sending #8
2014-01-21 04:19:48,753 DEBUG org.apache.hadoop.ipc.Client: IPC Client (47) connection to
/127.0.0.1:54142 from job_201401131006_0043 got value #8
2014-01-21 04:19:48,753 DEBUG org.apache.hadoop.ipc.RPC: Call: statusUpdate 2
2014-01-21 04:19:51,753 DEBUG org.apache.hadoop.ipc.Client: IPC Client (47) connection to
/127.0.0.1:54142 from job_201401131006_0043 sending #9
2014-01-21 04:19:51,754 DEBUG org.apache.hadoop.ipc.Client: IPC Client (47) connection to
/127.0.0.1:54142 from job_201401131006_0043 got value #9
2014-01-21 04:19:51,755 DEBUG org.apache.hadoop.ipc.RPC: Call: ping 2
2014-01-21 04:19:54,755 DEBUG org.apache.hadoop.ipc.Client: IPC Client (47) connection to
/127.0.0.1:54142 from job_201401131006_0043 sending #10
2014-01-21 04:19:54,756 DEBUG org.apache.hadoop.ipc.Client: IPC Client (47) connection to
/127.0.0.1:54142 from job_201401131006_0043 got value #10
2014-01-21 04:19:54,756 DEBUG org.apache.hadoop.ipc.RPC: Call: ping 1
2014-01-21 04:19:54,905 DEBUG org.apache.hadoop.ipc.Client: IPC Client (47) connection to
cclab48/10.0.0.48:8020 from root: closed
2014-01-21 04:19:54,905 DEBUG org.apache.hadoop.ipc.Client: IPC Client (47) connection to
cclab48/10.0.0.48:8020 from root: stopped, remaining connections 1
2014-01-21 04:19:55,319 DEBUG org.apache.giraph.zk.PredicateLock: waitMsecs: Got timed signaled
of false
2014-01-21 04:19:55,319 DEBUG org.apache.giraph.zk.PredicateLock: waitMsecs: Wait for 10000

And this is my worker’s log.

2014-01-21 04:19:45,270 DEBUG org.apache.giraph.comm.netty.ByteCounter: handleUpstream: clientByteCounter
buffer size = 13, total bytes = 13
2014-01-21 04:19:45,270 DEBUG org.apache.giraph.comm.netty.ByteCounter: handleUpstream: serverByteCounter
buffer size = 63, total bytes = 483
2014-01-21 04:19:45,270 DEBUG org.apache.giraph.comm.netty.ByteCounter: handleUpstream: clientByteCounter
buffer size = 13, total bytes = 26
2014-01-21 04:19:45,271 DEBUG org.apache.giraph.comm.netty.handler.ResponseClientHandler:
messageReceived: Completed (reqId=1,destAddr=CCLab33:30003,elapsedNanos=2325954,sta
2014-01-21 04:19:45,271 DEBUG org.apache.giraph.comm.netty.handler.RequestDecoder: decode:
Client 2, requestId 1, SEND_WORKER_AGGREGATORS_REQUEST, with size 59 took 56302 ns
2014-01-21 04:19:45,271 TRACE org.apache.giraph.comm.netty.handler.RequestServerHandler: messageReceived:
Got class org.apache.giraph.comm.requests.SendWorkerAggregatorsRequ
2014-01-21 04:19:45,271 DEBUG org.apache.giraph.comm.netty.handler.ResponseClientHandler:
messageReceived: Completed (reqId=1,destAddr=CCLab39:30002,elapsedNanos=2636704,sta
2014-01-21 04:19:45,279 DEBUG org.apache.giraph.comm.netty.handler.RequestServerHandler: messageReceived:
Processing client 2, requestId 1, SEND_WORKER_AGGREGATORS_REQUEST t
2014-01-21 04:19:45,279 DEBUG org.apache.giraph.comm.aggregators.OwnerAggregatorServerData:
getMyAggregatorValuesWhenReady: Values ready
2014-01-21 04:19:45,279 DEBUG org.apache.giraph.comm.netty.ByteCounter: handleDownstream:
serverByteCounter buffer size = 13, total bytes = 78
2014-01-21 04:19:45,283 DEBUG org.apache.giraph.comm.netty.handler.RequestEncoder: encode:
Client 1, requestId 0, size = 25, SEND_AGGREGATORS_TO_MASTER_REQUEST took 27840 ns
2014-01-21 04:19:45,283 DEBUG org.apache.giraph.comm.netty.ByteCounter: handleDownstream:
clientByteCounter buffer size = 25, total bytes = 25
2014-01-21 04:19:45,285 DEBUG org.apache.giraph.comm.netty.ByteCounter: handleUpstream: clientByteCounter
buffer size = 13, total bytes = 13
2014-01-21 04:19:45,285 DEBUG org.apache.giraph.comm.netty.handler.ResponseClientHandler:
messageReceived: Completed (reqId=0,destAddr=CCLab07:30000,elapsedNanos=2577026,sta
2014-01-21 04:19:45,286 INFO org.apache.giraph.comm.netty.NettyClient: waitAllRequests: Finished
all requests. MBytes/sec sent = 0.0048, MBytes/sec received = 0.0025, MBytes
2014-01-21 04:19:45,286 DEBUG org.apache.giraph.comm.aggregators.OwnerAggregatorServerData:
reset: Ready for next superstep
2014-01-21 04:19:45,286 DEBUG org.apache.giraph.worker.WorkerAggregatorHandler: finishSuperstep:
Aggregators finished
2014-01-21 04:19:45,287 INFO org.apache.giraph.worker.BspServiceWorker: finishSuperstep: Superstep
-1, messages = 0 , message bytes = 0 , Memory (free/total/max) = 222.42M /
2014-01-21 04:19:45,298 DEBUG org.apache.giraph.zk.ZooKeeperExt: createExt: Creating path
/_hadoopBsp/job_201401131006_0043/_applicationAttemptsDir/0/_superstepDir/-1/_worke
2014-01-21 04:19:45,312 DEBUG org.apache.zookeeper.ClientCnxn: Reading reply sessionid:0x3434162634304d2,
packet:: clientPath:null serverPath:null finished:false header:: 19
2014-01-21 04:19:45,312 INFO org.apache.giraph.worker.BspServiceWorker: finishSuperstep: (waiting
for rest of workers) WORKER_ONLY - Attempt=0, Superstep=-1
2014-01-21 04:19:45,314 DEBUG org.apache.zookeeper.ClientCnxn: Reading reply sessionid:0x3434162634304d2,
packet:: clientPath:null serverPath:null finished:false header:: 20
2014-01-21 04:19:45,314 DEBUG org.apache.giraph.zk.PredicateLock: waitMsecs: Wait for 10000
2014-01-21 04:19:47,966 DEBUG org.apache.hadoop.mapred.Counters: Creating group FileSystemCounters
with nothing
2014-01-21 04:19:47,966 DEBUG org.apache.hadoop.mapred.Counters: Adding HDFS_BYTES_READ
2014-01-21 04:19:47,966 DEBUG org.apache.hadoop.mapred.Counters: Adding FILE_BYTES_WRITTEN
2014-01-21 04:19:47,967 DEBUG org.apache.hadoop.mapred.Counters: Adding COMMITTED_HEAP_BYTES
2014-01-21 04:19:48,002 DEBUG org.apache.hadoop.util.ProcfsBasedProcessTree: [ 16391 ]
2014-01-21 04:19:48,002 DEBUG org.apache.hadoop.mapred.Counters: Adding CPU_MILLISECONDS
2014-01-21 04:19:48,003 DEBUG org.apache.hadoop.mapred.Counters: Adding PHYSICAL_MEMORY_BYTES
2014-01-21 04:19:48,003 DEBUG org.apache.hadoop.mapred.Counters: Adding VIRTUAL_MEMORY_BYTES
2014-01-21 04:19:48,003 DEBUG org.apache.hadoop.ipc.Client: IPC Client (47) connection to
/127.0.0.1:55945 from job_201401131006_0043 sending #5
2014-01-21 04:19:48,006 DEBUG org.apache.hadoop.ipc.Client: IPC Client (47) connection to
/127.0.0.1:55945 from job_201401131006_0043 got value #5
2014-01-21 04:19:48,007 DEBUG org.apache.hadoop.ipc.RPC: Call: statusUpdate 4
2014-01-21 04:19:51,037 DEBUG org.apache.hadoop.util.ProcfsBasedProcessTree: [ 16391 ]
2014-01-21 04:19:51,037 DEBUG org.apache.hadoop.ipc.Client: IPC Client (47) connection to
/127.0.0.1:55945 from job_201401131006_0043 sending #6
2014-01-21 04:19:51,038 DEBUG org.apache.hadoop.ipc.Client: IPC Client (47) connection to
/127.0.0.1:55945 from job_201401131006_0043 got value #6
2014-01-21 04:19:51,039 DEBUG org.apache.hadoop.ipc.RPC: Call: statusUpdate 2

I printed log in DEBUG level, but I don’t know why.

I use 
Hadoop 1.0.4
Zookeeper 3.4.5
And this is my execution.

hadoop jar $GIRAPH_HOME/giraph-examples/target/giraph-examples-1.1.0-SNAPSHOT-for-hadoop-1.0.2-jar-with-dependencies.jar
org.apache.giraph.GiraphRunner -D giraph.zkList="cclab48:2181,cclab47:2181,cclab46:2181”
-D giraph.logLevel=“all" org.apache.giraph.examples.SimpleShortestPathsComputation -vif
org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat -vip giraphtest/input/tiny_graph.txt
-vof org.apache.giraph.io.formats.IdWithValueTextOutputFormat -op giraphtest/output/shortestpaths2
-w 3


Lee Taekyu
Computation & Communication Lab
School of Integrated Technology 
Yonsei University
Mime
View raw message