incubator-giraph-dev mailing list archives

From "Jianfeng Qian (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (GIRAPH-169) How to close all child when a job finished?
Date Wed, 28 Mar 2012 06:02:40 GMT

    [ https://issues.apache.org/jira/browse/GIRAPH-169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13240216#comment-13240216
] 

Jianfeng Qian commented on GIRAPH-169:
--------------------------------------

The master mapper's log
2012-03-28 10:27:16,286 WARN org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop
library for your platform... using builtin-java classes where applicable
2012-03-28 10:27:16,551 WARN org.apache.giraph.bsp.BspOutputFormat: getOutputCommitter: Returning
ImmutableOutputCommiter (does nothing).
2012-03-28 10:27:16,562 INFO org.apache.hadoop.util.ProcessTree: setsid exited with exit code
0
2012-03-28 10:27:16,570 INFO org.apache.hadoop.mapred.Task:  Using ResourceCalculatorPlugin
: org.apache.hadoop.util.LinuxResourceCalculatorPlugin@4bf54c5f
2012-03-28 10:27:16,662 INFO org.apache.giraph.graph.GraphMapper: Distributed cache is empty.
Assuming fatjar.
2012-03-28 10:27:16,662 INFO org.apache.giraph.graph.GraphMapper: setup: classpath @ /usr/local/test-0302/hadoop-data/h-0.20.205/mapred/local/taskTracker/root/jobcache/job_201203281017_0001/jars/job.jar
2012-03-28 10:27:16,678 INFO org.apache.giraph.zk.ZooKeeperManager: createCandidateStamp:
Made the directory _bsp/_defaultZkManagerDir/job_201203281017_0001
2012-03-28 10:27:16,681 INFO org.apache.giraph.zk.ZooKeeperManager: createCandidateStamp:
Creating my filestamp _bsp/_defaultZkManagerDir/job_201203281017_0001/_task/tmm-e10 0
2012-03-28 10:27:17,491 INFO org.apache.giraph.zk.ZooKeeperManager: getZooKeeperServerList:
Got [tmm-e10] 1 hosts from 1 candidates when 1 required (polling period is 3000) on attempt
0
2012-03-28 10:27:17,491 INFO org.apache.giraph.zk.ZooKeeperManager: createZooKeeperServerList:
Creating the final ZooKeeper file '_bsp/_defaultZkManagerDir/job_201203281017_0001/zkServerList_tmm-e10
0 '
2012-03-28 10:27:17,509 INFO org.apache.giraph.zk.ZooKeeperManager: getZooKeeperServerList:
For task 0, got file 'zkServerList_tmm-e10 0 ' (polling period is 3000)
2012-03-28 10:27:17,509 INFO org.apache.giraph.zk.ZooKeeperManager: getZooKeeperServerList:
Found [tmm-e10, 0] 2 hosts in filename 'zkServerList_tmm-e10 0 '
2012-03-28 10:27:17,510 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers:
Trying to delete old directory /usr/local/test-0302/hadoop-data/h-0.20.205/mapred/local/taskTracker/root/jobcache/job_201203281017_0001/work/_bspZooKeeper
2012-03-28 10:27:17,512 INFO org.apache.giraph.zk.ZooKeeperManager: generateZooKeeperConfigFile:
Creating file /usr/local/test-0302/hadoop-data/h-0.20.205/mapred/local/taskTracker/root/jobcache/job_201203281017_0001/work/_bspZooKeeper/zoo.cfg
in /usr/local/test-0302/hadoop-data/h-0.20.205/mapred/local/taskTracker/root/jobcache/job_201203281017_0001/work/_bspZooKeeper
with base port 22181
2012-03-28 10:27:17,513 INFO org.apache.giraph.zk.ZooKeeperManager: generateZooKeeperConfigFile:
Make directory of _bspZooKeeper = true
2012-03-28 10:27:17,513 INFO org.apache.giraph.zk.ZooKeeperManager: generateZooKeeperConfigFile:
Delete of zoo.cfg = false
2012-03-28 10:27:17,518 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers:
Attempting to start ZooKeeper server with command [/usr/local/java/jdk1.6.0_22/jre/bin/java,
-Xmx512m, -XX:ParallelGCThreads=4, -XX:+UseConcMarkSweepGC, -XX:CMSInitiatingOccupancyFraction=70,
-XX:MaxGCPauseMillis=100, -cp, /usr/local/test-0302/hadoop-data/h-0.20.205/mapred/local/taskTracker/root/jobcache/job_201203281017_0001/jars/job.jar,
org.apache.zookeeper.server.quorum.QuorumPeerMain, /usr/local/test-0302/hadoop-data/h-0.20.205/mapred/local/taskTracker/root/jobcache/job_201203281017_0001/work/_bspZooKeeper/zoo.cfg]
in directory /usr/local/test-0302/hadoop-data/h-0.20.205/mapred/local/taskTracker/root/jobcache/job_201203281017_0001/work/_bspZooKeeper
2012-03-28 10:27:17,522 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers:
Connect attempt 0 of 10 max trying to connect to tmm-e10:22181 with poll msecs = 3000
2012-03-28 10:27:17,532 WARN org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers:
Got ConnectException
java.net.ConnectException: Connection refused
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
        at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
        at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
        at java.net.Socket.connect(Socket.java:529)
        at org.apache.giraph.zk.ZooKeeperManager.onlineZooKeeperServers(ZooKeeperManager.java:661)
        at org.apache.giraph.graph.GraphMapper.setup(GraphMapper.java:425)
        at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:646)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
        at org.apache.hadoop.mapred.Child.main(Child.java:249)
2012-03-28 10:27:20,533 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers:
Connect attempt 1 of 10 max trying to connect to tmm-e10:22181 with poll msecs = 3000
2012-03-28 10:27:20,533 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers:
Connected to tmm-e10/2.1.1.130:22181!
2012-03-28 10:27:20,533 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers:
Creating my filestamp _bsp/_defaultZkManagerDir/job_201203281017_0001/_zkServer/tmm-e10 0
2012-03-28 10:27:20,550 INFO org.apache.giraph.graph.GraphMapper: setup: Starting up BspServiceMaster
(master thread)...
2012-03-28 10:27:20,562 INFO org.apache.giraph.graph.BspService: BspService: Connecting to
ZooKeeper with job job_201203281017_0001, 0 on tmm-e10:22181
2012-03-28 10:27:20,571 INFO org.apache.zookeeper.ZooKeeper: Client environment:zookeeper.version=3.3.3-1073969,
built on 02/23/2011 22:27 GMT
2012-03-28 10:27:20,571 INFO org.apache.zookeeper.ZooKeeper: Client environment:host.name=tmm-e10
2012-03-28 10:27:20,571 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.version=1.6.0_22
2012-03-28 10:27:20,571 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.home=/usr/local/java/jdk1.6.0_22/jre
2012-03-28 10:27:20,571 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.class.path=/usr/local/test-0302/hadoop-data/h-0.20.205/mapred/local/taskTracker/root/jobcache/job_201203281017_0001/jars/classes:/usr/local/test-0302/hadoop-data/h-0.20.205/mapred/local/taskTracker/root/jobcache/job_201203281017_0001/jars:/usr/local/test-0302/hadoop-data/h-0.20.205/mapred/local/taskTracker/root/jobcache/job_201203281017_0001/attempt_201203281017_0001_m_000000_0/work:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../conf:/usr/local/java/jdk1.6.0_22/lib/tools.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/hadoop-core-0.20.205.0.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/asm-3.2.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/aspectjrt-1.6.5.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/aspectjtools-1.6.5.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-beanutils-1.7.0.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-beanutils-core-1.8.0.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-cli-1.2.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-codec-1.4.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-collections-3.2.1.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-configuration-1.6.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-daemon-1.0.1.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-digester-1.8.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-el-1.0.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-httpclient-3.0.1.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-lang-2.4.jar:/usr
/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-logging-1.1.1.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-logging-api-1.0.4.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-math-2.1.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-net-1.4.1.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/core-3.1.1.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/hadoop-capacity-scheduler-0.20.205.0.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/hadoop-fairscheduler-0.20.205.0.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/hadoop-thriftfs-0.20.205.0.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/hsqldb-1.8.0.10.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jackson-core-asl-1.0.1.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jackson-mapper-asl-1.0.1.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jasper-compiler-5.5.12.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jasper-runtime-5.5.12.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jdeb-0.8.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jersey-core-1.8.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jersey-json-1.8.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jersey-server-1.8.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jets3t-0.6.1.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jetty-6.1.26.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jetty-util-6.1.26.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jsch-0.1.42.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/junit-4.5.jar:/usr/local/test-03
02/hadoop-0.20.205.0/libexec/../share/hadoop/lib/kfs-0.2.2.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/log4j-1.2.15.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/mockito-all-1.8.5.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/oro-2.0.8.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/servlet-api-2.5-20081211.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/slf4j-api-1.4.3.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/slf4j-log4j12-1.4.3.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/xmlenc-0.52.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jsp-2.1/jsp-2.1.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jsp-2.1/jsp-api-2.1.jar
2012-03-28 10:27:20,571 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.library.path=/usr/local/test-0302/hadoop-0.20.205.0/libexec/../lib:/usr/local/test-0302/hadoop-data/h-0.20.205/mapred/local/taskTracker/root/jobcache/job_201203281017_0001/attempt_201203281017_0001_m_000000_0/work
2012-03-28 10:27:20,571 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/usr/local/test-0302/hadoop-data/h-0.20.205/mapred/local/taskTracker/root/jobcache/job_201203281017_0001/attempt_201203281017_0001_m_000000_0/work/tmp
2012-03-28 10:27:20,571 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.compiler=<NA>
2012-03-28 10:27:20,571 INFO org.apache.zookeeper.ZooKeeper: Client environment:os.name=Linux
2012-03-28 10:27:20,571 INFO org.apache.zookeeper.ZooKeeper: Client environment:os.arch=amd64
2012-03-28 10:27:20,571 INFO org.apache.zookeeper.ZooKeeper: Client environment:os.version=2.6.32.12-0.7-default
2012-03-28 10:27:20,571 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.name=root
2012-03-28 10:27:20,571 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.home=/root
2012-03-28 10:27:20,571 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.dir=/usr/local/test-0302/hadoop-data/h-0.20.205/mapred/local/taskTracker/root/jobcache/job_201203281017_0001/attempt_201203281017_0001_m_000000_0/work
2012-03-28 10:27:20,572 INFO org.apache.zookeeper.ZooKeeper: Initiating client connection,
connectString=tmm-e10:22181 sessionTimeout=60000 watcher=org.apache.giraph.graph.BspServiceMaster@68a53de4
2012-03-28 10:27:20,584 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to
server tmm-e10/2.1.1.130:22181
2012-03-28 10:27:20,586 INFO org.apache.zookeeper.ClientCnxn: Socket connection established
to tmm-e10/2.1.1.130:22181, initiating session
2012-03-28 10:27:21,415 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete
on server tmm-e10/2.1.1.130:22181, sessionid = 0x1365720ee4e0000, negotiated timeout = 300000
2012-03-28 10:27:21,417 INFO org.apache.giraph.graph.BspService: process: Asynchronous connection
complete.
2012-03-28 10:27:21,420 INFO org.apache.giraph.graph.GraphMapper: map: No need to do anything
when not a worker
2012-03-28 10:27:21,420 INFO org.apache.giraph.graph.GraphMapper: cleanup: Starting for MASTER_ZOOKEEPER_ONLY
2012-03-28 10:27:21,482 INFO org.apache.giraph.graph.BspServiceMaster: becomeMaster: First child is '/_hadoopBsp/job_201203281017_0001/_masterElectionDir/tmm-e10_00000000000'
and my bid is '/_hadoopBsp/job_201203281017_0001/_masterElectionDir/tmm-e10_00000000000'
2012-03-28 10:27:21,489 INFO org.apache.giraph.graph.BspServiceMaster: becomeMaster: I am
now the master!
2012-03-28 10:27:21,526 INFO org.apache.giraph.graph.BspService: process: applicationAttemptChanged
signaled
2012-03-28 10:27:21,579 WARN org.apache.giraph.graph.BspService: process: Unknown
and unprocessed event (path=/_hadoopBsp/job_201203281017_0001/_applicationAttemptsDir/0/_superstepDir,
type=NodeChildrenChanged, state=SyncConnected)
2012-03-28 10:27:41,254 INFO org.apache.giraph.graph.BspServiceMaster: generateInputSplits:
Got 64 input splits for 64 workers
2012-03-28 10:27:42,265 INFO org.apache.giraph.graph.partition.HashMasterPartitioner: createInitialPartitionOwners:
Creating 4096, default would have been 4096 partitions.
2012-03-28 10:27:42,265 WARN org.apache.giraph.graph.partition.HashMasterPartitioner: createInitialPartitionOwners:
Reducing the partitionCount to 2995 from 4096
2012-03-28 10:27:42,372 INFO org.apache.giraph.graph.BspServiceMaster: barrierOnWorkerList:
0 out of 64 workers finished on superstep -1 on path /_hadoopBsp/job_201203281017_0001/_inputSplitDoneDir
2012-03-28 10:27:52,440 INFO org.apache.giraph.graph.BspServiceMaster: barrierOnWorkerList:
0 out of 64 workers finished on superstep -1 on path /_hadoopBsp/job_201203281017_0001/_applicationAttemptsDir/0/_superstepDir/-1/_workerFinishedDir
2012-03-28 10:27:52,827 INFO org.apache.giraph.graph.BspServiceMaster: aggregateWorkerStats:
Aggregation found (vtx=1000000,finVtx=0,edges=16000000,msgCount=0) on superstep = -1
2012-03-28 10:27:52,859 INFO org.apache.giraph.graph.MasterThread: masterThread: Coordination
of superstep -1 took 10.669 seconds ended with state THIS_SUPERSTEP_DONE and is now on superstep 0
2012-03-28 10:27:53,171 INFO org.apache.giraph.graph.partition.PartitionBalancer: balancePartitionsAcrossWorkers:
Using algorithm static
2012-03-28 10:27:53,178 INFO org.apache.giraph.graph.partition.PartitionUtils: analyzePartitionStats:
Vertices - Mean: 15625, Min: Worker(hostname=tmm-e4, MRpartition=59, port=30059) - 15359,
Max: Worker(hostname=tmm-e4, MRpartition=32, port=30032) - 15693
2012-03-28 10:27:53,178 INFO org.apache.giraph.graph.partition.PartitionUtils: analyzePartitionStats:
Edges - Mean: 250000, Min: Worker(hostname=tmm-e4, MRpartition=59, port=30059) - 245744,
Max: Worker(hostname=tmm-e4, MRpartition=32, port=30032) - 251088
2012-03-28 10:27:53,252 INFO org.apache.giraph.graph.BspServiceMaster: barrierOnWorkerList:
0 out of 64 workers finished on superstep 0 on path /_hadoopBsp/job_201203281017_0001/_applicationAttemptsDir/0/_superstepDir/0/_workerFinishedDir
2012-03-28 10:27:58,718 INFO org.apache.giraph.graph.BspServiceMaster: aggregateWorkerStats:
Aggregation found (vtx=1000000,finVtx=0,edges=16000000,msgCount=748805328) on superstep = 0
2012-03-28 10:27:59,032 INFO org.apache.giraph.graph.MasterThread: masterThread: Coordination
of superstep 0 took 6.173 seconds ended with state THIS_SUPERSTEP_DONE and is now on superstep 1
2012-03-28 10:27:59,326 INFO org.apache.giraph.graph.partition.PartitionBalancer: balancePartitionsAcrossWorkers:
Using algorithm static
2012-03-28 10:27:59,330 INFO org.apache.giraph.graph.partition.PartitionUtils: analyzePartitionStats:
Vertices - Mean: 15625, Min: Worker(hostname=tmm-e4, MRpartition=59, port=30059) - 15359,
Max: Worker(hostname=tmm-e4, MRpartition=32, port=30032) - 15693
2012-03-28 10:27:59,330 INFO org.apache.giraph.graph.partition.PartitionUtils: analyzePartitionStats:
Edges - Mean: 250000, Min: Worker(hostname=tmm-e4, MRpartition=59, port=30059) - 245744,
Max: Worker(hostname=tmm-e4, MRpartition=32, port=30032) - 251088
2012-03-28 10:27:59,467 INFO org.apache.giraph.graph.BspServiceMaster: barrierOnWorkerList:
0 out of 64 workers finished on superstep 1 on path /_hadoopBsp/job_201203281017_0001/_applicationAttemptsDir/0/_superstepDir/1/_workerFinishedDir
2012-03-28 10:28:02,994 INFO org.apache.giraph.graph.BspServiceMaster: aggregateWorkerStats:
Aggregation found (vtx=1000000,finVtx=0,edges=16000000,msgCount=748805328) on superstep = 1
2012-03-28 10:28:03,004 INFO org.apache.giraph.graph.BspServiceMaster: coordinateSuperstep:
Cleaning up old Superstep /_hadoopBsp/job_201203281017_0001/_applicationAttemptsDir/0/_superstepDir/0
2012-03-28 10:28:04,749 INFO org.apache.giraph.graph.MasterThread: masterThread: Coordination
of superstep 1 took 5.717 seconds ended with state THIS_SUPERSTEP_DONE and is now on superstep 2
2012-03-28 10:28:04,787 INFO org.apache.giraph.graph.partition.PartitionBalancer: balancePartitionsAcrossWorkers:
Using algorithm static
2012-03-28 10:28:04,790 INFO org.apache.giraph.graph.partition.PartitionUtils: analyzePartitionStats:
Vertices - Mean: 15625, Min: Worker(hostname=tmm-e4, MRpartition=59, port=30059) - 15359,
Max: Worker(hostname=tmm-e4, MRpartition=32, port=30032) - 15693
2012-03-28 10:28:04,790 INFO org.apache.giraph.graph.partition.PartitionUtils: analyzePartitionStats:
Edges - Mean: 250000, Min: Worker(hostname=tmm-e4, MRpartition=59, port=30059) - 245744,
Max: Worker(hostname=tmm-e4, MRpartition=32, port=30032) - 251088
2012-03-28 10:28:04,863 INFO org.apache.giraph.graph.BspServiceMaster: barrierOnWorkerList:
0 out of 64 workers finished on superstep 2 on path /_hadoopBsp/job_201203281017_0001/_applicationAttemptsDir/0/_superstepDir/2/_workerFinishedDir
2012-03-28 10:28:08,324 INFO org.apache.giraph.graph.BspServiceMaster: aggregateWorkerStats:
Aggregation found (vtx=1000000,finVtx=0,edges=16000000,msgCount=748805328) on superstep =
2
2012-03-28 10:28:08,335 INFO org.apache.giraph.graph.BspServiceMaster: coordinateSuperstep:
Cleaning up old Superstep /_hadoopBsp/job_201203281017_0001/_applicationAttemptsDir/0/_superstepDir/1
2012-03-28 10:28:10,015 INFO org.apache.giraph.graph.MasterThread: masterThread: Coordination
of superstep 2 took 5.265 seconds ended with state THIS_SUPERSTEP_DONE and is now on superstep 3
2012-03-28 10:28:10,049 INFO org.apache.giraph.graph.partition.PartitionBalancer: balancePartitionsAcrossWorkers:
Using algorithm static
2012-03-28 10:28:10,049 INFO org.apache.giraph.graph.partition.PartitionUtils: analyzePartitionStats:
Vertices - Mean: 15625, Min: Worker(hostname=tmm-e4, MRpartition=59, port=30059) - 15359,
Max: Worker(hostname=tmm-e4, MRpartition=32, port=30032) - 15693
2012-03-28 10:28:10,050 INFO org.apache.giraph.graph.partition.PartitionUtils: analyzePartitionStats:
Edges - Mean: 250000, Min: Worker(hostname=tmm-e4, MRpartition=59, port=30059) - 245744,
Max: Worker(hostname=tmm-e4, MRpartition=32, port=30032) - 251088
2012-03-28 10:28:10,129 INFO org.apache.giraph.graph.BspServiceMaster: barrierOnWorkerList:
0 out of 64 workers finished on superstep 3 on path /_hadoopBsp/job_201203281017_0001/_applicationAttemptsDir/0/_superstepDir/3/_workerFinishedDir
2012-03-28 10:28:13,320 INFO org.apache.giraph.graph.BspServiceMaster: aggregateWorkerStats:
Aggregation found (vtx=1000000,finVtx=0,edges=16000000,msgCount=748805328) on superstep = 3
2012-03-28 10:28:13,331 INFO org.apache.giraph.graph.BspServiceMaster: coordinateSuperstep:
Cleaning up old Superstep /_hadoopBsp/job_201203281017_0001/_applicationAttemptsDir/0/_superstepDir/2
2012-03-28 10:28:15,083 INFO org.apache.giraph.graph.MasterThread: masterThread: Coordination
of superstep 3 took 5.068 seconds ended with state THIS_SUPERSTEP_DONE and is now on superstep
4
2012-03-28 10:28:15,120 INFO org.apache.giraph.graph.partition.PartitionBalancer: balancePartitionsAcrossWorkers:
Using algorithm static
2012-03-28 10:28:15,121 INFO org.apache.giraph.graph.partition.PartitionUtils: analyzePartitionStats:
Vertices - Mean: 15625, Min: Worker(hostname=tmm-e4, MRpartition=59, port=30059) - 15359,
Max: Worker(hostname=tmm-e4, MRpartition=32, port=30032) - 15693
2012-03-28 10:28:15,121 INFO org.apache.giraph.graph.partition.PartitionUtils: analyzePartitionStats:
Edges - Mean: 250000, Min: Worker(hostname=tmm-e4, MRpartition=59, port=30059) - 245744,
Max: Worker(hostname=tmm-e4, MRpartition=32, port=30032) - 251088
2012-03-28 10:28:15,183 INFO org.apache.giraph.graph.BspServiceMaster: barrierOnWorkerList:
0 out of 64 workers finished on superstep 4 on path /_hadoopBsp/job_201203281017_0001/_applicationAttemptsDir/0/_superstepDir/4/_workerFinishedDir
2012-03-28 10:28:18,365 INFO org.apache.giraph.graph.BspServiceMaster: aggregateWorkerStats:
Aggregation found (vtx=1000000,finVtx=0,edges=16000000,msgCount=748805328) on superstep = 4
2012-03-28 10:28:18,374 INFO org.apache.giraph.graph.BspServiceMaster: coordinateSuperstep:
Cleaning up old Superstep /_hadoopBsp/job_201203281017_0001/_applicationAttemptsDir/0/_superstepDir/3
2012-03-28 10:28:20,139 INFO org.apache.giraph.graph.MasterThread: masterThread: Coordination
of superstep 4 took 5.056 seconds ended with state THIS_SUPERSTEP_DONE and is now on superstep
5
2012-03-28 10:28:20,172 INFO org.apache.giraph.graph.partition.PartitionBalancer: balancePartitionsAcrossWorkers:
Using algorithm static
2012-03-28 10:28:20,172 INFO org.apache.giraph.graph.partition.PartitionUtils: analyzePartitionStats:
Vertices - Mean: 15625, Min: Worker(hostname=tmm-e4, MRpartition=59, port=30059) - 15359,
Max: Worker(hostname=tmm-e4, MRpartition=32, port=30032) - 15693
2012-03-28 10:28:20,172 INFO org.apache.giraph.graph.partition.PartitionUtils: analyzePartitionStats:
Edges - Mean: 250000, Min: Worker(hostname=tmm-e4, MRpartition=59, port=30059) - 245744,
Max: Worker(hostname=tmm-e4, MRpartition=32, port=30032) - 251088
2012-03-28 10:28:20,224 INFO org.apache.giraph.graph.BspServiceMaster: barrierOnWorkerList:
0 out of 64 workers finished on superstep 5 on path /_hadoopBsp/job_201203281017_0001/_applicationAttemptsDir/0/_superstepDir/5/_workerFinishedDir
2012-03-28 10:28:20,639 INFO org.apache.giraph.graph.BspServiceMaster: aggregateWorkerStats:
Aggregation found (vtx=1000000,finVtx=1000000,edges=16000000,msgCount=0) on superstep = 5
2012-03-28 10:28:20,651 INFO org.apache.giraph.graph.BspServiceMaster: coordinateSuperstep:
Cleaning up old Superstep /_hadoopBsp/job_201203281017_0001/_applicationAttemptsDir/0/_superstepDir/4
2012-03-28 10:28:20,945 WARN org.apache.giraph.graph.BspServiceMaster: coordinateBarrier:
Already cleaned up /_hadoopBsp/job_201203281017_0001/_applicationAttemptsDir/0/_superstepDir/4
2012-03-28 10:28:20,945 INFO org.apache.giraph.graph.MasterThread: masterThread: Coordination
of superstep 5 took 0.806 seconds ended with state ALL_SUPERSTEPS_DONE and is now on superstep
6
2012-03-28 10:28:20,946 INFO org.apache.giraph.graph.BspServiceMaster: setJobState: {"_stateKey":"FINISHED","_applicationAttemptKey":-1,"_superstepKey":-1}
on superstep 6
2012-03-28 10:28:20,987 INFO org.apache.giraph.graph.BspServiceMaster: cleanup:
Notifying master its okay to cleanup with /_hadoopBsp/job_201203281017_0001/_cleanedUpDir/0_master
2012-03-28 10:28:20,993 INFO org.apache.giraph.graph.BspServiceMaster: cleanUpZooKeeper: Node
/_hadoopBsp/job_201203281017_0001/_cleanedUpDir already exists, no need to create.
2012-03-28 10:28:20,993 INFO org.apache.giraph.graph.BspServiceMaster: cleanUpZooKeeper: Got 65 of 65
desired children from /_hadoopBsp/job_201203281017_0001/_cleanedUpDir
2012-03-28 10:28:20,993 INFO org.apache.giraph.graph.BspServiceMaster: cleanupZooKeeper: Removing
the following path and all children - /_hadoopBsp/job_201203281017_0001
2012-03-28 10:28:24,840 INFO org.apache.giraph.graph.BspService: process: masterElectionChildrenChanged signaled
2012-03-28 10:28:25,321 INFO org.apache.giraph.graph.BspService: process: cleanedUpChildrenChanged
signaled
2012-03-28 10:28:25,757 INFO org.apache.giraph.graph.BspServiceMaster: cleanup: Removed HDFS
checkpoint directory (_bsp/_checkpoints//job_201203281017_0001) with return = true since this
job succeeded
2012-03-28 10:28:25,771 INFO org.apache.zookeeper.ZooKeeper: Session: 0x1365720ee4e0000 closed
2012-03-28 10:28:25,771 INFO org.apache.giraph.graph.MasterThread: setup: Took 20.768 seconds.
2012-03-28 10:28:25,771 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down
2012-03-28 10:28:25,771 INFO org.apache.giraph.graph.MasterThread: vertex input superstep: Took 10.669
seconds.
2012-03-28 10:28:25,772 INFO org.apache.giraph.graph.MasterThread: superstep 0: Took 6.173
seconds.
2012-03-28 10:28:25,772 INFO org.apache.giraph.graph.MasterThread: superstep 1: Took 5.717
seconds.
2012-03-28 10:28:25,772 INFO org.apache.giraph.graph.MasterThread: superstep 2: Took 5.265
seconds.
2012-03-28 10:28:25,772 INFO org.apache.giraph.graph.MasterThread: superstep 3: Took 5.068
seconds.
2012-03-28 10:28:25,772 INFO org.apache.giraph.graph.MasterThread: superstep 4: Took 5.056
seconds.
2012-03-28 10:28:25,772 INFO org.apache.giraph.graph.MasterThread: superstep 5: Took 0.806
seconds.
2012-03-28 10:28:25,772 INFO org.apache.giraph.graph.MasterThread: shutdown: Took 4.827 seconds.
2012-03-28 10:28:25,772 INFO org.apache.giraph.graph.MasterThread: total: Took 1.3329016850040002E9
seconds.
2012-03-28 10:28:25,773 INFO org.apache.giraph.zk.ZooKeeperManager: createZooKeeperClosedStamp:
Creating my filestamp _bsp/_defaultZkManagerDir/job_201203281017_0001/_task/0.COMPUTATION_DONE
2012-03-28 10:28:25,813 INFO org.apache.giraph.zk.ZooKeeperManager: waitUntilAllTasksDone: Got 65 and
65 desired (polling period is 3000) on attempt 0
2012-03-28 10:28:26,139 INFO org.apache.giraph.zk.ZooKeeperManager: offlineZooKeeperServers:
waitFor returned 143 and deleted directory /usr/local/test-0302/hadoop-data/h-0.20.205/mapred/local/taskTracker/root/jobcache/job_201203281017_0001/work/_bspZooKeeper
2012-03-28 10:28:26,140 INFO org.apache.hadoop.mapred.Task: Task:attempt_201203281017_0001_m_000000_0
is done. And is in the process of commiting
2012-03-28 10:28:26,455 INFO org.apache.hadoop.mapred.Task:
Task 'attempt_201203281017_0001_m_000000_0' done.
2012-03-28 10:28:26,461 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs'
truncater with mapRetainSize=-1 and reduceRetainSize=-1
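
A quick sanity check on the MasterThread timings above, as a minimal sketch: the per-phase "Took ..." values are copied from the log, while the names `phases`, `reported_total` and `summed_total` are only illustrative. Reading the reported "total" as a Unix timestamp is an assumption based on its magnitude, not something the log states.

```python
from datetime import datetime, timezone

# Phase durations in seconds, copied from the MasterThread "Took ..." lines.
phases = {
    "setup": 20.768,
    "vertex input superstep": 10.669,
    "superstep 0": 6.173,
    "superstep 1": 5.717,
    "superstep 2": 5.265,
    "superstep 3": 5.068,
    "superstep 4": 5.056,
    "superstep 5": 0.806,
    "shutdown": 4.827,
}

# Value printed by the "total: Took ... seconds." line.
reported_total = 1.3329016850040002e9


def summed_total(durations):
    """Add up the individual phase durations, rounded to milliseconds."""
    return round(sum(durations.values()), 3)


total = summed_total(phases)
print(f"sum of phases:  {total} s")
print(f"reported total: {reported_total} s")

# Assumption: a value this large is an epoch timestamp rather than a duration.
as_timestamp = datetime.fromtimestamp(reported_total, tz=timezone.utc)
print(f"reported total read as a UTC timestamp: {as_timestamp.isoformat()}")
```

The phases add up to roughly 64 seconds, in line with the 10:27:16 to 10:28:26 span of the master log, so the reported total does not appear to be a real duration.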
------------------------------------------------------------------------------------------------------------------------------
The log of one of the workers
2012-03-28 10:18:00,122 WARN org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop
library for your platform... using builtin-java classes where applicable
2012-03-28 10:18:00,387 WARN org.apache.giraph.bsp.BspOutputFormat: getOutputCommitter: Returning
ImmutableOutputCommiter (does nothing).
2012-03-28 10:18:00,397 INFO org.apache.hadoop.util.ProcessTree: setsid exited with exit code
0
2012-03-28 10:18:00,405 INFO org.apache.hadoop.mapred.Task:  Using ResourceCalculatorPlugin
: org.apache.hadoop.util.LinuxResourceCalculatorPlugin@18330bf
2012-03-28 10:18:00,489 INFO org.apache.giraph.graph.GraphMapper: Distributed cache is empty.
Assuming fatjar.
2012-03-28 10:18:00,489 INFO org.apache.giraph.graph.GraphMapper: setup: classpath @ /usr/local/test-0302/hadoop-data/h-0.20.205/mapred/local/taskTracker/root/jobcache/job_201203281017_0001/jars/job.jar
2012-03-28 10:18:00,498 INFO org.apache.giraph.zk.ZooKeeperManager: createCandidateStamp:
Made the directory _bsp/_defaultZkManagerDir/job_201203281017_0001
2012-03-28 10:18:00,500 INFO org.apache.giraph.zk.ZooKeeperManager: createCandidateStamp:
Creating my filestamp _bsp/_defaultZkManagerDir/job_201203281017_0001/_task/tmm-e6 1
2012-03-28 10:18:00,521 INFO org.apache.giraph.zk.ZooKeeperManager: getZooKeeperServerList:
For task 1, got file 'zkServerList_tmm-e10 0 ' (polling period is 3000)
2012-03-28 10:18:00,521 INFO org.apache.giraph.zk.ZooKeeperManager: getZooKeeperServerList:
Found [tmm-e10, 0] 2 hosts in filename 'zkServerList_tmm-e10 0 '
2012-03-28 10:18:00,524 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers:
Got [tmm-e10] 1 hosts from 1 ready servers when 1 required (polling period is 3000) on attempt
0
2012-03-28 10:18:00,524 INFO org.apache.giraph.graph.GraphMapper: setup: Starting up BspServiceWorker...
2012-03-28 10:18:00,534 INFO org.apache.giraph.graph.BspService: BspService: Connecting to
ZooKeeper with job job_201203281017_0001, 1 on tmm-e10:22181
2012-03-28 10:18:00,540 INFO org.apache.zookeeper.ZooKeeper: Client environment:zookeeper.version=3.3.3-1073969,
built on 02/23/2011 22:27 GMT
2012-03-28 10:18:00,540 INFO org.apache.zookeeper.ZooKeeper: Client environment:host.name=tmm-e6
2012-03-28 10:18:00,540 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.version=1.6.0_22
2012-03-28 10:18:00,540 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.vendor=Sun
Microsystems Inc.
2012-03-28 10:18:00,540 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.home=/usr/local/java/jdk1.6.0_22/jre
                
> How to close all child when a job finished?
> -------------------------------------------
>
>                 Key: GIRAPH-169
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-169
>             Project: Giraph
>          Issue Type: Improvement
>          Components: mapreduce
>    Affects Versions: 0.2.0
>         Environment: SLES 11 x64, JDK 1.6, Hadoop 0.20.205.0, 1 master and 8 slaves
>            Reporter: Jianfeng Qian
>            Priority: Minor
>
> I ran PageRank on Hadoop 0.20.205.0. When the job finished, the child processes on the
> slaves didn't quit immediately; sometimes they never quit and I had to kill them.

