giraph-dev mailing list archives

From "Hudson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (GIRAPH-797) If the vertex input data path is incorrect Giraph job hangs indefinitely until killed by JobTracker for exceeding timeout
Date Fri, 22 Nov 2013 17:34:37 GMT

    [ https://issues.apache.org/jira/browse/GIRAPH-797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13830139#comment-13830139 ] 

Hudson commented on GIRAPH-797:
-------------------------------

FAILURE: Integrated in Giraph-trunk-Commit #1368 (See [https://builds.apache.org/job/Giraph-trunk-Commit/1368/])
GIRAPH-797 (claudio: http://git-wip-us.apache.org/repos/asf?p=giraph.git&a=commit&h=ba8ea976bec6fcfee1630acd2cb238a688e33b71)
* giraph-core/src/main/java/org/apache/giraph/utils/ConfigurationUtils.java
* CHANGELOG
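
For readers following along: the commit above touches ConfigurationUtils.java, the class that logs the "No edge input format specified" messages quoted in the report below. The following is only a minimal sketch of a fail-fast check on the -vip path before the job is submitted. It is not the contents of GIRAPH-797.patch. The class InputPathCheck and the method requireExists are hypothetical names, and the existence test is passed in as a predicate so the sketch runs without Hadoop on the classpath. On a real cluster the predicate would presumably wrap Hadoop's FileSystem.exists(Path).

```java
import java.util.function.Predicate;

/**
 * Hypothetical helper, not part of Giraph: rejects a missing vertex input
 * path on the client side instead of letting the master fail later while
 * generating input splits.
 */
final class InputPathCheck {
  private InputPathCheck() { }

  /**
   * Throws IllegalArgumentException if the given path is blank or if the
   * supplied existence test reports that it does not exist.
   *
   * @param vertexInputPath value of the -vip argument
   * @param exists          existence test; an assumption of this sketch,
   *                        e.g. a wrapper around FileSystem.exists(Path)
   */
  static void requireExists(String vertexInputPath, Predicate<String> exists) {
    if (vertexInputPath == null || vertexInputPath.trim().isEmpty()) {
      throw new IllegalArgumentException("No vertex input path given");
    }
    if (!exists.test(vertexInputPath)) {
      // Same wording as the InvalidInputException in the task log below.
      throw new IllegalArgumentException(
          "Input path does not exist: " + vertexInputPath);
    }
  }
}
```

Called before job submission, a check like this would turn the hang described below into an immediate error on the command line.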


> If the vertex input data path is incorrect Giraph job hangs indefinitely until killed by JobTracker for exceeding timeout
> -------------------------------------------------------------------------------------------------------------------------
>
>                 Key: GIRAPH-797
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-797
>             Project: Giraph
>          Issue Type: Bug
>          Components: mapreduce
>    Affects Versions: 1.1.0
>         Environment: Mac OS X Mountain Lion, Hadoop 1.2.1
>            Reporter: Rob Vesse
>              Labels: patch
>         Attachments: GIRAPH-797.patch, job_201311181156_0003.zip
>
>
> When running a Giraph job using the {{GiraphRunner}}, if the vertex input data specified by the {{-vip}} argument references a non-existent file, the MapReduce job will hang indefinitely.
> Example command invocation:
> {noformat}
> bin/hadoop jar /Users/rvesse/Documents/Work/Code/giraph/giraph-examples/target/giraph-examples-1.1.0-SNAPSHOT-for-hadoop-1.2.1-jar-with-dependencies.jar org.apache.giraph.GiraphRunner org.apache.giraph.examples.SimpleShortestPathsComputation -vif org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat -vip /user/rvesse/giraph_input/nosuchfile.txt -vof org.apache.giraph.io.formats.IdWithValueTextOutputFormat -op /user/rvesse/giraph_output/6 -w 1
> {noformat}
> And I get the following output on the command line:
> {noformat}
> 2013-11-18 12:07:04.118 java[7995:1203] Unable to load realm info from SCDynamicStore
> 13/11/18 12:07:05 INFO utils.ConfigurationUtils: No edge input format specified. Ensure your InputFormat does not require one.
> 13/11/18 12:07:05 INFO utils.ConfigurationUtils: No edge output format specified. Ensure your OutputFormat does not require one.
> 13/11/18 12:07:05 INFO job.GiraphJob: run: Since checkpointing is disabled (default), do not allow any task retries (setting mapred.map.max.attempts = 0, old value = 4)
> 13/11/18 12:07:06 INFO job.GiraphJob: run: Tracking URL: http://localhost:50030/jobdetails.jsp?jobid=job_201311181156_0003
> 13/11/18 12:08:03 INFO job.HaltApplicationUtils$DefaultHaltInstructionsWriter: writeHaltInstructions: To halt after next superstep execute: 'bin/halt-application --zkServer mbp-rvesse.home:22181 --zkNode /_hadoopBsp/job_201311181156_0003/_haltComputation'
> 13/11/18 12:08:03 INFO mapred.JobClient: Running job: job_201311181156_0003
> 13/11/18 12:08:04 INFO mapred.JobClient:  map 50% reduce 0%
> {noformat}
> In the Hadoop JobTracker, the log for the first map attempt shows:
> {noformat}
> 2013-11-18 12:07:18,589 WARN org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> 2013-11-18 12:07:19,046 WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Source name ugi already exists!
> 2013-11-18 12:07:19,245 INFO org.apache.hadoop.mapred.Task:  Using ResourceCalculatorPlugin : null
> 2013-11-18 12:07:19,348 INFO org.apache.hadoop.mapred.MapTask: Processing split: 'org.apache.giraph.bsp.BspInputSplit, index=-1, num=-1
> 2013-11-18 12:07:19,686 INFO org.apache.giraph.graph.GraphTaskManager: setup: Log level remains at info
> 2013-11-18 12:07:20,445 INFO org.apache.giraph.graph.GraphTaskManager: Distributed cache is empty. Assuming fatjar.
> 2013-11-18 12:07:20,445 INFO org.apache.giraph.graph.GraphTaskManager: setup: classpath @ /Users/rvesse/Documents/Data/Hadoop/mapred/temp/taskTracker/rvesse/jobcache/job_201311181156_0003/jars/job.jar for job Giraph: org.apache.giraph.examples.SimpleShortestPathsComputation
> 2013-11-18 12:07:20,475 INFO org.apache.giraph.zk.ZooKeeperManager: createCandidateStamp: Made the directory _bsp/_defaultZkManagerDir/job_201311181156_0003
> 2013-11-18 12:07:20,477 INFO org.apache.giraph.zk.ZooKeeperManager: createCandidateStamp: Creating my filestamp _bsp/_defaultZkManagerDir/job_201311181156_0003/_task/mbp-rvesse.home 0
> 2013-11-18 12:07:20,490 INFO org.apache.giraph.zk.ZooKeeperManager: getZooKeeperServerList: Got [mbp-rvesse.home] 1 hosts from 2 candidates when 1 required (polling period is 3000) on attempt 0
> 2013-11-18 12:07:20,490 INFO org.apache.giraph.zk.ZooKeeperManager: createZooKeeperServerList: Creating the final ZooKeeper file '_bsp/_defaultZkManagerDir/job_201311181156_0003/zkServerList_mbp-rvesse.home 0 '
> 2013-11-18 12:07:20,495 INFO org.apache.giraph.zk.ZooKeeperManager: getZooKeeperServerList: For task 0, got file 'zkServerList_mbp-rvesse.home 0 ' (polling period is 3000)
> 2013-11-18 12:07:20,495 INFO org.apache.giraph.zk.ZooKeeperManager: getZooKeeperServerList: Found [mbp-rvesse.home, 0] 2 hosts in filename 'zkServerList_mbp-rvesse.home 0 '
> 2013-11-18 12:07:20,496 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Trying to delete old directory /Users/rvesse/Documents/Data/Hadoop/mapred/temp/taskTracker/rvesse/jobcache/job_201311181156_0003/work/_bspZooKeeper
> 2013-11-18 12:07:20,517 INFO org.apache.giraph.zk.ZooKeeperManager: generateZooKeeperConfigFile: Creating file /Users/rvesse/Documents/Data/Hadoop/mapred/temp/taskTracker/rvesse/jobcache/job_201311181156_0003/work/_bspZooKeeper/zoo.cfg in /Users/rvesse/Documents/Data/Hadoop/mapred/temp/taskTracker/rvesse/jobcache/job_201311181156_0003/work/_bspZooKeeper with base port 22181
> 2013-11-18 12:07:20,517 INFO org.apache.giraph.zk.ZooKeeperManager: generateZooKeeperConfigFile: Make directory of _bspZooKeeper = true
> 2013-11-18 12:07:20,517 INFO org.apache.giraph.zk.ZooKeeperManager: generateZooKeeperConfigFile: Delete of zoo.cfg = false
> 2013-11-18 12:07:20,528 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Attempting to start ZooKeeper server with command [/System/Library/Java/JavaVirtualMachines/1.6.0.jdk/Contents/Home/bin/java, -Xmx512m, -XX:ParallelGCThreads=4, -XX:+UseConcMarkSweepGC, -XX:CMSInitiatingOccupancyFraction=70, -XX:MaxGCPauseMillis=100, -cp, /Users/rvesse/Documents/Data/Hadoop/mapred/temp/taskTracker/rvesse/jobcache/job_201311181156_0003/jars/job.jar, org.apache.zookeeper.server.quorum.QuorumPeerMain, /Users/rvesse/Documents/Data/Hadoop/mapred/temp/taskTracker/rvesse/jobcache/job_201311181156_0003/work/_bspZooKeeper/zoo.cfg] in directory /Users/rvesse/Documents/Data/Hadoop/mapred/temp/taskTracker/rvesse/jobcache/job_201311181156_0003/work/_bspZooKeeper
> 2013-11-18 12:07:20,571 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Shutdown hook added.
> 2013-11-18 12:07:20,572 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Connect attempt 0 of 10 max trying to connect to mbp-rvesse.home:22181 with poll msecs = 3000
> 2013-11-18 12:07:20,588 WARN org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Got ConnectException
> java.net.ConnectException: Connection refused
> 	at java.net.PlainSocketImpl.socketConnect(Native Method)
> 	at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:382)
> 	at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:241)
> 	at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:228)
> 	at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:431)
> 	at java.net.Socket.connect(Socket.java:527)
> 	at org.apache.giraph.zk.ZooKeeperManager.onlineZooKeeperServers(ZooKeeperManager.java:703)
> 	at org.apache.giraph.graph.GraphTaskManager.startZooKeeperManager(GraphTaskManager.java:369)
> 	at org.apache.giraph.graph.GraphTaskManager.setup(GraphTaskManager.java:202)
> 	at org.apache.giraph.graph.GraphMapper.setup(GraphMapper.java:59)
> 	at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:89)
> 	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
> 	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
> 	at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:394)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
> 	at org.apache.hadoop.mapred.Child.main(Child.java:249)
> 2013-11-18 12:07:23,589 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Connect attempt 1 of 10 max trying to connect to mbp-rvesse.home:22181 with poll msecs = 3000
> 2013-11-18 12:07:23,590 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Connected to mbp-rvesse.home/192.168.1.65:22181!
> 2013-11-18 12:07:23,590 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Creating my filestamp _bsp/_defaultZkManagerDir/job_201311181156_0003/_zkServer/mbp-rvesse.home 0
> 2013-11-18 12:07:23,597 INFO org.apache.giraph.graph.GraphTaskManager: setup: Chosen to run ZooKeeper...
> 2013-11-18 12:07:23,597 INFO org.apache.giraph.graph.GraphTaskManager: setup: Starting up BspServiceMaster (master thread)...
> 2013-11-18 12:07:23,739 INFO org.apache.giraph.bsp.BspService: BspService: Path to create to halt is /_hadoopBsp/job_201311181156_0003/_haltComputation
> 2013-11-18 12:07:23,739 INFO org.apache.giraph.bsp.BspService: BspService: Connecting to ZooKeeper with job job_201311181156_0003, 0 on mbp-rvesse.home:22181
> 2013-11-18 12:07:23,841 INFO org.apache.zookeeper.ZooKeeper: Client environment:zookeeper.version=3.3.3-1073969, built on 02/23/2011 22:27 GMT
> 2013-11-18 12:07:23,841 INFO org.apache.zookeeper.ZooKeeper: Client environment:host.name=mbp-rvesse.home
> 2013-11-18 12:07:23,841 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.version=1.6.0_65
> 2013-11-18 12:07:23,842 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.vendor=Apple Inc.
> 2013-11-18 12:07:23,842 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.home=/System/Library/Java/JavaVirtualMachines/1.6.0.jdk/Contents/Home
> 2013-11-18 12:07:23,842 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.class.path=/Users/rvesse/Documents/Data/Hadoop/mapred/temp/taskTracker/rvesse/jobcache/job_201311181156_0003/jars/classes:/Users/rvesse/Documents/Data/Hadoop/mapred/temp/taskTracker/rvesse/jobcache/job_201311181156_0003/jars:/Users/rvesse/Documents/Data/Hadoop/mapred/temp/taskTracker/rvesse/jobcache/job_201311181156_0003/attempt_201311181156_0003_m_000000_0/work:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../conf:/Library/Java/Home/lib/tools.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/..:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../hadoop-core-1.2.1.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/asm-3.2.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/aspectjrt-1.6.11.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/aspectjtools-1.6.11.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/commons-beanutils-1.7.0.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/commons-beanutils-core-1.8.0.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/commons-cli-1.2.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/commons-codec-1.4.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/commons-collections-3.2.1.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/commons-configuration-1.6.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/commons-daemon-1.0.1.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/commons-digester-1.8.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/commons-el-1.0.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/commons-httpclient-3.0.1.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/commons-io-2.1.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/commons-lang-2.4.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/commons-logging-1.1.1.jar:/Users/rvesse/Documents/Apps/hado
op-1.2.1/libexec/../lib/commons-logging-api-1.0.4.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/commons-math-2.1.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/commons-net-3.1.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/core-3.1.1.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/hadoop-capacity-scheduler-1.2.1.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/hadoop-fairscheduler-1.2.1.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/hadoop-thriftfs-1.2.1.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/hsqldb-1.8.0.10.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/jackson-core-asl-1.8.8.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/jackson-mapper-asl-1.8.8.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/jasper-compiler-5.5.12.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/jasper-runtime-5.5.12.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/jdeb-0.8.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/jersey-core-1.8.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/jersey-json-1.8.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/jersey-server-1.8.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/jets3t-0.6.1.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/jetty-6.1.26.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/jetty-util-6.1.26.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/jsch-0.1.42.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/junit-4.5.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/kfs-0.2.2.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/log4j-1.2.15.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/mockito-all-1.8.5.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/oro-2.0.8.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/servlet-api-2.5
-20081211.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/slf4j-api-1.4.3.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/slf4j-log4j12-1.4.3.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/xmlenc-0.52.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/jsp-2.1/jsp-2.1.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/jsp-2.1/jsp-api-2.1.jar
> 2013-11-18 12:07:23,842 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.library.path=/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/native/Mac_OS_X-x86_64-64:/Users/rvesse/Documents/Data/Hadoop/mapred/temp/taskTracker/rvesse/jobcache/job_201311181156_0003/attempt_201311181156_0003_m_000000_0/work
> 2013-11-18 12:07:23,842 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/Users/rvesse/Documents/Data/Hadoop/mapred/temp/taskTracker/rvesse/jobcache/job_201311181156_0003/attempt_201311181156_0003_m_000000_0/work/tmp
> 2013-11-18 12:07:23,842 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.compiler=<NA>
> 2013-11-18 12:07:23,842 INFO org.apache.zookeeper.ZooKeeper: Client environment:os.name=Mac OS X
> 2013-11-18 12:07:23,842 INFO org.apache.zookeeper.ZooKeeper: Client environment:os.arch=x86_64
> 2013-11-18 12:07:23,842 INFO org.apache.zookeeper.ZooKeeper: Client environment:os.version=10.8.5
> 2013-11-18 12:07:23,842 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.name=rvesse
> 2013-11-18 12:07:23,842 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.home=/homes/
> 2013-11-18 12:07:23,842 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.dir=/Users/rvesse/Documents/Data/Hadoop/mapred/temp/taskTracker/rvesse/jobcache/job_201311181156_0003/attempt_201311181156_0003_m_000000_0/work
> 2013-11-18 12:07:23,868 INFO org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=mbp-rvesse.home:22181 sessionTimeout=60000 watcher=org.apache.giraph.master.BspServiceMaster@637050f5
> 2013-11-18 12:07:23,971 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server mbp-rvesse.home/192.168.1.65:22181
> 2013-11-18 12:07:23,973 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to mbp-rvesse.home/192.168.1.65:22181, initiating session
> 2013-11-18 12:07:24,067 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server mbp-rvesse.home/192.168.1.65:22181, sessionid = 0x1426b1b9c7f0000, negotiated timeout = 600000
> 2013-11-18 12:07:24,091 INFO org.apache.giraph.bsp.BspService: process: Asynchronous connection complete.
> 2013-11-18 12:07:24,171 INFO org.apache.giraph.graph.GraphTaskManager: map: No need to do anything when not a worker
> 2013-11-18 12:07:24,171 INFO org.apache.giraph.graph.GraphTaskManager: cleanup: Starting for MASTER_ZOOKEEPER_ONLY
> 2013-11-18 12:07:24,392 INFO org.apache.giraph.master.BspServiceMaster: becomeMaster: First child is '/_hadoopBsp/job_201311181156_0003/_masterElectionDir/mbp-rvesse.home_00000000000' and my bid is '/_hadoopBsp/job_201311181156_0003/_masterElectionDir/mbp-rvesse.home_00000000000'
> 2013-11-18 12:07:25,130 INFO org.apache.giraph.comm.netty.NettyServer: NettyServer: Using execution handler with 8 threads after requestFrameDecoder.
> 2013-11-18 12:07:25,440 INFO org.apache.giraph.comm.netty.NettyServer: start: Started server communication server: mbp-rvesse.home/192.168.1.65:30000 with up to 16 threads on bind attempt 0 with sendBufferSize = 32768 receiveBufferSize = 524288 backlog = 1
> 2013-11-18 12:07:25,691 INFO org.apache.giraph.comm.netty.NettyClient: NettyClient: Using execution handler with 8 threads after requestEncoder.
> 2013-11-18 12:07:25,749 INFO org.apache.giraph.master.BspServiceMaster: becomeMaster: I am now the master!
> 2013-11-18 12:07:25,793 INFO org.apache.giraph.bsp.BspService: process: applicationAttemptChanged signaled
> 2013-11-18 12:07:25,801 WARN org.apache.giraph.bsp.BspService: process: Unknown and unprocessed event (path=/_hadoopBsp/job_201311181156_0003/_applicationAttemptsDir/0/_superstepDir, type=NodeChildrenChanged, state=SyncConnected)
> 2013-11-18 12:07:27,498 ERROR org.apache.giraph.master.MasterThread: masterThread: Master algorithm failed with IllegalStateException
> java.lang.IllegalStateException: generateVertexInputSplits: Got IOException
> 	at org.apache.giraph.master.BspServiceMaster.generateInputSplits(BspServiceMaster.java:316)
> 	at org.apache.giraph.master.BspServiceMaster.createInputSplits(BspServiceMaster.java:627)
> 	at org.apache.giraph.master.BspServiceMaster.createVertexInputSplits(BspServiceMaster.java:694)
> 	at org.apache.giraph.master.MasterThread.run(MasterThread.java:100)
> Caused by: org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: hdfs://localhost/user/rvesse/giraph_input/nosuchfile.txt
> 	at org.apache.giraph.io.formats.GiraphFileInputFormat.listStatus(GiraphFileInputFormat.java:271)
> 	at org.apache.giraph.io.formats.GiraphFileInputFormat.listVertexStatus(GiraphFileInputFormat.java:286)
> 	at org.apache.giraph.io.formats.GiraphFileInputFormat.getVertexSplits(GiraphFileInputFormat.java:357)
> 	at org.apache.giraph.io.formats.TextVertexInputFormat.getSplits(TextVertexInputFormat.java:60)
> 	at org.apache.giraph.io.internal.WrappedVertexInputFormat.getSplits(WrappedVertexInputFormat.java:72)
> 	at org.apache.giraph.master.BspServiceMaster.generateInputSplits(BspServiceMaster.java:314)
> 	... 3 more
> 2013-11-18 12:07:27,498 FATAL org.apache.giraph.graph.GraphMapper: uncaughtException: OverrideExceptionHandler on thread org.apache.giraph.master.MasterThread, msg = java.lang.IllegalStateException: generateVertexInputSplits: Got IOException, exiting...
> java.lang.IllegalStateException: java.lang.IllegalStateException: generateVertexInputSplits: Got IOException
> 	at org.apache.giraph.master.MasterThread.run(MasterThread.java:185)
> Caused by: java.lang.IllegalStateException: generateVertexInputSplits: Got IOException
> 	at org.apache.giraph.master.BspServiceMaster.generateInputSplits(BspServiceMaster.java:316)
> 	at org.apache.giraph.master.BspServiceMaster.createInputSplits(BspServiceMaster.java:627)
> 	at org.apache.giraph.master.BspServiceMaster.createVertexInputSplits(BspServiceMaster.java:694)
> 	at org.apache.giraph.master.MasterThread.run(MasterThread.java:100)
> Caused by: org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: hdfs://localhost/user/rvesse/giraph_input/nosuchfile.txt
> 	at org.apache.giraph.io.formats.GiraphFileInputFormat.listStatus(GiraphFileInputFormat.java:271)
> 	at org.apache.giraph.io.formats.GiraphFileInputFormat.listVertexStatus(GiraphFileInputFormat.java:286)
> 	at org.apache.giraph.io.formats.GiraphFileInputFormat.getVertexSplits(GiraphFileInputFormat.java:357)
> 	at org.apache.giraph.io.formats.TextVertexInputFormat.getSplits(TextVertexInputFormat.java:60)
> 	at org.apache.giraph.io.internal.WrappedVertexInputFormat.getSplits(WrappedVertexInputFormat.java:72)
> 	at org.apache.giraph.master.BspServiceMaster.generateInputSplits(BspServiceMaster.java:314)
> 	... 3 more
> 2013-11-18 12:07:27,499 INFO org.apache.giraph.zk.ZooKeeperManager: run: Shutdown hook started.
> 2013-11-18 12:07:27,499 WARN org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Forced a shutdown hook kill of the ZooKeeper process.
> 2013-11-18 12:07:28,065 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: ZooKeeper process exited with 143 (note that 143 typically means killed).
> 2013-11-18 12:07:28,065 INFO org.apache.zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x1426b1b9c7f0000, likely server has closed socket, closing socket connection and attempting reconnect
> {noformat}
> And this for the second map attempt:
> {noformat}
> 2013-11-18 12:07:18,998 WARN org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> 2013-11-18 12:07:19,441 WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Source name ugi already exists!
> 2013-11-18 12:07:19,612 INFO org.apache.hadoop.mapred.Task:  Using ResourceCalculatorPlugin : null
> 2013-11-18 12:07:19,632 INFO org.apache.hadoop.mapred.MapTask: Processing split: 'org.apache.giraph.bsp.BspInputSplit, index=-1, num=-1
> 2013-11-18 12:07:19,692 INFO org.apache.giraph.graph.GraphTaskManager: setup: Log level remains at info
> 2013-11-18 12:07:20,445 INFO org.apache.giraph.graph.GraphTaskManager: Distributed cache is empty. Assuming fatjar.
> 2013-11-18 12:07:20,445 INFO org.apache.giraph.graph.GraphTaskManager: setup: classpath @ /Users/rvesse/Documents/Data/Hadoop/mapred/temp/taskTracker/rvesse/jobcache/job_201311181156_0003/jars/job.jar for job Giraph: org.apache.giraph.examples.SimpleShortestPathsComputation
> 2013-11-18 12:07:20,475 INFO org.apache.giraph.zk.ZooKeeperManager: createCandidateStamp: Made the directory _bsp/_defaultZkManagerDir/job_201311181156_0003
> 2013-11-18 12:07:20,478 INFO org.apache.giraph.zk.ZooKeeperManager: createCandidateStamp: Creating my filestamp _bsp/_defaultZkManagerDir/job_201311181156_0003/_task/mbp-rvesse.home 1
> 2013-11-18 12:07:20,489 INFO org.apache.giraph.zk.ZooKeeperManager: getZooKeeperServerList: For task 1, got file 'null' (polling period is 3000)
> 2013-11-18 12:07:23,491 INFO org.apache.giraph.zk.ZooKeeperManager: getZooKeeperServerList: For task 1, got file 'zkServerList_mbp-rvesse.home 0 ' (polling period is 3000)
> 2013-11-18 12:07:23,491 INFO org.apache.giraph.zk.ZooKeeperManager: getZooKeeperServerList: Found [mbp-rvesse.home, 0] 2 hosts in filename 'zkServerList_mbp-rvesse.home 0 '
> 2013-11-18 12:07:23,492 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperSErvers: Empty directory _bsp/_defaultZkManagerDir/job_201311181156_0003/_zkServer, waiting 3000 msecs.
> 2013-11-18 12:07:26,495 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Got [mbp-rvesse.home] 1 hosts from 1 ready servers when 1 required (polling period is 3000) on attempt 1
> 2013-11-18 12:07:26,496 INFO org.apache.giraph.graph.GraphTaskManager: setup: Starting up BspServiceWorker...
> 2013-11-18 12:07:26,583 INFO org.apache.giraph.bsp.BspService: BspService: Path to create to halt is /_hadoopBsp/job_201311181156_0003/_haltComputation
> 2013-11-18 12:07:26,583 INFO org.apache.giraph.bsp.BspService: BspService: Connecting to ZooKeeper with job job_201311181156_0003, 1 on mbp-rvesse.home:22181
> 2013-11-18 12:07:26,590 INFO org.apache.zookeeper.ZooKeeper: Client environment:zookeeper.version=3.3.3-1073969, built on 02/23/2011 22:27 GMT
> 2013-11-18 12:07:26,590 INFO org.apache.zookeeper.ZooKeeper: Client environment:host.name=mbp-rvesse.home
> 2013-11-18 12:07:26,590 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.version=1.6.0_65
> 2013-11-18 12:07:26,590 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.vendor=Apple Inc.
> 2013-11-18 12:07:26,590 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.home=/System/Library/Java/JavaVirtualMachines/1.6.0.jdk/Contents/Home
> 2013-11-18 12:07:26,595 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.class.path=/Users/rvesse/Documents/Data/Hadoop/mapred/temp/taskTracker/rvesse/jobcache/job_201311181156_0003/jars/classes:/Users/rvesse/Documents/Data/Hadoop/mapred/temp/taskTracker/rvesse/jobcache/job_201311181156_0003/jars:/Users/rvesse/Documents/Data/Hadoop/mapred/temp/taskTracker/rvesse/jobcache/job_201311181156_0003/attempt_201311181156_0003_m_000001_0/work:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../conf:/Library/Java/Home/lib/tools.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/..:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../hadoop-core-1.2.1.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/asm-3.2.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/aspectjrt-1.6.11.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/aspectjtools-1.6.11.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/commons-beanutils-1.7.0.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/commons-beanutils-core-1.8.0.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/commons-cli-1.2.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/commons-codec-1.4.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/commons-collections-3.2.1.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/commons-configuration-1.6.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/commons-daemon-1.0.1.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/commons-digester-1.8.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/commons-el-1.0.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/commons-httpclient-3.0.1.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/commons-io-2.1.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/commons-lang-2.4.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/commons-logging-1.1.1.jar:/Users/rvesse/Documents/Apps/hado
op-1.2.1/libexec/../lib/commons-logging-api-1.0.4.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/commons-math-2.1.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/commons-net-3.1.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/core-3.1.1.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/hadoop-capacity-scheduler-1.2.1.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/hadoop-fairscheduler-1.2.1.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/hadoop-thriftfs-1.2.1.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/hsqldb-1.8.0.10.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/jackson-core-asl-1.8.8.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/jackson-mapper-asl-1.8.8.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/jasper-compiler-5.5.12.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/jasper-runtime-5.5.12.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/jdeb-0.8.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/jersey-core-1.8.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/jersey-json-1.8.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/jersey-server-1.8.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/jets3t-0.6.1.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/jetty-6.1.26.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/jetty-util-6.1.26.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/jsch-0.1.42.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/junit-4.5.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/kfs-0.2.2.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/log4j-1.2.15.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/mockito-all-1.8.5.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/oro-2.0.8.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/servlet-api-2.5
-20081211.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/slf4j-api-1.4.3.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/slf4j-log4j12-1.4.3.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/xmlenc-0.52.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/jsp-2.1/jsp-2.1.jar:/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/jsp-2.1/jsp-api-2.1.jar
> 2013-11-18 12:07:26,595 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.library.path=/Users/rvesse/Documents/Apps/hadoop-1.2.1/libexec/../lib/native/Mac_OS_X-x86_64-64:/Users/rvesse/Documents/Data/Hadoop/mapred/temp/taskTracker/rvesse/jobcache/job_201311181156_0003/attempt_201311181156_0003_m_000001_0/work
> 2013-11-18 12:07:26,595 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/Users/rvesse/Documents/Data/Hadoop/mapred/temp/taskTracker/rvesse/jobcache/job_201311181156_0003/attempt_201311181156_0003_m_000001_0/work/tmp
> 2013-11-18 12:07:26,596 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.compiler=<NA>
> 2013-11-18 12:07:26,596 INFO org.apache.zookeeper.ZooKeeper: Client environment:os.name=Mac OS X
> 2013-11-18 12:07:26,596 INFO org.apache.zookeeper.ZooKeeper: Client environment:os.arch=x86_64
> 2013-11-18 12:07:26,596 INFO org.apache.zookeeper.ZooKeeper: Client environment:os.version=10.8.5
> 2013-11-18 12:07:26,596 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.name=rvesse
> 2013-11-18 12:07:26,596 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.home=/homes/
> 2013-11-18 12:07:26,596 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.dir=/Users/rvesse/Documents/Data/Hadoop/mapred/temp/taskTracker/rvesse/jobcache/job_201311181156_0003/attempt_201311181156_0003_m_000001_0/work
> 2013-11-18 12:07:26,598 INFO org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=mbp-rvesse.home:22181 sessionTimeout=60000 watcher=org.apache.giraph.worker.BspServiceWorker@14d964af
> 2013-11-18 12:07:26,609 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server mbp-rvesse.home/192.168.1.65:22181
> 2013-11-18 12:07:26,609 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to mbp-rvesse.home/192.168.1.65:22181, initiating session
> 2013-11-18 12:07:26,615 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server mbp-rvesse.home/192.168.1.65:22181, sessionid = 0x1426b1b9c7f0001, negotiated timeout = 600000
> 2013-11-18 12:07:26,616 INFO org.apache.giraph.bsp.BspService: process: Asynchronous connection complete.
> 2013-11-18 12:07:26,936 INFO org.apache.giraph.comm.netty.NettyServer: NettyServer: Using execution handler with 8 threads after requestFrameDecoder.
> 2013-11-18 12:07:26,968 INFO org.apache.giraph.comm.netty.NettyServer: start: Started server communication server: mbp-rvesse.home/192.168.1.65:30001 with up to 16 threads on bind attempt 0 with sendBufferSize = 32768 receiveBufferSize = 524288 backlog = 1
> 2013-11-18 12:07:26,976 INFO org.apache.giraph.comm.netty.NettyClient: NettyClient: Using execution handler with 8 threads after requestEncoder.
> 2013-11-18 12:07:27,162 INFO org.apache.giraph.graph.GraphTaskManager: setup: Registering health of this worker...
> 2013-11-18 12:07:27,421 INFO org.apache.giraph.bsp.BspService: getJobState: Job state already exists (/_hadoopBsp/job_201311181156_0003/_masterJobState)
> 2013-11-18 12:07:27,424 INFO org.apache.giraph.bsp.BspService: getApplicationAttempt: Node /_hadoopBsp/job_201311181156_0003/_applicationAttemptsDir already exists!
> 2013-11-18 12:07:27,426 INFO org.apache.giraph.bsp.BspService: getApplicationAttempt: Node /_hadoopBsp/job_201311181156_0003/_applicationAttemptsDir already exists!
> 2013-11-18 12:07:27,442 INFO org.apache.giraph.worker.BspServiceWorker: registerHealth: Created my health node for attempt=0, superstep=-1 with /_hadoopBsp/job_201311181156_0003/_applicationAttemptsDir/0/_superstepDir/-1/_workerHealthyDir/mbp-rvesse.home_1 and workerInfo= Worker(hostname=mbp-rvesse.home, MRtaskID=1, port=30001)
> 2013-11-18 12:07:27,842 INFO org.apache.zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x1426b1b9c7f0001, likely server has closed socket, closing socket connection and attempting reconnect
> 2013-11-18 12:07:27,944 WARN org.apache.giraph.bsp.BspService: process: Disconnected from ZooKeeper (will automatically try to recover) WatchedEvent state:Disconnected type:None path:null
> 2013-11-18 12:07:29,332 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server mbp-rvesse.home/192.168.1.65:22181
> 2013-11-18 12:07:29,333 WARN org.apache.zookeeper.ClientCnxn: Session 0x1426b1b9c7f0001 for server null, unexpected error, closing socket connection and attempting reconnect
> java.net.ConnectException: Connection refused
> 	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> 	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599)
> 	at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1119)
> 2013-11-18 12:07:29,443 WARN org.apache.giraph.zk.ZooKeeperExt: exists: Connection loss on attempt 0, waiting 5000 msecs before retrying.
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /_hadoopBsp/job_201311181156_0003/_applicationAttemptsDir/0/_superstepDir/-1/_addressesAndPartitions
> 	at org.apache.zookeeper.KeeperException.create(KeeperException.java:90)
> 	at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
> 	at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:809)
> 	at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:837)
> 	at org.apache.giraph.zk.ZooKeeperExt.exists(ZooKeeperExt.java:360)
> 	at org.apache.giraph.worker.BspServiceWorker.startSuperstep(BspServiceWorker.java:688)
> 	at org.apache.giraph.worker.BspServiceWorker.setup(BspServiceWorker.java:484)
> 	at org.apache.giraph.graph.GraphTaskManager.execute(GraphTaskManager.java:244)
> 	at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:91)
> 	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
> 	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
> 	at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:394)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
> 	at org.apache.hadoop.mapred.Child.main(Child.java:249)
> 2013-11-18 12:07:31,144 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server mbp-rvesse.home/192.168.1.65:22181
> 2013-11-18 12:07:31,145 WARN org.apache.zookeeper.ClientCnxn: Session 0x1426b1b9c7f0001 for server null, unexpected error, closing socket connection and attempting reconnect
> java.net.ConnectException: Connection refused
> 	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> 	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599)
> 	at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1119)
> 2013-11-18 12:07:33,003 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server mbp-rvesse.home/192.168.1.65:22181
> 2013-11-18 12:07:33,004 WARN org.apache.zookeeper.ClientCnxn: Session 0x1426b1b9c7f0001 for server null, unexpected error, closing socket connection and attempting reconnect
> java.net.ConnectException: Connection refused
> 	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> 	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599)
> 	at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1119)
> 2013-11-18 12:07:34,276 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server mbp-rvesse.home/192.168.1.65:22181
> 2013-11-18 12:07:34,277 WARN org.apache.zookeeper.ClientCnxn: Session 0x1426b1b9c7f0001 for server null, unexpected error, closing socket connection and attempting reconnect
> java.net.ConnectException: Connection refused
> 	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> 	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599)
> 	at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1119)
> 2013-11-18 12:07:35,980 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server mbp-rvesse.home/192.168.1.65:22181
> 2013-11-18 12:07:35,981 WARN org.apache.zookeeper.ClientCnxn: Session 0x1426b1b9c7f0001 for server null, unexpected error, closing socket connection and attempting reconnect
> java.net.ConnectException: Connection refused
> 	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> 	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599)
> 	at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1119)
> 2013-11-18 12:07:36,082 WARN org.apache.giraph.zk.ZooKeeperExt: exists: Connection loss on attempt 1, waiting 5000 msecs before retrying.
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /_hadoopBsp/job_201311181156_0003/_applicationAttemptsDir/0/_superstepDir/-1/_addressesAndPartitions
> 	at org.apache.zookeeper.KeeperException.create(KeeperException.java:90)
> 	at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
> 	at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:809)
> 	at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:837)
> 	at org.apache.giraph.zk.ZooKeeperExt.exists(ZooKeeperExt.java:360)
> 	at org.apache.giraph.worker.BspServiceWorker.startSuperstep(BspServiceWorker.java:688)
> 	at org.apache.giraph.worker.BspServiceWorker.setup(BspServiceWorker.java:484)
> 	at org.apache.giraph.graph.GraphTaskManager.execute(GraphTaskManager.java:244)
> 	at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:91)
> 	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
> 	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
> 	at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:394)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
> 	at org.apache.hadoop.mapred.Child.main(Child.java:249)
> 2013-11-18 12:07:37,345 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server mbp-rvesse.home/192.168.1.65:22181
> 2013-11-18 12:07:37,346 WARN org.apache.zookeeper.ClientCnxn: Session 0x1426b1b9c7f0001 for server null, unexpected error, closing socket connection and attempting reconnect
> java.net.ConnectException: Connection refused
> 	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> 	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599)
> 	at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1119)
> 2013-11-18 12:07:38,543 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server mbp-rvesse.home/192.168.1.65:22181
> 2013-11-18 12:07:38,544 WARN org.apache.zookeeper.ClientCnxn: Session 0x1426b1b9c7f0001 for server null, unexpected error, closing socket connection and attempting reconnect
> java.net.ConnectException: Connection refused
> 	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> 	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599)
> 	at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1119)
> 2013-11-18 12:07:40,141 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server mbp-rvesse.home/192.168.1.65:22181
> 2013-11-18 12:07:40,141 WARN org.apache.zookeeper.ClientCnxn: Session 0x1426b1b9c7f0001 for server null, unexpected error, closing socket connection and attempting reconnect
> java.net.ConnectException: Connection refused
> 	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> 	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599)
> 	at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1119)
> 2013-11-18 12:07:41,826 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server mbp-rvesse.home/192.168.1.65:22181
> 2013-11-18 12:07:41,827 WARN org.apache.zookeeper.ClientCnxn: Session 0x1426b1b9c7f0001 for server null, unexpected error, closing socket connection and attempting reconnect
> java.net.ConnectException: Connection refused
> 	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> 	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599)
> 	at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1119)
> 2013-11-18 12:07:41,928 WARN org.apache.giraph.zk.ZooKeeperExt: exists: Connection loss on attempt 2, waiting 5000 msecs before retrying.
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /_hadoopBsp/job_201311181156_0003/_applicationAttemptsDir/0/_superstepDir/-1/_addressesAndPartitions
> 	at org.apache.zookeeper.KeeperException.create(KeeperException.java:90)
> 	at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
> 	at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:809)
> 	at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:837)
> 	at org.apache.giraph.zk.ZooKeeperExt.exists(ZooKeeperExt.java:360)
> 	at org.apache.giraph.worker.BspServiceWorker.startSuperstep(BspServiceWorker.java:688)
> 	at org.apache.giraph.worker.BspServiceWorker.setup(BspServiceWorker.java:484)
> 	at org.apache.giraph.graph.GraphTaskManager.execute(GraphTaskManager.java:244)
> 	at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:91)
> 	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
> 	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
> 	at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:394)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
> 	at org.apache.hadoop.mapred.Child.main(Child.java:249)
> 2013-11-18 12:07:43,279 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server mbp-rvesse.home/192.168.1.65:22181
> 2013-11-18 12:07:43,280 WARN org.apache.zookeeper.ClientCnxn: Session 0x1426b1b9c7f0001 for server null, unexpected error, closing socket connection and attempting reconnect
> java.net.ConnectException: Connection refused
> 	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> 	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599)
> 	at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1119)
> 2013-11-18 12:07:44,513 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server mbp-rvesse.home/192.168.1.65:22181
> 2013-11-18 12:07:44,514 WARN org.apache.zookeeper.ClientCnxn: Session 0x1426b1b9c7f0001 for server null, unexpected error, closing socket connection and attempting reconnect
> java.net.ConnectException: Connection refused
> 	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> 	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599)
> 	at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1119)
> 2013-11-18 12:07:46,383 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server mbp-rvesse.home/192.168.1.65:22181
> 2013-11-18 12:07:46,384 WARN org.apache.zookeeper.ClientCnxn: Session 0x1426b1b9c7f0001 for server null, unexpected error, closing socket connection and attempting reconnect
> java.net.ConnectException: Connection refused
> 	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> 	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599)
> 	at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1119)
> 2013-11-18 12:07:46,929 ERROR org.apache.giraph.worker.BspServiceWorker: unregisterHealth: Got failure, unregistering health on /_hadoopBsp/job_201311181156_0003/_applicationAttemptsDir/0/_superstepDir/-1/_workerHealthyDir/mbp-rvesse.home_1 on superstep -1
> 2013-11-18 12:07:47,936 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server mbp-rvesse.home/192.168.1.65:22181
> 2013-11-18 12:07:47,936 WARN org.apache.zookeeper.ClientCnxn: Session 0x1426b1b9c7f0001 for server null, unexpected error, closing socket connection and attempting reconnect
> java.net.ConnectException: Connection refused
> 	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> 	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599)
> 	at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1119)
> 2013-11-18 12:07:48,037 WARN org.apache.giraph.zk.ZooKeeperExt: deleteExt: Connection loss on attempt 0, waiting 5000 msecs before retrying.
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /_hadoopBsp/job_201311181156_0003/_applicationAttemptsDir/0/_superstepDir/-1/_workerHealthyDir/mbp-rvesse.home_1
> 	at org.apache.zookeeper.KeeperException.create(KeeperException.java:90)
> 	at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
> 	at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:728)
> 	at org.apache.giraph.zk.ZooKeeperExt.deleteExt(ZooKeeperExt.java:302)
> 	at org.apache.giraph.worker.BspServiceWorker.unregisterHealth(BspServiceWorker.java:650)
> 	at org.apache.giraph.worker.BspServiceWorker.failureCleanup(BspServiceWorker.java:664)
> 	at org.apache.giraph.graph.GraphTaskManager.workerFailureCleanup(GraphTaskManager.java:894)
> 	at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:100)
> 	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
> 	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
> 	at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:394)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
> 	at org.apache.hadoop.mapred.Child.main(Child.java:249)
> 2013-11-18 12:07:49,210 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server mbp-rvesse.home/192.168.1.65:22181
> 2013-11-18 12:07:49,210 WARN org.apache.zookeeper.ClientCnxn: Session 0x1426b1b9c7f0001 for server null, unexpected error, closing socket connection and attempting reconnect
> java.net.ConnectException: Connection refused
> 	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> 	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599)
> 	at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1119)
> 2013-11-18 12:07:50,851 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server mbp-rvesse.home/192.168.1.65:22181
> 2013-11-18 12:07:50,852 WARN org.apache.zookeeper.ClientCnxn: Session 0x1426b1b9c7f0001 for server null, unexpected error, closing socket connection and attempting reconnect
> java.net.ConnectException: Connection refused
> 	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> 	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599)
> 	at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1119)
> 2013-11-18 12:07:52,704 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server mbp-rvesse.home/192.168.1.65:22181
> 2013-11-18 12:07:52,705 WARN org.apache.zookeeper.ClientCnxn: Session 0x1426b1b9c7f0001 for server null, unexpected error, closing socket connection and attempting reconnect
> java.net.ConnectException: Connection refused
> 	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> 	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599)
> 	at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1119)
> 2013-11-18 12:07:54,744 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server mbp-rvesse.home/192.168.1.65:22181
> 2013-11-18 12:07:54,744 WARN org.apache.zookeeper.ClientCnxn: Session 0x1426b1b9c7f0001 for server null, unexpected error, closing socket connection and attempting reconnect
> java.net.ConnectException: Connection refused
> 	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> 	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599)
> 	at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1119)
> 2013-11-18 12:07:54,846 WARN org.apache.giraph.zk.ZooKeeperExt: deleteExt: Connection loss on attempt 1, waiting 5000 msecs before retrying.
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /_hadoopBsp/job_201311181156_0003/_applicationAttemptsDir/0/_superstepDir/-1/_workerHealthyDir/mbp-rvesse.home_1
> 	at org.apache.zookeeper.KeeperException.create(KeeperException.java:90)
> 	at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
> 	at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:728)
> 	at org.apache.giraph.zk.ZooKeeperExt.deleteExt(ZooKeeperExt.java:302)
> 	at org.apache.giraph.worker.BspServiceWorker.unregisterHealth(BspServiceWorker.java:650)
> 	at org.apache.giraph.worker.BspServiceWorker.failureCleanup(BspServiceWorker.java:664)
> 	at org.apache.giraph.graph.GraphTaskManager.workerFailureCleanup(GraphTaskManager.java:894)
> 	at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:100)
> 	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
> 	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
> 	at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:394)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
> 	at org.apache.hadoop.mapred.Child.main(Child.java:249)
> 2013-11-18 12:07:56,259 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server mbp-rvesse.home/192.168.1.65:22181
> 2013-11-18 12:07:56,260 WARN org.apache.zookeeper.ClientCnxn: Session 0x1426b1b9c7f0001 for server null, unexpected error, closing socket connection and attempting reconnect
> java.net.ConnectException: Connection refused
> 	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> 	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599)
> 	at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1119)
> 2013-11-18 12:07:57,672 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server mbp-rvesse.home/192.168.1.65:22181
> 2013-11-18 12:07:57,673 WARN org.apache.zookeeper.ClientCnxn: Session 0x1426b1b9c7f0001 for server null, unexpected error, closing socket connection and attempting reconnect
> java.net.ConnectException: Connection refused
> 	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> 	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599)
> 	at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1119)
> 2013-11-18 12:07:59,265 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server mbp-rvesse.home/192.168.1.65:22181
> 2013-11-18 12:07:59,266 WARN org.apache.zookeeper.ClientCnxn: Session 0x1426b1b9c7f0001 for server null, unexpected error, closing socket connection and attempting reconnect
> java.net.ConnectException: Connection refused
> 	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> 	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599)
> 	at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1119)
> 2013-11-18 12:08:01,207 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server mbp-rvesse.home/192.168.1.65:22181
> 2013-11-18 12:08:01,207 WARN org.apache.zookeeper.ClientCnxn: Session 0x1426b1b9c7f0001 for server null, unexpected error, closing socket connection and attempting reconnect
> java.net.ConnectException: Connection refused
> 	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> 	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599)
> 	at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1119)
> 2013-11-18 12:08:01,309 WARN org.apache.giraph.zk.ZooKeeperExt: deleteExt: Connection loss on attempt 2, waiting 5000 msecs before retrying.
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /_hadoopBsp/job_201311181156_0003/_applicationAttemptsDir/0/_superstepDir/-1/_workerHealthyDir/mbp-rvesse.home_1
> 	at org.apache.zookeeper.KeeperException.create(KeeperException.java:90)
> 	at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
> 	at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:728)
> 	at org.apache.giraph.zk.ZooKeeperExt.deleteExt(ZooKeeperExt.java:302)
> 	at org.apache.giraph.worker.BspServiceWorker.unregisterHealth(BspServiceWorker.java:650)
> 	at org.apache.giraph.worker.BspServiceWorker.failureCleanup(BspServiceWorker.java:664)
> 	at org.apache.giraph.graph.GraphTaskManager.workerFailureCleanup(GraphTaskManager.java:894)
> 	at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:100)
> 	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
> 	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
> 	at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:394)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
> 	at org.apache.hadoop.mapred.Child.main(Child.java:249)
> 2013-11-18 12:08:02,441 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server mbp-rvesse.home/192.168.1.65:22181
> 2013-11-18 12:08:02,441 WARN org.apache.zookeeper.ClientCnxn: Session 0x1426b1b9c7f0001 for server null, unexpected error, closing socket connection and attempting reconnect
> java.net.ConnectException: Connection refused
> 	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> 	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599)
> 	at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1119)
> 2013-11-18 12:08:04,086 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server mbp-rvesse.home/192.168.1.65:22181
> 2013-11-18 12:08:04,087 WARN org.apache.zookeeper.ClientCnxn: Session 0x1426b1b9c7f0001 for server null, unexpected error, closing socket connection and attempting reconnect
> java.net.ConnectException: Connection refused
> 	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> 	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599)
> 	at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1119)
> 2013-11-18 12:08:05,945 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server mbp-rvesse.home/192.168.1.65:22181
> 2013-11-18 12:08:05,945 WARN org.apache.zookeeper.ClientCnxn: Session 0x1426b1b9c7f0001 for server null, unexpected error, closing socket connection and attempting reconnect
> java.net.ConnectException: Connection refused
> 	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> 	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599)
> 	at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1119)
> 2013-11-18 12:08:06,310 ERROR org.apache.giraph.graph.GraphTaskManager: run: Worker failure failed on another RuntimeException, original expection will be rethrown
> java.lang.IllegalStateException: deleteExt: Failed to delete /_hadoopBsp/job_201311181156_0003/_applicationAttemptsDir/0/_superstepDir/-1/_workerHealthyDir/mbp-rvesse.home_1 after 3 tries!
> 	at org.apache.giraph.zk.ZooKeeperExt.deleteExt(ZooKeeperExt.java:333)
> 	at org.apache.giraph.worker.BspServiceWorker.unregisterHealth(BspServiceWorker.java:650)
> 	at org.apache.giraph.worker.BspServiceWorker.failureCleanup(BspServiceWorker.java:664)
> 	at org.apache.giraph.graph.GraphTaskManager.workerFailureCleanup(GraphTaskManager.java:894)
> 	at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:100)
> 	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
> 	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
> 	at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:394)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
> 	at org.apache.hadoop.mapred.Child.main(Child.java:249)
> 2013-11-18 12:08:06,313 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
> 2013-11-18 12:08:06,357 WARN org.apache.hadoop.mapred.Child: Error running child
> java.lang.IllegalStateException: run: Caught an unrecoverable exception exists: Failed to check /_hadoopBsp/job_201311181156_0003/_applicationAttemptsDir/0/_superstepDir/-1/_addressesAndPartitions after 3 tries!
> 	at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:101)
> 	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
> 	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
> 	at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:394)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
> 	at org.apache.hadoop.mapred.Child.main(Child.java:249)
> Caused by: java.lang.IllegalStateException: exists: Failed to check /_hadoopBsp/job_201311181156_0003/_applicationAttemptsDir/0/_superstepDir/-1/_addressesAndPartitions after 3 tries!
> 	at org.apache.giraph.zk.ZooKeeperExt.exists(ZooKeeperExt.java:369)
> 	at org.apache.giraph.worker.BspServiceWorker.startSuperstep(BspServiceWorker.java:688)
> 	at org.apache.giraph.worker.BspServiceWorker.setup(BspServiceWorker.java:484)
> 	at org.apache.giraph.graph.GraphTaskManager.execute(GraphTaskManager.java:244)
> 	at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:91)
> 	... 7 more
> 2013-11-18 12:08:06,361 INFO org.apache.hadoop.mapred.Task: Runnning cleanup for the task
> {noformat}
> Eventually the job times out and Hadoop kills it, but I would expect the job to fail fast (preferably before it is ever launched) if the input does not exist.
> I'll attach full log files for reference.
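
The fail-fast behaviour requested above can be illustrated with a small pre-flight check. This is only a sketch and not the GIRAPH-797 patch: the class name `InputPathCheck` and the method `requireInputExists` are hypothetical, and it checks the local filesystem through `java.nio.file` so that it runs without Hadoop on the classpath. For a path on HDFS, such as the one passed to `-vip`, the same test would presumably go through Hadoop's `FileSystem.exists(Path)` instead.

```java
import java.io.FileNotFoundException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical pre-flight check; not part of Giraph or of the GIRAPH-797 patch. */
final class InputPathCheck {

    private InputPathCheck() { }

    /**
     * Throws immediately if the given input path does not exist, so the
     * caller can abort before submitting any job.
     * Assumption: a local filesystem path stands in for the HDFS path.
     */
    static void requireInputExists(String inputPath) throws FileNotFoundException {
        Path path = Paths.get(inputPath);
        if (!Files.exists(path)) {
            throw new FileNotFoundException(
                "Vertex input path does not exist: " + inputPath);
        }
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: InputPathCheck <vertex-input-path>");
            System.exit(2);
        }
        try {
            requireInputExists(args[0]);
            System.out.println("Input found, safe to submit: " + args[0]);
        } catch (FileNotFoundException e) {
            // Fail fast: report the problem and exit non-zero
            // instead of launching a job that can only time out.
            System.err.println(e.getMessage());
            System.exit(1);
        }
    }
}
```

Running a check of this kind on the client, before the job is submitted, would report the missing input within milliseconds rather than leaving the workers to retry against ZooKeeper until the task timeout expires.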



--
This message was sent by Atlassian JIRA
(v6.1#6144)
