giraph-user mailing list archives

From Avery Ching <ach...@apache.org>
Subject Re: DataStreamer Exception - LeaseExpiredException
Date Sat, 11 Jan 2014 01:45:33 GMT
This looks more like the ZooKeeper/YARN issues mentioned in the past.
Unfortunately, I do not have a YARN instance to test this with.  Does 
anyone else have any insights here?

On 1/10/14 1:48 PM, Kristen Hardwick wrote:
> Hi all, I'm requesting help again! I'm trying to get this 
> SimpleShortestPathsComputation example working, but I'm stuck again. 
> Now the job begins to run and seems to work until the final step (it 
> performs 3 supersteps), but the overall job is failing.
>
> In the master, among other things, I see:
>
> ...
> 14/01/10 15:04:17 INFO master.MasterThread: setup: Took 0.87 seconds.
> 14/01/10 15:04:17 INFO master.MasterThread: input superstep: Took 
> 0.708 seconds.
> 14/01/10 15:04:17 INFO master.MasterThread: superstep 0: Took 0.158 
> seconds.
> 14/01/10 15:04:17 INFO master.MasterThread: superstep 1: Took 0.344 
> seconds.
> 14/01/10 15:04:17 INFO master.MasterThread: superstep 2: Took 0.064 
> seconds.
> 14/01/10 15:04:17 INFO master.MasterThread: shutdown: Took 0.162 seconds.
> 14/01/10 15:04:17 INFO master.MasterThread: total: Took 2.31 seconds.
> 14/01/10 15:04:17 INFO yarn.GiraphYarnTask: Master is ready to commit 
> final job output data.
> 14/01/10 15:04:18 INFO yarn.GiraphYarnTask: Master has committed the 
> final job output data.
> ...
>
> To me, that looks promising - like the job was successful. However, in 
> the WORKER_ONLY containers, I see these things:
>
> ...
> 14/01/10 15:04:17 INFO graph.GraphTaskManager: cleanup: Starting for 
> WORKER_ONLY
> 14/01/10 15:04:17 WARN bsp.BspService: process: Unknown and
> unprocessed event
> (path=/_hadoopBsp/giraph_yarn_application_1389300168420_0024/_applicationAttemptsDir/0/_superstepDir/1/_addressesAndPartitions,
> type=NodeDeleted, state=SyncConnected)
> 14/01/10 15:04:17 INFO worker.BspServiceWorker: processEvent : 
> partitionExchangeChildrenChanged (at least one worker is done sending 
> partitions)
> 14/01/10 15:04:17 WARN bsp.BspService: process: Unknown and
> unprocessed event
> (path=/_hadoopBsp/giraph_yarn_application_1389300168420_0024/_applicationAttemptsDir/0/_superstepDir/1/_superstepFinished,
> type=NodeDeleted, state=SyncConnected)
> 14/01/10 15:04:17 INFO netty.NettyClient: stop: reached wait 
> threshold, 1 connections closed, releasing NettyClient.bootstrap 
> resources now.
> 14/01/10 15:04:17 INFO worker.BspServiceWorker: processEvent: Job 
> state changed, checking to see if it needs to restart
> 14/01/10 15:04:17 INFO bsp.BspService: getJobState: Job state already 
> exists 
> (/_hadoopBsp/giraph_yarn_application_1389300168420_0024/_masterJobState)
> 14/01/10 15:04:17 INFO yarn.GiraphYarnTask: [STATUS: task-1] 
> saveVertices: Starting to save 2 vertices using 1 threads
> 14/01/10 15:04:17 INFO worker.BspServiceWorker: saveVertices: Starting 
> to save 2 vertices using 1 threads
> 14/01/10 15:04:17 INFO worker.BspServiceWorker: processEvent: Job 
> state changed, checking to see if it needs to restart
> 14/01/10 15:04:17 INFO bsp.BspService: getJobState: Job state already 
> exists 
> (/_hadoopBsp/giraph_yarn_application_1389300168420_0024/_masterJobState)
> 14/01/10 15:04:17 INFO bsp.BspService: getJobState: Job state path is 
> empty! - 
> /_hadoopBsp/giraph_yarn_application_1389300168420_0024/_masterJobState
> 14/01/10 15:04:17 ERROR zookeeper.ClientCnxn: Error while calling watcher
> java.lang.NullPointerException
>         at java.io.StringReader.<init>(StringReader.java:50)
>         at org.json.JSONTokener.<init>(JSONTokener.java:66)
>         at org.json.JSONObject.<init>(JSONObject.java:402)
>         at 
> org.apache.giraph.bsp.BspService.getJobState(BspService.java:716)
>         at 
> org.apache.giraph.worker.BspServiceWorker.processEvent(BspServiceWorker.java:1563)
>         at org.apache.giraph.bsp.BspService.process(BspService.java:1095)
>         at 
> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
>         at 
> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
> 14/01/10 15:04:17 WARN bsp.BspService: process: Unknown and
> unprocessed event
> (path=/_hadoopBsp/giraph_yarn_application_1389300168420_0024/_vertexInputSplitsAllReady,
> type=NodeDeleted, state=SyncConnected)
> 14/01/10 15:04:17 WARN bsp.BspService: process: Unknown and
> unprocessed event
> (path=/_hadoopBsp/giraph_yarn_application_1389300168420_0024/_applicationAttemptsDir/0/_superstepDir/2/_addressesAndPartitions,
> type=NodeDeleted, state=SyncConnected)
> 14/01/10 15:04:17 INFO worker.BspServiceWorker: processEvent : 
> partitionExchangeChildrenChanged (at least one worker is done sending 
> partitions)
> 14/01/10 15:04:17 WARN bsp.BspService: process: Unknown and
> unprocessed event
> (path=/_hadoopBsp/giraph_yarn_application_1389300168420_0024/_applicationAttemptsDir/0/_superstepDir/2/_superstepFinished,
> type=NodeDeleted, state=SyncConnected)
> ...
> 14/01/10 15:04:17 WARN hdfs.DFSClient: DataStreamer Exception
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException):
> No lease on
> /user/spry/Shortest/_temporary/1/_temporary/attempt_1389300168420_0024_m_000001_1/part-m-00001:
> File does not exist. Holder DFSClient_NONMAPREDUCE_-643344145_1 does
> not have any open files.
>         at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2755)
>         at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.analyzeFileState(FSNamesystem.java:2567)
>         at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2480)
>         at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:555)
>         at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:387)
> ...
>
> I apologize for the wall of error messages, but I tried to leave in at
> least some of the parts that might be useful. I put the entire YARN
> log here: http://tny.cz/af229738
>
> Has anyone ever seen this before? This is the command I'm using to run:
>
> hadoop jar 
> giraph-core/target/giraph-1.1.0-SNAPSHOT-for-hadoop-2.2.0-jar-with-dependencies.jar 
> org.apache.giraph.GiraphRunner -Dgiraph.SplitMasterWorker=false 
> -Dgiraph.zkList="localhost:2181" -Dgiraph.zkSessionMsecTimeout=600000 
> -Dgiraph.useInputSplitLocality=false 
> org.apache.giraph.examples.SimpleShortestPathsComputation -vif 
> org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat -vip 
> /user/spry/input -vof 
> org.apache.giraph.io.formats.IdWithValueTextOutputFormat -op 
> /user/spry/Shortest -w 1
>
> My setup is still the same as in my other email, if you saw it:
>
> I compiled Giraph with this command, and everything built successfully
> except "Apache Giraph Distribution", which I don't seem to need:
>
> mvn -Phadoop_yarn -Dhadoop.version=2.2.0 -DskipTests clean package
>
> I am running with the following components:
>
> Single node cluster
> Giraph 1.1
> Hadoop 2.2.0 (Hortonworks)
> Java 1.7.0_45
>
> Thanks in advance,
> -Kristen Hardwick
>

