giraph-user mailing list archives

From RainShine <rainshin...@googlemail.com>
Subject Re: Giraph job can not finish last superstep
Date Sat, 25 Oct 2014 22:37:54 GMT
  
  

Hello again,


It is sad not to find any solution to this problem. I have already applied several patches that
at least looked promising, including the ones from GIRAPH-806 (https://issues.apache.org/jira/browse/GIRAPH-806).
I also tried the version from the git repository and found that functionality such
as "getCurrentSuperstep()", which I use in all my algorithms and vertex implementations,
is missing there.


I also tried compiling with different Maven profiles, as the Hadoop version we use in our
cluster (1.1.1) is never directly listed as compatible.


The funny thing is that the algorithm works fine when I reduce the input size and do not use
out-of-core graphs.


I would love to use Giraph, but this issue is consuming an enormous amount of time. So if anybody
knows something that could help, I would really appreciate it.


Best regards,
Frank



> On Oct 23, 2014, at 2:02 PM, RainShine79 <rainshine79@googlemail.com> wrote:
> 
> 
> Hello all,
> 
> 
> I have a Giraph job which seems to execute successfully: in the logs and on the Hadoop
> web interface I can see that all supersteps are executed successfully. The only problem is
> that the output does not seem to get written to HDFS.
> 
> 
> As far as I know from reading prior postings on this mailing list, there is some problem with
> a) the out-of-core feature, which I need in order to load all the data,
> and
> b) the output of the results to HDFS.
> 
> 
> I currently use the latest stable version 1.0.0.
> 
> 
> Here is the log of one exemplary worker:
> 2014-10-23 13:42:10,107 INFO org.apache.giraph.comm.SendPartitionCache: SendPartitionCache: maxEdgesPerTransfer = 80000
> 2014-10-23 13:42:10,108 INFO org.apache.giraph.partition.DiskBackedPartitionStore: offloadPartition: writing partition vertices 56 to /user/bmacek/_giraph/partitions/job_201410130927_0282/partition-56_vertices
> 2014-10-23 13:42:10,270 INFO org.apache.giraph.partition.DiskBackedPartitionStore: offloadPartition: writing partition vertices 0 to /user/bmacek/_giraph/partitions/job_201410130927_0282/partition-0_vertices
> 2014-10-23 13:42:10,435 INFO org.apache.giraph.partition.DiskBackedPartitionStore: offloadPartition: writing partition vertices 16 to /user/bmacek/_giraph/partitions/job_201410130927_0282/partition-16_vertices
> 2014-10-23 13:42:10,600 INFO org.apache.giraph.partition.DiskBackedPartitionStore: offloadPartition: writing partition vertices 32 to /user/bmacek/_giraph/partitions/job_201410130927_0282/partition-32_vertices
> 2014-10-23 13:42:10,761 INFO org.apache.giraph.partition.DiskBackedPartitionStore: offloadPartition: writing partition vertices 48 to /user/bmacek/_giraph/partitions/job_201410130927_0282/partition-48_vertices
> 2014-10-23 13:42:10,927 INFO org.apache.giraph.partition.DiskBackedPartitionStore: offloadPartition: writing partition vertices 8 to /user/bmacek/_giraph/partitions/job_201410130927_0282/partition-8_vertices
> 2014-10-23 13:42:11,245 INFO org.apache.giraph.partition.DiskBackedPartitionStore: offloadPartition: writing partition vertices 24 to /user/bmacek/_giraph/partitions/job_201410130927_0282/partition-24_vertices
> 2014-10-23 13:42:11,432 INFO org.apache.giraph.partition.DiskBackedPartitionStore: offloadPartition: writing partition vertices 40 to /user/bmacek/_giraph/partitions/job_201410130927_0282/partition-40_vertices
> 2014-10-23 13:42:11,619 INFO org.apache.giraph.graph.ComputeCallable: call: Computation took 1.5131937 secs for 8 partitions on superstep 2.  Flushing started
> 2014-10-23 13:42:11,620 INFO org.apache.giraph.worker.BspServiceWorker: finishSuperstep: Waiting on all requests, superstep 2 Memory (free/total/max) = 1107.35M / 1358.25M / 9344.00M
> 2014-10-23 13:42:11,621 INFO org.apache.giraph.comm.netty.NettyClient: waitAllRequests: Finished all requests. MBytes/sec sent = 0.0005, MBytes/sec received = 0.0001, MBytesSent = 0.0007, MBytesReceived = 0.0001, ave sent req MBytes = 0.0001, ave received req MBytes = 0, secs waited = 1.519
> 2014-10-23 13:42:11,621 INFO org.apache.giraph.worker.WorkerAggregatorHandler: finishSuperstep: Start gathering aggregators, workers will send their aggregated values once they are done with superstep computation
> 2014-10-23 13:42:11,834 INFO org.apache.giraph.comm.netty.NettyClient: waitAllRequests: Finished all requests. MBytes/sec sent = 0.0119, MBytes/sec received = 0.0062, MBytesSent = 0, MBytesReceived = 0, ave sent req MBytes = 0, ave received req MBytes = 0, secs waited = 0.002
> 2014-10-23 13:42:11,834 INFO org.apache.giraph.worker.BspServiceWorker: finishSuperstep: Superstep 2, messages = 0 Memory (free/total/max) = 1105.09M / 1358.25M / 9344.00M
> 2014-10-23 13:42:11,869 INFO org.apache.giraph.worker.BspServiceWorker: finishSuperstep: (waiting for rest of workers) WORKER_ONLY - Attempt=0, Superstep=2
> 2014-10-23 13:42:11,887 INFO org.apache.giraph.bsp.BspService: process: superstepFinished signaled
> 2014-10-23 13:42:11,895 INFO org.apache.giraph.worker.BspServiceWorker: finishSuperstep: Completed superstep 2 with global stats (vtx=538312,finVtx=0,edges=35261,msgCount=35261,haltComputation=true)
> 2014-10-23 13:42:11,895 INFO org.apache.giraph.graph.GraphTaskManager: execute: BSP application done (global vertices marked done)
> 2014-10-23 13:42:11,896 INFO org.apache.giraph.graph.GraphTaskManager: cleanup: Starting for WORKER_ONLY
> 2014-10-23 13:42:11,903 INFO org.apache.giraph.comm.netty.NettyClient: stop: reached wait threshold, 8 connections closed, releasing NettyClient.bootstrap resources now.
> 2014-10-23 13:42:11,905 INFO org.apache.giraph.worker.BspServiceWorker: saveVertices: Starting to save 66998 vertices using 1 threads
> 2014-10-23 13:42:11,987 WARN org.apache.giraph.bsp.BspService: process: Unknown and unprocessed event (path=/_hadoopBsp/job_201410130927_0282/_applicationAttemptsDir/0/_superstepDir/1/_addressesAndPartitions, type=NodeDeleted, state=SyncConnected)
> 2014-10-23 13:42:11,994 INFO org.apache.giraph.partition.DiskBackedPartitionStore: offloadPartition: writing partition vertices 56 to /user/bmacek/_giraph/partitions/job_201410130927_0282/partition-56_vertices
> 2014-10-23 13:42:12,003 INFO org.apache.giraph.worker.BspServiceWorker: processEvent : partitionExchangeChildrenChanged (at least one worker is done sending partitions)
> 2014-10-23 13:42:12,128 WARN org.apache.giraph.bsp.BspService: process: Unknown and unprocessed event (path=/_hadoopBsp/job_201410130927_0282/_applicationAttemptsDir/0/_superstepDir/1/_superstepFinished, type=NodeDeleted, state=SyncConnected)
> 2014-10-23 13:42:12,229 INFO org.apache.giraph.worker.BspServiceWorker: processEvent: Job state changed, checking to see if it needs to restart
> 2014-10-23 13:42:12,245 INFO org.apache.giraph.bsp.BspService: getJobState: Job state already exists (/_hadoopBsp/job_201410130927_0282/_masterJobState)
> 2014-10-23 13:43:11,907 INFO org.apache.giraph.utils.ProgressableUtils: waitFor: Future result not ready yet java.util.concurrent.FutureTask@3c9c7728
> 2014-10-23 13:43:11,907 INFO org.apache.giraph.utils.ProgressableUtils: waitFor: Waiting for org.apache.giraph.utils.ProgressableUtils$FutureWaitable@25b43d0d
> 2014-10-23 13:44:11,907 INFO org.apache.giraph.utils.ProgressableUtils: waitFor: Future result not ready yet java.util.concurrent.FutureTask@3c9c7728
> 2014-10-23 13:44:11,908 INFO org.apache.giraph.utils.ProgressableUtils: waitFor: Waiting for org.apache.giraph.utils.ProgressableUtils$FutureWaitable@25b43d0d
> 2014-10-23 13:45:11,908 INFO org.apache.giraph.utils.ProgressableUtils: waitFor: Future result not ready yet java.util.concurrent.FutureTask@3c9c7728
> 
> 
> … this continues forever. 
> 
> 
> 
> Is there a patch I can use to fix the issue, or do I have to work on the current trunk?
> In case I have to use the most recent sources: what are the new interfaces (abstract classes)
> called that I need to implement (extend)?
> 
> 
> 
> Thanks for your help in advance,
> Frank
> 
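
For anyone comparing several worker logs for the same symptom, the following is a minimal sketch (not part of Giraph) that scans a log for the repeating "Future result not ready yet" message seen in the quoted log and reports the last other message logged before the waiting began. It assumes each log entry sits on a single line in the timestamp/level/logger layout shown in the quoted log; the function names `parse` and `find_stall` are made up for this illustration.

```python
import re
from datetime import datetime

# Timestamp, level, logger and message of one log line, in the layout
# used by the worker log quoted in this thread.
LINE_RE = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),(\d{3}) (\w+) (\S+): (.*)"
)
# Text of the repeating wait message seen in the quoted log.
STALL_MARKER = "Future result not ready yet"


def parse(line):
    """Return (timestamp, logger, message), or None if the line does not match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    ts = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S")
    ts = ts.replace(microsecond=int(m.group(2)) * 1000)
    return ts, m.group(4), m.group(5)


def find_stall(lines):
    """Return (last_progress_message, seconds_between_first_and_last_wait, wait_count)."""
    last_progress = None
    first_wait = last_wait = None
    waits = 0
    for line in lines:
        parsed = parse(line)
        if parsed is None:
            continue
        ts, logger, msg = parsed
        if STALL_MARKER in msg:
            waits += 1
            first_wait = first_wait or ts
            last_wait = ts
        elif "ProgressableUtils" not in logger:
            # Any message from another logger counts as progress and
            # resets the current wait window.
            last_progress = msg
            first_wait = last_wait = None
            waits = 0
    waited = (last_wait - first_wait).total_seconds() if first_wait else 0.0
    return last_progress, waited, waits
```

A possible use is `find_stall(open(path))` on each worker's log file, where `path` is wherever the task log happens to be stored. A large wait count with no later progress message suggests the worker is stuck in the same place as in the log above, where the wait messages begin after saveVertices has started.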
