spark-user mailing list archives

From Shreya Agarwal <shrey...@microsoft.com>
Subject RE: Strongly Connected Components
Date Fri, 11 Nov 2016 04:15:51 GMT
Yesterday's run died sometime during the night, without any errors. Today, I am running it
using GraphFrames instead. It is still spawning new tasks, so there is progress.

From: Felix Cheung [mailto:felixcheung_m@hotmail.com]
Sent: Thursday, November 10, 2016 7:50 PM
To: user@spark.apache.org; Shreya Agarwal <shreyagr@microsoft.com>
Subject: Re: Strongly Connected Components

It is possible it is dead. Could you check the Spark UI to see if there is any progress?

_____________________________
From: Shreya Agarwal <shreyagr@microsoft.com>
Sent: Thursday, November 10, 2016 12:45 AM
Subject: RE: Strongly Connected Components
To: <user@spark.apache.org>



Bump. Anyone? It's been running for 10 hours now. No results.

From: Shreya Agarwal
Sent: Tuesday, November 8, 2016 9:05 PM
To: user@spark.apache.org
Subject: Strongly Connected Components

Hi,

I am running this on a graph with >5B edges and >3B edges and have 2 questions -


  1.  What is the optimal number of iterations?
  2.  I am running it for 1 iteration right now on a beefy 100-node cluster, with 300 executors,
each having 30 GB RAM and 5 cores. I have persisted the graph to MEMORY_AND_DISK, and it has
been running for 3 hours already. Any ideas on how to speed this up?

Regards,
Shreya
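
On question 1, it may help to see what an "iteration" does in a coloring-style SCC computation. The following is a small single-machine sketch in plain Python, not the GraphX or GraphFrames implementation. The function name scc_coloring and the max_iter parameter are hypothetical stand-ins for the iteration cap. Each outer iteration trims trivial vertices, propagates the minimum vertex id forward, and then walks backward from each root to claim one component per color. If the cap is too low, some vertices are left unresolved, so a single iteration is only guaranteed to be enough for very simple graphs.

```python
from collections import defaultdict

def scc_coloring(edges, max_iter):
    """Return (component, unresolved) after at most max_iter outer iterations.

    component maps each resolved vertex to the smallest id in its SCC;
    unresolved is the set of vertices not yet assigned when the cap is hit.
    """
    active = {v for edge in edges for v in edge}
    component = {}
    for _ in range(max_iter):
        # Step 1: trim vertices with no incoming or no outgoing live edge;
        # each of them is a singleton SCC.
        while True:
            live = [(s, d) for s, d in edges
                    if s in active and d in active and s != d]
            has_out = {s for s, _ in live}
            has_in = {d for _, d in live}
            trivial = {v for v in active
                       if v not in has_out or v not in has_in}
            if not trivial:
                break
            for v in trivial:
                component[v] = v
            active -= trivial
        if not active:
            break
        # Step 2: forward coloring. Each vertex ends up with the smallest
        # id among the vertices that can reach it.
        color = {v: v for v in active}
        changed = True
        while changed:
            changed = False
            for s, d in live:
                if color[s] < color[d]:
                    color[d] = color[s]
                    changed = True
        # Step 3: backward pass. From each root (color equals its own id),
        # follow reversed edges inside the same color; everything reached
        # is in the root's SCC.
        rev = defaultdict(list)
        for s, d in live:
            if color[s] == color[d]:
                rev[d].append(s)
        for root in [v for v in active if color[v] == v]:
            seen, stack = {root}, [root]
            while stack:
                v = stack.pop()
                for u in rev[v]:
                    if u not in seen:
                        seen.add(u)
                        stack.append(u)
            for v in seen:
                component[v] = root
            active -= seen
    return component, active
```

For example, on the edge list [(1, 2), (2, 3), (3, 1), (3, 4), (4, 5), (5, 4)], a cap of 1 resolves only the cycle 1-2-3 and leaves vertices 4 and 5 unresolved, while a cap of 2 resolves both components. How many iterations a real graph needs depends on its structure, so a cap that is generous and an early exit once nothing is left to resolve seems the safer choice.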

