spark-dev mailing list archives

From Larry Xiao <xia...@sjtu.edu.cn>
Subject GraphX graph partitioning strategy
Date Thu, 24 Jul 2014 06:59:26 GMT
Hi all,

I'm implementing graph partitioning strategies for GraphX, drawing on 
research in graph computing.

I have two questions:

- a specific implementation question:
In the current design, only the vertex IDs of src and dst are passed to 
the strategy (PartitionStrategy.scala).
Some strategies need more knowledge about the graph (such as vertex 
degrees) and may take more than one pass to produce the final partition ID.
So I'm changing the PartitionStrategy.getPartition API to provide more 
info, but I don't want to make it complex (the current one looks very 
clean).

- an open question:
What advice would you give on partitioning, given the way Spark 
processes graphs?
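
To make the first question concrete, here is a minimal sketch of one 
possible shape for a degree-aware API. It is plain Scala with no Spark 
dependency. DegreeAwarePartitionStrategy, LowDegreeEndpointHash, degrees 
and assign are hypothetical names, not existing GraphX code, and the 
heuristic (hash the lower-degree endpoint) is only an illustration, not 
the algorithm from either paper listed under the references.

```scala
object DegreeAwarePartitioningSketch {
  // Local stand-ins for GraphX's VertexId (Long) and PartitionID (Int) aliases.
  type VertexId = Long
  type PartitionID = Int

  // Hypothetical extension of getPartition: same arguments as today,
  // plus the degree of each endpoint.
  trait DegreeAwarePartitionStrategy extends Serializable {
    def getPartition(src: VertexId, dst: VertexId,
                     srcDegree: Int, dstDegree: Int,
                     numParts: PartitionID): PartitionID
  }

  // Illustrative heuristic (an assumption, not from the papers):
  // place each edge by hashing its lower-degree endpoint, so the edges
  // of a high-degree vertex are spread over several partitions.
  object LowDegreeEndpointHash extends DegreeAwarePartitionStrategy {
    override def getPartition(src: VertexId, dst: VertexId,
                              srcDegree: Int, dstDegree: Int,
                              numParts: PartitionID): PartitionID = {
      val anchor = if (srcDegree <= dstDegree) src else dst
      // Large odd multiplier to scatter consecutive vertex IDs.
      val mixer = 1125899906842597L
      val n = numParts.toLong
      // Non-negative modulo, safe even if the product overflows.
      ((((anchor * mixer) % n) + n) % n).toInt
    }
  }

  // Pass 1: count how many edge endpoints touch each vertex
  // (a self-loop counts twice).
  def degrees(edges: Seq[(VertexId, VertexId)]): Map[VertexId, Int] =
    edges
      .flatMap { case (s, d) => Seq(s, d) }
      .groupBy(identity)
      .map { case (v, occurrences) => v -> occurrences.size }

  // Pass 2: look up both degrees and ask the strategy for a partition.
  def assign(edges: Seq[(VertexId, VertexId)],
             strategy: DegreeAwarePartitionStrategy,
             numParts: PartitionID): Seq[((VertexId, VertexId), PartitionID)] = {
    val deg = degrees(edges)
    edges.map { case (s, d) =>
      ((s, d), strategy.getPartition(s, d, deg(s), deg(d), numParts))
    }
  }

  def main(args: Array[String]): Unit = {
    // Star graph: vertex 0 is a hub connected to vertices 1..6.
    val edges = (1L to 6L).map(v => (0L, v))
    assign(edges, LowDegreeEndpointHash, 3).foreach {
      case ((s, d), p) => println(s"edge ($s, $d) -> partition $p")
    }
  }
}
```

In a real GraphX job the first pass would presumably come from 
graph.degrees joined onto the edges rather than an in-memory Map. The 
sketch only shows that the extra information can be added as two more 
arguments while getPartition stays a pure function.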

Any advice is much appreciated.

Best Regards,
Larry Xiao

References

- Bipartite-oriented Distributed Graph Partitioning for Big Learning
- PowerLyra: Differentiated Graph Computation and Partitioning on Skewed 
  Graphs
