giraph-user mailing list archives

From Yuanyuan Tian <>
Subject Question about range partitioner and data locality
Date Thu, 24 May 2012 00:36:09 GMT

I want to use a better partitioning of the input graph for my algorithm 
running on Giraph. To that end, I partitioned my input graph and relabeled 
the vertex ids so that the vertex ids in each partition form a consecutive 
range. I also reorganized the input file so that the vertices in the same 
range are stored together, and I used the range partitioner for the Giraph 
job to take advantage of these partitions. However, the vertex loader still 
looks up the partition id of each vertex and ships the vertex to the worker 
that owns that partition. Since I have already laid out my data this way, 
in the ideal case I could simply keep all the vertices of an input split 
local to the corresponding worker. Is there an easy way to do this? I know 
that very old versions of Giraph had no partitioner, and users had to 
prepare the partitions themselves. I essentially want to do something 
similar in the current version of Giraph. Please give me a pointer or two 
on how to do this.
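
To make the layout concrete, here is a minimal sketch of the range lookup 
I have in mind. It is plain Java and does not use any Giraph classes; the 
class name RangeLookup, its methods, and the example ranges are all 
hypothetical and only illustrate the idea. It assumes each partition owns 
one consecutive id range and that firstId <= lastId when checking a split.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Giraph: maps a vertex id to the
// partition whose consecutive id range contains it.
class RangeLookup {
    // Key: first vertex id of a range; value: partition id owning it.
    private final TreeMap<Long, Integer> rangeStarts = new TreeMap<>();

    // Registers a partition whose range begins at firstVertexId and runs
    // up to (but not including) the next registered range start.
    void addRange(long firstVertexId, int partitionId) {
        rangeStarts.put(firstVertexId, partitionId);
    }

    // Returns the partition id for vertexId, or -1 if the id lies below
    // every registered range.
    int partitionOf(long vertexId) {
        Map.Entry<Long, Integer> entry = rangeStarts.floorEntry(vertexId);
        return entry == null ? -1 : entry.getValue();
    }

    // True if every id in [firstId, lastId] belongs to one partition,
    // i.e. a split covering exactly these ids would need no shipping.
    boolean isSplitLocal(long firstId, long lastId) {
        // No range may start strictly after firstId and at or before lastId.
        Long nextStart = rangeStarts.higherKey(firstId);
        return partitionOf(firstId) != -1
                && (nextStart == null || nextStart > lastId);
    }

    public static void main(String[] args) {
        RangeLookup lookup = new RangeLookup();
        lookup.addRange(0L, 0);
        lookup.addRange(1000L, 1);
        lookup.addRange(2000L, 2);
        System.out.println(lookup.partitionOf(1500L));          // 1
        System.out.println(lookup.isSplitLocal(1000L, 1999L));  // true
        System.out.println(lookup.isSplitLocal(900L, 1100L));   // false
    }
}
```

If every input split passes a check like isSplitLocal, the loader would 
in principle have nothing to ship, which is the behavior I am after.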

