The question is: do you have 100GB of main memory? How big are your
messages going to be? How dense is the graph?
Although we have out-of-core facilities, this does not look to me like a
typical graph algorithm, and in particular not one that would gain much
from Giraph compared to MapReduce. It has a low number of iterations (two),
so, especially if you have memory constraints, it could work out pretty
easily with MapReduce. It also looks to me like a single map/reduce job,
where the reducer could do the second iteration, but I could be missing
some details. As far as load balancing is concerned, I guess it depends on
your degree distribution. Having a "random" distribution of vertices
through hash partitioning should back you up, but if you have a bunch of
nodes that are much more active, you could have some stragglers.
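
To make the map/reduce idea concrete, here is a minimal sketch in plain
Python (no Hadoop or Giraph involved) of how I picture it. The functions
evaluate_node and evaluate_edge are hypothetical placeholders for your image
processing and your per-edge function, and I am assuming an undirected graph
given as adjacency lists with integer node ids. Treat it as an illustration
of the data flow, not as the way it has to be done.

```python
from collections import defaultdict


def evaluate_node(node_id):
    # Hypothetical stand-in for "read an image and compute the node result".
    return node_id * node_id


def evaluate_edge(result_u, result_v):
    # Hypothetical stand-in for the per-edge computation.
    return result_u + result_v


def map_phase(adjacency):
    # Evaluate each node once, then send its result to every incident edge.
    # The edge key is normalized so both endpoints emit the same key.
    for node, neighbours in adjacency.items():
        result = evaluate_node(node)
        for other in neighbours:
            edge = (min(node, other), max(node, other))
            yield edge, (node, result)


def shuffle(pairs):
    # Group values by key, as the framework would do between map and reduce.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    # Each edge receives exactly two node results and combines them.
    output = {}
    for (u, v), values in groups.items():
        by_node = dict(values)
        output[(u, v)] = evaluate_edge(by_node[u], by_node[v])
    return output


def run(adjacency):
    return reduce_phase(shuffle(map_phase(adjacency)))


if __name__ == "__main__":
    triangle = {1: [2, 3], 2: [1, 3], 3: [1, 2]}
    print(run(triangle))
```

Note that in this layout a node's result is copied once per incident edge,
which is why the message size and the density of the graph matter: a
high-degree node with a large result dominates the shuffle traffic.
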
On Thu, May 2, 2013 at 2:12 AM, Hadoop Explorer
<hadoopexplorer@outlook.com> wrote:
> I have an application that evaluates a graph using this algorithm:
>
>  use a parallel for loop to evaluate all nodes in a graph (to evaluate a
> node, an image is read, and then result of this node is calculated)
>
>  use a second parallel for loop to evaluate all edges in the graph. The
> function would take in results from both nodes of the edge, and then
> calculate the answer for the edge
>
> The final result will consist of calculated results of each edge. So each
> node, and each edge is essentially a job, and in this case, an edge is more
> like a job than a message
>
> As you can see, the above algorithm would employ two map functions, but no
> reduce function. The total data size can be very large (say 100GB). Also,
> the workload of each node and each edge is highly irregular, and thus load
> balancing mechanisms are essential.
>
> In this case, will Giraph suit this application? If so, what will my
> program look like? And will Giraph be able to strike the balance between
> good load balancing of the second map function, and minimizing data
> transfer of the results from the first map function?
>
>
>

Claudio Martella
claudio.martella@gmail.com
