giraph-user mailing list archives

From Pankaj Malhotra <>
Subject Re: Help on clustering connected components with Giraph
Date Fri, 28 Mar 2014 19:25:08 GMT
Maybe it would be better to run a MapReduce job first. In the map phase,
emit each key-value pair of a node as the key and the node itself as the
value. Each reduce key then collects all the nodes that share that pair,
which gives you the first level of connections. You can then use the
output of the reduce phase as the adjacency list for the graph to be
processed with Giraph.
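
The map and reduce steps described above can be sketched in plain Python. This is only an in-memory illustration of the idea, not Hadoop or Giraph code: the function names (map_phase, reduce_phase, adjacency_list) and the sample nodes are hypothetical, and a real job would write the adjacency list in whatever input format the Giraph job is configured to read.

```python
from collections import defaultdict
from itertools import combinations


def map_phase(nodes):
    """Emit one ((key, value), node_id) record per key-value pair of each node."""
    for node_id, pairs in nodes.items():
        for kv in pairs.items():
            yield kv, node_id


def reduce_phase(records):
    """Group node ids by shared key-value pair, as shuffle + reduce would."""
    groups = defaultdict(set)
    for kv, node_id in records:
        groups[kv].add(node_id)
    return groups


def adjacency_list(groups, all_nodes):
    """Link every pair of nodes that ended up under the same reduce key."""
    adjacency = {node_id: set() for node_id in all_nodes}
    for members in groups.values():
        for a, b in combinations(sorted(members), 2):
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency


# Hypothetical sample data: node id -> its key-value pairs.
nodes = {
    "n1": {"email": "a@example.com", "phone": "111"},
    "n2": {"email": "a@example.com"},
    "n3": {"phone": "111", "city": "Paris"},
    "n4": {"city": "Rome"},
}

adjacency = adjacency_list(reduce_phase(map_phase(nodes)), nodes)
for node_id in sorted(adjacency):
    print(node_id, sorted(adjacency[node_id]))
```

Linking every pair inside a group grows quadratically with the group size. If some key-value pairs are very common, linking each member only to one representative of the group (for example the smallest node id) should keep the same connected components with far fewer edges.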
On Mar 28, 2014 6:27 PM, "Matthieu Labour" <> wrote:

> Hi
> I am looking for tips on how to leverage Giraph for the use case below:
> I have a list of Nodes.
> A Node is a collection of Key-Value pairs.
> 2 Nodes are related (have an edge) if they share a Key-Value pair.
> Until now I have been running a Depth First Search algorithm to cluster
> the Nodes into Connected Components.
> However, my data set has grown significantly and I need to scale. This is
> the reason that brought me to Giraph.
> I have gone through the Connected Component example in Giraph but need a
> bit of help to get started. Specifically I wonder how I can change it to
> accommodate the use case described above.
> I would greatly appreciate any help.
> Thank you in advance.
> -matt
