giraph-user mailing list archives

From Matthieu Labour <>
Subject Help on clustering connected components with Giraph
Date Fri, 28 Mar 2014 16:26:53 GMT

I am looking for tips on how to leverage Giraph for the use case below:

I have a list of Nodes.
A Node is a collection of Key-Value pairs.
Two Nodes are related (have an edge) if they share at least one Key-Value pair.

Until now I have been running a Depth First Search algorithm to cluster the
Nodes into Connected Components.
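
For illustration, here is a minimal single-machine sketch of the kind of clustering described above. It is not my actual code: the class name SharedPairComponents, the method cluster, and the representation of a Node as a Map<String, String> are all assumptions made for this example. It builds an index from each Key-Value pair to the Nodes containing it, then runs an iterative Depth First Search so that Nodes sharing a pair end up with the same component id.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SharedPairComponents {

    // Hypothetical pair encoding; assumes '\u0000' never occurs in a key.
    static String encode(Map.Entry<String, String> e) {
        return e.getKey() + '\u0000' + e.getValue();
    }

    /** Returns, for each node index, the id of its connected component. */
    static int[] cluster(List<Map<String, String>> nodes) {
        // Inverted index: encoded Key-Value pair -> indices of nodes containing it.
        Map<String, List<Integer>> byPair = new HashMap<>();
        for (int i = 0; i < nodes.size(); i++) {
            for (Map.Entry<String, String> e : nodes.get(i).entrySet()) {
                byPair.computeIfAbsent(encode(e), k -> new ArrayList<>()).add(i);
            }
        }

        int[] component = new int[nodes.size()];
        Arrays.fill(component, -1); // -1 means "not visited yet"
        int next = 0;
        Deque<Integer> stack = new ArrayDeque<>();

        for (int start = 0; start < nodes.size(); start++) {
            if (component[start] != -1) {
                continue;
            }
            component[start] = next;
            stack.push(start);
            // Iterative DFS, so large components cannot overflow the call stack.
            while (!stack.isEmpty()) {
                int current = stack.pop();
                for (Map.Entry<String, String> e : nodes.get(current).entrySet()) {
                    // Each pair only needs expanding once: afterwards every
                    // node that contains it has already been labelled.
                    List<Integer> sharing = byPair.remove(encode(e));
                    if (sharing == null) {
                        continue;
                    }
                    for (int neighbour : sharing) {
                        if (component[neighbour] == -1) {
                            component[neighbour] = next;
                            stack.push(neighbour);
                        }
                    }
                }
            }
            next++;
        }
        return component;
    }
}
```

Going through the pair index avoids materialising every Node-to-Node edge, which would otherwise grow quadratically with the number of Nodes that share a popular pair.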

However, my data set has grown significantly and I need to scale, which is
what brought me to Giraph.

I have gone through the Connected Component example in Giraph but need a
bit of help to get started. Specifically, I am wondering how to adapt it to
the use case described above.

I would greatly appreciate any help.
Thank you in advance.
