It might be better to run a MapReduce job first: in the map phase, emit each key-value pair of a node as the key and the node itself as the value. The reduce keys then give you the first level of connections, because each reduce key collects all the nodes that share that key-value pair. You can then use the output of the reduce phase as the adjacency list for the graph to be processed with Giraph.
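Here is a rough sketch of that idea in plain Python, simulating the two phases in memory rather than using the real Hadoop or Giraph APIs. The sample data and the function names (`map_phase`, `reduce_phase`, `build_adjacency`) are made up for illustration. It also assumes that linking every node in a group to a single "hub" node is enough, which keeps the connected components intact without emitting every pairwise edge.

```python
from collections import defaultdict

def map_phase(nodes):
    """Emit (key-value pair, node id) for every pair a node holds."""
    for node_id, pairs in nodes.items():
        for kv in pairs.items():
            yield kv, node_id

def reduce_phase(emitted):
    """Group node ids by the key-value pair they share."""
    groups = defaultdict(set)
    for kv, node_id in emitted:
        groups[kv].add(node_id)
    return groups

def build_adjacency(nodes, groups):
    """Turn the reduce output into an undirected adjacency list.

    Assumption: each group is linked as a star around its smallest
    node id instead of as a full clique. The connected components are
    the same, but the edge count stays linear in the group size.
    """
    adjacency = {node_id: set() for node_id in nodes}
    for members in groups.values():
        hub = min(members)
        for member in members:
            if member != hub:
                adjacency[hub].add(member)
                adjacency[member].add(hub)
    return adjacency

# Hypothetical sample data: node id -> its key-value pairs.
sample_nodes = {
    "n1": {"color": "red", "size": "L"},
    "n2": {"color": "red", "size": "M"},
    "n3": {"size": "M", "shape": "round"},
    "n4": {"color": "blue"},
}

adjacency = build_adjacency(sample_nodes, reduce_phase(map_phase(sample_nodes)))
```

In a real job the adjacency list would be written out in whatever format your Giraph vertex input format expects. Note that nodes sharing no pair with anyone (like `n4` here) still need to appear as vertices with no edges.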
Hi,

I am looking for tips on how to leverage Giraph for the use case below:

I have a list of Nodes. A Node is a collection of Key-Value pairs. Two Nodes are related (have an edge) if they share a Key-Value pair.

Until now I have been running a Depth First Search algorithm to cluster the Nodes into Connected Components.
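To make the setup concrete, here is a minimal single-machine sketch of the kind of DFS clustering described above. The sample data and the function name `connected_components` are made up for illustration and are not the actual code in use. It assumes each Node is a dictionary of key-value pairs and that values are hashable.

```python
from collections import defaultdict

def connected_components(nodes):
    """Cluster nodes that share at least one key-value pair, via iterative DFS."""
    # Index each key-value pair to the node ids that contain it,
    # so neighbours can be found without comparing every pair of nodes.
    by_pair = defaultdict(list)
    for node_id, pairs in nodes.items():
        for kv in pairs.items():
            by_pair[kv].append(node_id)

    seen = set()
    components = []
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        component = []
        while stack:
            current = stack.pop()
            component.append(current)
            # Neighbours are all nodes sharing any key-value pair with current.
            for kv in nodes[current].items():
                for neighbour in by_pair[kv]:
                    if neighbour not in seen:
                        seen.add(neighbour)
                        stack.append(neighbour)
        components.append(sorted(component))
    return components

# Hypothetical sample data: node id -> its key-value pairs.
example_nodes = {
    "n1": {"color": "red", "size": "L"},
    "n2": {"color": "red", "size": "M"},
    "n3": {"size": "M", "shape": "round"},
    "n4": {"color": "blue"},
}

components = connected_components(example_nodes)
```

Here `n1` and `n2` share `color=red`, and `n2` and `n3` share `size=M`, so the three end up in one component while `n4` stays on its own.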
However, my data set has grown significantly and I need to scale, which is what brought me to Giraph.

I have gone through the Connected Component example in Giraph but need a bit of help to get started. Specifically, I wonder how I can change it to accommodate the use case described above.

I would greatly appreciate any help.

Thank you in advance.

-matt