I'd start by taking HBase out of the equation.
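
To make that concrete, here is a rough sketch of how the compute time and the write time of each superstep could be measured separately. It is plain Python that simulates the algorithm described in the message below, not Giraph or HBase code. The names `propagate`, `run` and `sink` are hypothetical, and the rule "adopt the most frequent neighbor label, ties go to the smallest label" is an assumption, since the original message does not say how the label is calculated.

```python
import time
from collections import Counter


def propagate(graph, labels):
    """One synchronous superstep over a static graph.

    Every node reads the labels its neighbors held at the start of the
    superstep and adopts the most frequent one (assumed rule; ties are
    broken by taking the smallest label).
    """
    new_labels = {}
    for node, neighbors in graph.items():
        if not neighbors:
            # An isolated node receives no messages and keeps its label.
            new_labels[node] = labels[node]
            continue
        counts = Counter(labels[n] for n in neighbors)
        best = max(counts.values())
        new_labels[node] = min(l for l, c in counts.items() if c == best)
    return new_labels


def run(graph, supersteps, sink=None):
    """Run the supersteps and time compute and output separately.

    `sink` stands in for the per-superstep HBase write. Passing None
    skips the write entirely, which is the "without HBase" run.
    Returns the final labels and a list of (compute_s, write_s) pairs.
    """
    labels = {node: node for node in graph}  # each node starts with its own ID
    timings = []
    for step in range(supersteps):
        t0 = time.perf_counter()
        labels = propagate(graph, labels)
        t1 = time.perf_counter()
        if sink is not None:
            sink(step, labels)
        t2 = time.perf_counter()
        timings.append((t1 - t0, t2 - t1))
    return labels, timings
```

If the compute column stays flat across supersteps while the write column grows, the output side is the likely culprit. If the superstep time still grows with `sink=None`, the problem is probably somewhere else.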

On Thu, May 8, 2014 at 1:46 PM, Pascal Jäger <pascal@pascaljaeger.de> wrote:
Hi all,

I have implemented a label propagation algorithm to find clusters in a graph.
I just noticed that the time the algorithm takes per superstep keeps increasing, and I don’t know why.

The graph is static and the number of messages is the same throughout all supersteps.
During every superstep, each node sends its label to its neighbors, which then calculate their own label from the received messages and send it out in turn.
At the end of each superstep, each node writes a (nodeID, label) pair to an HBase table.

Do you have any general hints on where I could look?

I have absolutely no clue where to start.

Thanks for your help!



   Claudio Martella