giraph-user mailing list archives

From Pascal Jäger
Subject Aggregators as
Date Tue, 11 Feb 2014 09:24:35 GMT
Hi all,

I have some questions about the use of aggregators. I want to implement algorithms for
Community Detection and Community Tracking.
The Community Detection algorithm basically outputs a file where each line represents a community
and contains the IDs of the nodes in the community.

For the Community Tracking part I cluster a second graph (i.e. another time step) and then
need to compare each community of the first time step with every community of the second time
step. For large graphs the number of communities can get quite large as well.
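
As an illustration of the comparison step done outside Giraph, here is a minimal Java sketch. It rests on assumptions the message does not state: node IDs on each line are separated by whitespace, and similarity is measured with the Jaccard index. The names CommunityTracking, parseCommunity, jaccard and bestMatches are hypothetical. Rather than scoring every pair of communities, it indexes the second time step by node ID, so only pairs that share at least one node are scored.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class CommunityTracking {

    /** Parses one output line into a set of node IDs (assumed: whitespace-separated longs). */
    static Set<Long> parseCommunity(String line) {
        Set<Long> nodes = new HashSet<>();
        for (String token : line.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                nodes.add(Long.parseLong(token));
            }
        }
        return nodes;
    }

    /** Jaccard similarity |A ∩ B| / |A ∪ B| (an assumed measure); 0 if both sets are empty. */
    static double jaccard(Set<Long> a, Set<Long> b) {
        if (a.isEmpty() && b.isEmpty()) {
            return 0.0;
        }
        Set<Long> smaller = a.size() <= b.size() ? a : b;
        Set<Long> larger = smaller == a ? b : a;
        int shared = 0;
        for (Long id : smaller) {
            if (larger.contains(id)) {
                shared++;
            }
        }
        return (double) shared / (a.size() + b.size() - shared);
    }

    /**
     * For each community of time step 1, returns the index of the most similar
     * community of time step 2, or -1 if no community shares a node with it.
     */
    static int[] bestMatches(List<Set<Long>> step1, List<Set<Long>> step2) {
        // Inverted index: node ID -> indices of the step-2 communities containing it.
        Map<Long, List<Integer>> nodeToStep2 = new HashMap<>();
        for (int j = 0; j < step2.size(); j++) {
            for (Long id : step2.get(j)) {
                nodeToStep2.computeIfAbsent(id, k -> new ArrayList<>()).add(j);
            }
        }

        int[] best = new int[step1.size()];
        for (int i = 0; i < step1.size(); i++) {
            Set<Long> a = step1.get(i);

            // Count shared nodes only for step-2 communities that overlap with a.
            Map<Integer, Integer> overlap = new HashMap<>();
            for (Long id : a) {
                for (int j : nodeToStep2.getOrDefault(id, Collections.emptyList())) {
                    overlap.merge(j, 1, Integer::sum);
                }
            }

            best[i] = -1;
            double bestScore = 0.0;
            for (Map.Entry<Integer, Integer> e : overlap.entrySet()) {
                int shared = e.getValue();
                int union = a.size() + step2.get(e.getKey()).size() - shared;
                double score = (double) shared / union;
                if (score > bestScore) {
                    bestScore = score;
                    best[i] = e.getKey();
                }
            }
        }
        return best;
    }
}
```

The work here grows with the number of overlapping node memberships rather than with the product of the two community counts, which may matter when both time steps contain thousands of communities. Whether this is faster than doing the comparison inside Giraph depends on the data and the cluster, so treat it as a starting point only.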

One idea I had was to register an aggregator for each community of the first time step. Then,
for each community found in the second time step, one node of that community sends a message
to the aggregator containing the nodes of its community. The aggregator then calculates the
similarity for each received community of time step 2.
I would end up registering several thousand aggregators that I only need after one superstep.

The other idea was to alter the compute method for the node with the smallest ID in each community
and let these nodes do the similarity calculation. This then means I would have to add (and later
remove) some thousand edges to the graph.

What do you think would perform better? Or should I do the calculation outside of Giraph?

I appreciate any input.


