It would indeed be interesting to evaluate. The naive CC algorithm does have a huge impact on memory, and i have the feeling that it's often a choice of feasibility more than performance, meaning that often the naive approach might not be a solution at all due to the huge memory footprint.
Given that Pregel/Giraph is network-bound, my bet is that one more iteration should be cheaper than lots of large messages.

On Tue, May 6, 2014 at 3:05 PM, Dionysis Logothetis <> wrote:
Hi guys,

One of the ways we've thought about improving the clustering implementation in the Okapi library (although we haven't implemented it yet) is the following. In the naive implementation EVERY vertex sends its friend list to ALL its neighbours. But, because in a triangle it's enough for just 1 of the vertices to detect the triangle, you can have a vertex send its friend list only to neighbors that have lower id. This way, the node with the lower id in the triangle can detect the triangle and simply notify its neighbors that they belong to one (every vertex needs to know).

Now, I think this works only for undirected graphs and it adds an extra iteration with respect to the naive implementation, but can probably reduce the amount of traffic and memory used significantly.

By taking a quick look at the algorithm by Ediger et al that Arun implemented, it uses a similar approach, that is, selectively propagating vertex ids. That one introduces more iterations, but probably reduces the amount of traffic even more which can be very beneficial. It could be interesting to compare these approaches.

On Tue, May 6, 2014 at 12:47 PM, Lukas Nalezenec <> wrote:
It has been some time and I don't remember all my comments.  I think you can replace LongArrayListWritables with class that will sort numbers inside, make a diff compression and write variable length numbers.
This might cut memory usage by 80~90% (depending on graph properties).
You could also reduce a number of memory allocations and use primitive collections.


On 6.5.2014 09:08, Arun Kumar wrote:

For Okapi ML library it is mentioned that some tuning should be done .Can you clarify that what modification we have to do?


On Tue, Mar 18, 2014 at 2:30 PM, Lukas Nalezenec <> wrote:
Check Okapi ML library from Grafos:
It needs some tuning but it will work.


On 17.3.2014 20:17, Suijian Zhou wrote:
Hi, Experts,
  Does anybody know if there are examples of implementation in giraph for clustering coefficient (counting triangles)? Thanks!

  Best Regards,

   Claudio Martella