giraph-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Claudio Martella <claudio.marte...@gmail.com>
Subject Re: clustering coefficient (counting triangles) in giraph.
Date Tue, 06 May 2014 13:50:13 GMT
It would indeed be interesting to evaluate. The naive CC algorithm does
have a huge impact on memory, and i have the feeling that it's often a
choice of feasibility more than performance, meaning that often the naive
approach might not be a solution at all due to the huge memory footprint.
Given that Pregel/Giraph is network-bound, my bet is that one more
iteration should be cheaper than lots of large messages.


On Tue, May 6, 2014 at 3:05 PM, Dionysis Logothetis
<dlogothetis@gmail.com>wrote:

> Hi guys,
>
> One of the ways we've thought about improving the clustering
> implementation in the Okapi library (although we haven't implemented it
> yet) is the following. In the naive implementation EVERY vertex sends its
> friend list to ALL its neighbours. But, because in a triangle it's enough
> for just 1 of the vertices to detect the triangle, you can have a vertex
> send its friend list only to neighbors that have lower id. This way, the
> node with the lower id in the triangle can detect the triangle and simply
> notify its neighbors that they belong to one (every vertex needs to know).
>
> Now, I think this works only for undirected graphs and it adds an extra
> iteration with respect to the naive implementation, but can probably reduce
> the amount of traffic and memory used significantly.
>
> By taking a quick look at the algorithm by Ediger et al that Arun
> implemented, it uses a similar approach, that is, selectively propagating
> vertex ids. That one introduces more iterations, but probably reduces the
> amount of traffic even more which can be very beneficial. It could be
> interesting to compare these approaches.
>
>
>
>
>
> On Tue, May 6, 2014 at 12:47 PM, Lukas Nalezenec <
> lukas.nalezenec@firma.seznam.cz> wrote:
>
>>  Hi,
>>
>> It has been some time and I don't remember all my comments.  I think you can replace
LongArrayListWritables with class that will sort numbers inside, make a diff compression and
write variable length numbers.
>> This might cut memory usage by 80~90% (depending on graph properties).
>> You could also reduce a number of memory allocations and use primitive collections.
>>
>> Regards
>> Lukas
>>
>>
>>
>> On 6.5.2014 09:08, Arun Kumar wrote:
>>
>> Hi
>>
>>  For Okapi ML library it is mentioned that some tuning should be done
>> .Can you clarify that what modification we have to do?
>>
>>  Regards
>> Arun
>>
>>
>> On Tue, Mar 18, 2014 at 2:30 PM, Lukas Nalezenec <
>> lukas.nalezenec@firma.seznam.cz> wrote:
>>
>>>  Hi,
>>> Check Okapi ML library from Grafos:
>>> http://grafos.ml/okapi.html#collaborative-als
>>> It needs some tuning but it will work.
>>>
>>> Regards
>>> Lukas
>>>
>>>
>>>
>>> On 17.3.2014 20:17, Suijian Zhou wrote:
>>>
>>>  Hi, Experts,
>>>   Does anybody know if there are examples of implementation in giraph
>>> for clustering coefficient (counting triangles)? Thanks!
>>>
>>>    Best Regards,
>>>    Suijian
>>>
>>>
>>>
>>
>>
>


-- 
   Claudio Martella

Mime
View raw message