flink-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Shivani Ghatge (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (FLINK-2375) Add Approximate Adamic Adar Similarity method using BloomFilters
Date Fri, 17 Jul 2015 14:27:04 GMT

     [ https://issues.apache.org/jira/browse/FLINK-2375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Shivani Ghatge updated FLINK-2375:
----------------------------------
    Description: 


Just as Jaccard, the Adamic-Adar algorithm measures the similarity between a set of nodes.
However, instead of counting the common neighbors and dividing them by the total number of
neighbors, the similarity is weighted according to the vertex degrees. In particular, it's
equal to log(1/numberOfEdges).

The Adamic-Adar algorithm can be broken into three steps:
1). For each vertex, compute the log of its inverse degrees (with the formula above) and set
it as the vertex value.
2). Each vertex will then send this new computed value along with a list of neighbors to the
targets of its out-edges
3). Weigh the edges with the Adamic-Adar index: Sum over n from CN of log(1/k_n)(CN is the
set of all common neighbors of two vertices x, y. k_n is the degree of node n). See [2]

Using BloomFilters we increase the scalability of the algorithm. The values calculated for
the edges will be approximate.

Prerequisites:

    Full understanding of the Jaccard Similarity Measure algorithm
    Reading the associated literature:
    [1] http://social.cs.uiuc.edu/class/cs591kgk/friendsadamic.pdf
    [2] http://stackoverflow.com/questions/22565620/fast-algorithm-to-compute-adamic-adar



> Add Approximate Adamic Adar Similarity method using BloomFilters
> ----------------------------------------------------------------
>
>                 Key: FLINK-2375
>                 URL: https://issues.apache.org/jira/browse/FLINK-2375
>             Project: Flink
>          Issue Type: Task
>          Components: Gelly
>            Reporter: Shivani Ghatge
>            Assignee: Shivani Ghatge
>            Priority: Minor
>
> Just as Jaccard, the Adamic-Adar algorithm measures the similarity between a set of nodes.
However, instead of counting the common neighbors and dividing them by the total number of
neighbors, the similarity is weighted according to the vertex degrees. In particular, it's
equal to log(1/numberOfEdges).
> The Adamic-Adar algorithm can be broken into three steps:
> 1). For each vertex, compute the log of its inverse degrees (with the formula above)
and set it as the vertex value.
> 2). Each vertex will then send this new computed value along with a list of neighbors
to the targets of its out-edges
> 3). Weigh the edges with the Adamic-Adar index: Sum over n from CN of log(1/k_n)(CN is
the set of all common neighbors of two vertices x, y. k_n is the degree of node n). See [2]
> Using BloomFilters we increase the scalability of the algorithm. The values calculated
for the edges will be approximate.
> Prerequisites:
>     Full understanding of the Jaccard Similarity Measure algorithm
>     Reading the associated literature:
>     [1] http://social.cs.uiuc.edu/class/cs591kgk/friendsadamic.pdf
>     [2] http://stackoverflow.com/questions/22565620/fast-algorithm-to-compute-adamic-adar



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message