flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-3780) Jaccard Similarity
Date Fri, 20 May 2016 14:21:12 GMT

    [ https://issues.apache.org/jira/browse/FLINK-3780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15293430#comment-15293430 ]

ASF GitHub Bot commented on FLINK-3780:
---------------------------------------

Github user vasia commented on a diff in the pull request:

    https://github.com/apache/flink/pull/1980#discussion_r64048412
  
    --- Diff: flink-libraries/flink-gelly-examples/src/main/java/org/apache/flink/graph/examples/JaccardIndex.java ---
    @@ -40,9 +43,9 @@
     /**
      * Driver for the library implementation of Jaccard Index.
      *
    - * This example generates an undirected RMat graph with the given scale and
    - * edge factor then calculates all non-zero Jaccard Index similarity scores
    - * between vertices.
    + * This example reads a simple, undirected graph from a CSV file or generates
    --- End diff --
    
    remove one "generates"


> Jaccard Similarity
> ------------------
>
>                 Key: FLINK-3780
>                 URL: https://issues.apache.org/jira/browse/FLINK-3780
>             Project: Flink
>          Issue Type: New Feature
>          Components: Gelly
>    Affects Versions: 1.1.0
>            Reporter: Greg Hogan
>            Assignee: Greg Hogan
>             Fix For: 1.1.0
>
>
> Implement a Jaccard Similarity algorithm computing all non-zero similarity scores. This algorithm is similar to {{TriangleListing}} but instead of joining two-paths against an edge list we count two-paths.
> {{flink-gelly-examples}} currently has {{JaccardSimilarityMeasure}} which relies on {{Graph.getTriplets()}} so only computes similarity scores for neighbors but not neighbors-of-neighbors.
> This algorithm is easily modified for other similarity scores such as Adamic-Adar similarity where the sum of endpoint degrees is replaced by the degree of the middle vertex.
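
The two-path counting idea in the description above can be illustrated with a minimal, single-machine sketch in plain Java. It is not the Gelly implementation from the pull request: the class name JaccardSketch, the method jaccard, and the "u,v" string keys are hypothetical, and the input is assumed to be a simple, undirected graph with no self-loops. For each middle vertex w, every pair of its neighbors (u, v) forms one two-path, so the number of two-paths between u and v equals their shared neighbor count; the score is then shared / (deg(u) + deg(v) - shared).

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical illustration only; not the Flink Gelly library code.
final class JaccardSketch {

    /**
     * Returns the Jaccard score for every vertex pair with at least one
     * shared neighbor, keyed by "u,v" with u < v.
     * Assumes a simple, undirected graph without self-loops.
     */
    static Map<String, Double> jaccard(int[][] edges) {
        // Build adjacency sets; each undirected edge is stored in both directions.
        Map<Integer, TreeSet<Integer>> adj = new TreeMap<>();
        for (int[] e : edges) {
            adj.computeIfAbsent(e[0], k -> new TreeSet<>()).add(e[1]);
            adj.computeIfAbsent(e[1], k -> new TreeSet<>()).add(e[0]);
        }

        // Count two-paths u - w - v: each pair of neighbors of a middle
        // vertex w contributes one shared neighbor to the pair (u, v).
        Map<String, Integer> shared = new TreeMap<>();
        for (TreeSet<Integer> neighbors : adj.values()) {
            Integer[] n = neighbors.toArray(new Integer[0]);
            for (int i = 0; i < n.length; i++) {
                for (int j = i + 1; j < n.length; j++) {
                    shared.merge(n[i] + "," + n[j], 1, Integer::sum);
                }
            }
        }

        // Score = shared / (deg(u) + deg(v) - shared), i.e. |intersection| / |union|.
        Map<String, Double> scores = new TreeMap<>();
        for (Map.Entry<String, Integer> entry : shared.entrySet()) {
            String[] pair = entry.getKey().split(",");
            int degU = adj.get(Integer.parseInt(pair[0])).size();
            int degV = adj.get(Integer.parseInt(pair[1])).size();
            int count = entry.getValue();
            scores.put(entry.getKey(), (double) count / (degU + degV - count));
        }
        return scores;
    }
}
```

Because only pairs connected by at least one two-path are ever emitted, the result contains exactly the non-zero scores, including pairs that are neighbors-of-neighbors but not directly connected.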



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
