flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-2149) Simplify Gelly Jaccard similarity example
Date Sat, 13 Jun 2015 23:46:00 GMT

    [ https://issues.apache.org/jira/browse/FLINK-2149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14584881#comment-14584881 ]

ASF GitHub Bot commented on FLINK-2149:
---------------------------------------

Github user vasia commented on a diff in the pull request:

    https://github.com/apache/flink/pull/770#discussion_r32374939
  
    --- Diff: flink-staging/flink-gelly/src/main/java/org/apache/flink/graph/example/JaccardSimilarityMeasure.java ---
    @@ -66,34 +63,47 @@ public static void main(String [] args) throws Exception {
     
     		DataSet<Edge<Long, Double>> edges = getEdgesDataSet(env);
     
    -		Graph<Long, NullValue, Double> graph = Graph.fromDataSet(edges, env);
    +		Graph<Long, HashSet<Long>, Double> graph = Graph.fromDataSet(edges,
    +				new MapFunction<Long, HashSet<Long>>() {
     
    -		DataSet<Vertex<Long, HashSet<Long>>> verticesWithNeighbors =
    -				graph.groupReduceOnEdges(new GatherNeighbors(), EdgeDirection.ALL);
    +					@Override
    +					public HashSet<Long> map(Long id) throws Exception {
    +						HashSet<Long> neighbors = new HashSet<Long>();
    +						neighbors.add(id);
     
    -		Graph<Long, HashSet<Long>, Double> graphWithVertexValues = Graph.fromDataSet(verticesWithNeighbors, edges, env);
    +						return new HashSet<Long>(neighbors);
    +					}
    +				}, env);
     
    -		// the edge value will be the Jaccard similarity coefficient(number of common neighbors/ all neighbors)
    -		DataSet<Tuple3<Long, Long, Double>> edgesWithJaccardWeight = graphWithVertexValues.getTriplets()
    -				.map(new WeighEdgesMapper());
    +		// create the set of neighbors
    +		DataSet<Tuple2<Long, HashSet<Long>>> computedNeighbors =
    +				graph.reduceOnNeighbors(new GatherNeighbors(), EdgeDirection.ALL);
     
    -		DataSet<Edge<Long, Double>> result = graphWithVertexValues.joinWithEdges(edgesWithJaccardWeight,
    -				new MapFunction<Tuple2<Double, Double>, Double>() {
    +		// join with the vertices to update the node values
    +		DataSet<Vertex<Long, HashSet<Long>>> verticesWithNeighbors =
    +				graph.joinWithVertices(computedNeighbors, new MapFunction<Tuple2<HashSet<Long>, HashSet<Long>>,
    +						HashSet<Long>>() {
     
     					@Override
    -					public Double map(Tuple2<Double, Double> value) throws Exception {
    -						return value.f1;
    +					public HashSet<Long> map(Tuple2<HashSet<Long>, HashSet<Long>> tuple2) throws Exception {
    +						return tuple2.f1;
     					}
    -				}).getEdges();
    +				}).getVertices();
    +
    +		Graph<Long, HashSet<Long>, Double> graphWithVertexValues = Graph.fromDataSet(verticesWithNeighbors, edges, env);
    --- End diff --
    
    joinWithVertices already returns the updated Graph, so you can drop the getVertices() call and the extra Graph.fromDataSet :)
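
For reference, the coefficient that the removed comment in the diff describes (number of common neighbors / all neighbors) can be sketched in plain Java, without Flink. This is only an illustration under stated assumptions: the class name JaccardSketch, the method name jaccard, and the choice to return 0 when both neighbor sets are empty are hypothetical and are not taken from the Gelly example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not part of the Gelly example in the pull request.
class JaccardSketch {

    // Jaccard coefficient of two neighbor sets: |A ∩ B| / |A ∪ B|.
    static double jaccard(Set<Long> a, Set<Long> b) {
        if (a.isEmpty() && b.isEmpty()) {
            // Assumption: treat two empty neighbor sets as similarity 0
            // to avoid dividing by zero.
            return 0.0;
        }
        // Copy before intersecting so the caller's set is not modified.
        Set<Long> common = new HashSet<Long>(a);
        common.retainAll(b);
        // |A ∪ B| = |A| + |B| - |A ∩ B|
        int union = a.size() + b.size() - common.size();
        return (double) common.size() / union;
    }

    public static void main(String[] args) {
        Set<Long> neighborsOfSource = new HashSet<Long>(Arrays.asList(2L, 3L, 4L));
        Set<Long> neighborsOfTarget = new HashSet<Long>(Arrays.asList(3L, 4L, 5L));
        // 2 common neighbors (3, 4) out of 4 distinct neighbors (2, 3, 4, 5).
        System.out.println(jaccard(neighborsOfSource, neighborsOfTarget)); // prints 0.5
    }
}
```

In the Gelly example the two sets would be the HashSet<Long> vertex values at the source and target of each edge; here they are built by hand to keep the sketch self-contained.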


> Simplify Gelly Jaccard similarity example
> -----------------------------------------
>
>                 Key: FLINK-2149
>                 URL: https://issues.apache.org/jira/browse/FLINK-2149
>             Project: Flink
>          Issue Type: Improvement
>          Components: Gelly
>    Affects Versions: 0.9
>            Reporter: Vasia Kalavri
>            Assignee: Andra Lungu
>            Priority: Trivial
>              Labels: easyfix, starter
>
> The Gelly Jaccard similarity example can be simplified by replacing the groupReduceOnEdges method with the simpler reduceOnEdges.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
