flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-2044) Implementation of Gelly HITS Algorithm
Date Mon, 02 May 2016 16:39:13 GMT

    [ https://issues.apache.org/jira/browse/FLINK-2044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15266941#comment-15266941 ]

ASF GitHub Bot commented on FLINK-2044:
---------------------------------------

Github user greghogan commented on a diff in the pull request:

    https://github.com/apache/flink/pull/1956#discussion_r61764033
  
    --- Diff: flink-libraries/flink-gelly/src/main/java/org/apache/flink/graph/library/HITSAlgorithm.java ---
    @@ -0,0 +1,194 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.flink.graph.library;
    +
    +import org.apache.flink.api.common.aggregators.DoubleSumAggregator;
    +import org.apache.flink.api.java.DataSet;
    +
    +import org.apache.flink.graph.Edge;
    +import org.apache.flink.graph.EdgeDirection;
    +import org.apache.flink.graph.Graph;
    +import org.apache.flink.graph.GraphAlgorithm;
    +import org.apache.flink.graph.Vertex;
    +import org.apache.flink.graph.spargel.MessageIterator;
    +import org.apache.flink.graph.spargel.MessagingFunction;
    +import org.apache.flink.graph.spargel.ScatterGatherConfiguration;
    +import org.apache.flink.graph.spargel.VertexUpdateFunction;
    +import org.apache.flink.types.DoubleValue;
    +
    +/**
    + * This is an implementation of the HITS algorithm, using a scatter-gather iteration.
    + * The user can define the maximum number of iterations. The HITS algorithm assigns each vertex two scores,
    + * a hub score and an authority score. A good hub represents a page that points to many other pages, and a good authority
    + * represents a page that is linked to by many different hubs. The implementation assumes that the two values on
    + * every vertex are the same at the beginning.
    + * <p>
    + * If the number of vertices of the input graph is known, it should be provided as a parameter
    + * to speed up computation. Otherwise, the algorithm will first execute a job to count the vertices.
    + */
    +public class HITSAlgorithm<K> implements GraphAlgorithm<K, Double, Double, DataSet<Vertex<K, Double>>> {
    +
    +	public static enum HITSParameter {
    +		HUB,
    +		AUTHORITY
    +	}
    +
    +	private int maxIterations;
    +	private long numberOfVertices;
    +
    +	/**
    +	 * Creates an instance of HITS algorithm.
    +	 * If the number of vertices of the input graph is known,
    +	 * use the {@link HITSAlgorithm#HITSAlgorithm(int, long, HITSParameter)} constructor instead.
    +	 *
    +	 * @param maxIterations the maximum number of iterations
    +	 * @param hitsParameter the type of final web pages users want to get by this algorithm
    +	 */
    +	public HITSAlgorithm(int maxIterations, HITSParameter hitsParameter) {
    +		if (hitsParameter == HITSParameter.AUTHORITY) {
    +			this.maxIterations = maxIterations * 2;
    +		} else {
    +			this.maxIterations = maxIterations * 2 + 1;
    +		}
    +	}
    +
    +	/**
    +	 * Creates an instance of HITS algorithm.
    +	 * If the number of vertices of the input graph is unknown,
    +	 * use the {@link HITSAlgorithm#HITSAlgorithm(int, HITSParameter)} constructor instead.
    +	 *
    +	 * @param maxIterations the maximum number of iterations
    +	 * @param hitsParameter the type of final web pages users want to get by this algorithm
    +	 */
    +	public HITSAlgorithm(int maxIterations, long numberOfVertices, HITSParameter hitsParameter) {
    +		if (hitsParameter == HITSParameter.AUTHORITY) {
    +			this.maxIterations = maxIterations * 2;
    +		} else {
    --- End diff --
    
    Can replace 82:87 with `super(maxIterations, hitsParameter);`.
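
    A minimal sketch of the constructor chaining suggested above, assuming the intent is for the three-argument constructor to delegate to the two-argument one. In Java, a call to another constructor of the same class is written `this(...)`. `super(...)` would target a superclass constructor, and the class in the diff only implements an interface. The class name `HitsConstructorSketch`, the getters, and the `numberOfVertices` assignment are hypothetical: the diff is cut off before the rest of the constructor body.

```java
// Simplified stand-in for the class in the diff; only the constructor logic is kept.
class HitsConstructorSketch {

    enum HITSParameter { HUB, AUTHORITY }

    private final int maxIterations;
    private long numberOfVertices; // stays 0 when not provided (assumption of this sketch)

    // Two-argument constructor, with the same branching as in the diff.
    HitsConstructorSketch(int maxIterations, HITSParameter hitsParameter) {
        if (hitsParameter == HITSParameter.AUTHORITY) {
            this.maxIterations = maxIterations * 2;
        } else {
            this.maxIterations = maxIterations * 2 + 1;
        }
    }

    // Three-argument constructor: delegate with this(...) instead of repeating the if/else.
    HitsConstructorSketch(int maxIterations, long numberOfVertices, HITSParameter hitsParameter) {
        this(maxIterations, hitsParameter);
        this.numberOfVertices = numberOfVertices; // assumed; not visible in the truncated diff
    }

    int getMaxIterations() {
        return maxIterations;
    }

    long getNumberOfVertices() {
        return numberOfVertices;
    }

    public static void main(String[] args) {
        HitsConstructorSketch authority =
            new HitsConstructorSketch(5, 100L, HITSParameter.AUTHORITY);
        HitsConstructorSketch hub =
            new HitsConstructorSketch(5, HITSParameter.HUB);
        System.out.println(authority.getMaxIterations()); // 10
        System.out.println(hub.getMaxIterations());       // 11
    }
}
```

    The delegating call has to be the first statement in the constructor, so any extra field assignments go after it.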


> Implementation of Gelly HITS Algorithm
> --------------------------------------
>
>                 Key: FLINK-2044
>                 URL: https://issues.apache.org/jira/browse/FLINK-2044
>             Project: Flink
>          Issue Type: New Feature
>          Components: Gelly
>            Reporter: Ahamd Javid
>            Assignee: GaoLun
>            Priority: Minor
>
> Implementation of the HITS algorithm in the Gelly API using Java. The feature branch can be found here: (https://github.com/JavidMayar/flink/commits/HITS)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
