[FLINK5597] [docs] Improve the LocalClusteringCoefficient documentation
Update the documentation for Gelly's library methods with improved
algorithm descriptions and explanation of algorithm results.
This closes #3404
Project: http://gitwipus.apache.org/repos/asf/flink/repo
Commit: http://gitwipus.apache.org/repos/asf/flink/commit/cb9e409b
Tree: http://gitwipus.apache.org/repos/asf/flink/tree/cb9e409b
Diff: http://gitwipus.apache.org/repos/asf/flink/diff/cb9e409b
Branch: refs/heads/master
Commit: cb9e409b764f95e07441a0c8da6c24e21bc1564b
Parents: ea14053
Author: Greg Hogan <code@greghogan.com>
Authored: Thu Feb 23 16:26:45 2017 0500
Committer: Greg Hogan <code@greghogan.com>
Committed: Thu Mar 2 16:42:58 2017 0500

docs/dev/libs/gelly/library_methods.md  92 ++++++++++
.../directed/LocalClusteringCoefficient.java  4 +
.../clustering/directed/TriadicCensus.java  2 +
.../clustering/directed/TriangleListing.java  8 +
.../undirected/LocalClusteringCoefficient.java  4 +
.../clustering/undirected/TriadicCensus.java  4 +
.../clustering/undirected/TriangleListing.java  15 ++
.../flink/graph/library/link_analysis/HITS.java  4 +
.../graph/library/similarity/AdamicAdar.java  6 +
.../graph/library/similarity/JaccardIndex.java  10 +
.../graph/utils/proxy/OptionalBoolean.java  2 +
11 files changed, 79 insertions(+), 72 deletions()

http://gitwipus.apache.org/repos/asf/flink/blob/cb9e409b/docs/dev/libs/gelly/library_methods.md

diff git a/docs/dev/libs/gelly/library_methods.md b/docs/dev/libs/gelly/library_methods.md
index 94eee2e..026bea4 100644
 a/docs/dev/libs/gelly/library_methods.md
+++ b/docs/dev/libs/gelly/library_methods.md
@@ 166,18 +166,6 @@ The algorithm is implemented using [gathersumapply iterations](#gathersumapp
See the [Single Source Shortest Paths](#singlesourceshortestpaths) library method for
implementation details and usage information.
## Triangle Count

#### Overview
An analytic for counting the number of unique triangles in a graph.

#### Details
Counts the triangles generated by [Triangle Listing](#trianglelisting).

#### Usage
The analytic takes an undirected graph as input and returns as a result a `Long` corresponding
to the number of triangles
in the graph. The graph ID type must be `Comparable` and `Copyable`.

## Triangle Enumerator
#### Overview
@@ 229,11 +217,12 @@ neighbors) to 1.0 (complete graph).
#### Details
See the [Local Clustering Coefficient](#localclusteringcoefficient) library method for
a detailed explanation of
clustering coefficient.
+clustering coefficient. The Average Clustering Coefficient is the average of the Local Clustering
Coefficient scores
+over all vertices with at least two neighbors. Each vertex, independent of degree, has equal
weight for this score.
#### Usage
Directed and undirected variants are provided. The algorithm takes a simple graph as input
and outputs a result
containing the total number of vertices and average local clustering coefficient of the graph.
The graph ID type must be
+Directed and undirected variants are provided. The analytics take a simple graph as input
and output an `AnalyticResult`
+containing the total number of vertices and average clustering coefficient of the graph.
The graph ID type must be
`Comparable` and `Copyable`.
* `setLittleParallelism`: override the parallelism of operators processing small amounts
of data
@@ 246,12 +235,14 @@ neighbors) to 1.0 (complete graph).
#### Details
See the [Local Clustering Coefficient](#localclusteringcoefficient) library method for
a detailed explanation of
clustering coefficient.
+clustering coefficient. The Global Clustering Coefficient is the ratio of connected neighbors
over the entire graph.
+Vertices with higher degrees have greater weight for this score because the count of neighbor
pairs is quadratic in
+degree.
#### Usage
Directed and undirected variants are provided. The algorithm takes a simple graph as input
and outputs a result
containing the total number of triplets and triangles in the graph. The graph ID type must
be `Comparable` and
`Copyable`.
+Directed and undirected variants are provided. The analytics take a simple graph as input
and output an `AnalyticResult`
+containing the total number of triplets and triangles in the graph. The result class provides
a method to compute the
+global clustering coefficient score. The graph ID type must be `Comparable` and `Copyable`.
* `setLittleParallelism`: override the parallelism of operators processing small amounts
of data
@@ 262,19 +253,20 @@ The local clustering coefficient measures the connectedness of each
vertex's nei
edges between neighbors) to 1.0 (neighborhood is a clique).
#### Details
An edge between a vertex's neighbors is a triangle. Counting edges between neighbors is equivalent
to counting the
+An edge between neighbors of a vertex is a triangle. Counting edges between neighbors is
equivalent to counting the
number of triangles which include the vertex. The clustering coefficient score is the number
of edges between neighbors
divided by the number of potential edges between neighbors.
See the [Triangle Enumerator](#triangleenumerator) library method for a detailed explanation
of triangle enumeration.
+See the [Triangle Listing](#trianglelisting) library method for a detailed explanation of
triangle enumeration.
#### Usage
Directed and undirected variants are provided. The algorithms take a simple graph as input
and outputs a `DataSet` of
tuples containing the vertex ID, vertex degree, and number of triangles containing the vertex.
The graph ID type must be
`Comparable` and `Copyable`.
+Directed and undirected variants are provided. The algorithms take a simple graph as input
and output a `DataSet` of
+`UnaryResult` containing the vertex ID, vertex degree, and number of triangles containing
the vertex. The result class
+provides a method to compute the local clustering coefficient score. The graph ID type must
be `Comparable` and
+`Copyable`.
* `setLittleParallelism`: override the parallelism of operators processing small amounts
of data
* `setIncludeZeroDegreeVertices`: include results for vertices with a degree of zero
+* `setLittleParallelism`: override the parallelism of operators processing small amounts
of data
### Triadic Census
@@ 284,29 +276,35 @@ or unconnected. The [Triadic Census](http://vlado.fmf.unilj.si/pub/networks/doc
occurrences of each type of triad with the graph.
#### Details
The analytics counts the four undirected triad types (formed with 0, 1, 2, or 3 connecting
edges) or 16 directed triad
+This analytic counts the four undirected triad types (formed with 0, 1, 2, or 3 connecting
edges) or 16 directed triad
types by counting the triangles from [Triangle Listing](#trianglelisting) and running [Vertex
Metrics](#vertexmetrics)
to obtain the number of triplets and edges. Triangle counts are then deducted from triplet
counts, and triangle and
triplet counts are removed from edge counts.
#### Usage
Directed and undirected variants are provided. The analytics take a simple graph as input
and outputs a `Result` with
accessor methods for the computed statistics. The graph ID type must be `Comparable` and
`Copyable`.
+Directed and undirected variants are provided. The analytics take a simple graph as input
and output an
+`AnalyticResult` with accessor methods for querying the count of each triad type. The graph
ID type must be
+`Comparable` and `Copyable`.
* `setLittleParallelism`: override the parallelism of operators processing small amounts
of data
### Triangle Listing
#### Overview
This algorithm supports object reuse. The graph ID type must be `Comparable` and `Copyable`.
+Enumerates all triangles in the graph. A triangle is composed of three edges connecting three
vertices into cliques of
+size 3.
#### Details
See the [Triangle Enumerator](#triangleenumerator) library method for implementation details.
+Triangles are listed by joining open triplets (two edges with a common neighbor) against
edges on the triplet endpoints.
+This implementation uses optimizations from
+[Schank's algorithm](http://i11www.iti.unikarlsruhe.de/extra/publications/swfclt05_t.pdf)
to improve performance with
+highdegree vertices. Triplets are generated from the lowest degree vertex since each triangle
need only be listed once.
+This greatly reduces the number of generated triplets which is quadratic in vertex degree.
#### Usage
Directed and undirected variants are provided. The algorithms take a simple graph as input
and outputs a `DataSet` of
tuples containing the three triangle vertices and, for the directed algorithm, a bitmask
marking each of the six
potential triangle edges. The graph ID type must be `Comparable` and `Copyable`.
+Directed and undirected variants are provided. The algorithms take a simple graph as input
and output a `DataSet` of
+`TertiaryResult` containing the three triangle vertices and, for the directed algorithm,
a bitmask marking each of the
+six potential edges connecting the three vertices. The graph ID type must be `Comparable`
and `Copyable`.
* `setLittleParallelism`: override the parallelism of operators processing small amounts
of data
* `setSortTriangleVertices`: normalize the triangle listing such that for each result (K0,
K1, K2) the vertex IDs are sorted K0 < K1 < K2
@@ 324,11 +322,13 @@ good authorities and good authorities are those pointed to by many good
hubs.
Every vertex is assigned the same initial hub and authority scores. The algorithm then iteratively
updates the scores
until termination. During each iteration new hub scores are computed from the authority scores,
then new authority
scores are computed from the new hub scores. The scores are then normalized and optionally
tested for convergence.
+HITS is similar to [PageRank](#pagerank) but vertex scores are emitted in full to each neighbor
whereas in PageRank
+the vertex score is first divided by the number of neighbors.
#### Usage
The algorithm takes a directed graph as input and outputs a `DataSet` of `Tuple3` containing
the vertex ID, hub score,
and authority score. Termination is configured with a maximum number of iterations and/or
a convergence threshold
on the sum of the change in each score for each vertex between iterations.
+The algorithm takes a simple directed graph as input and outputs a `DataSet` of `UnaryResult`
containing the vertex ID,
+hub score, and authority score. Termination is configured by the number of iterations and/or
a convergence threshold on
+the iteration sum of the change in scores over all vertices.
* `setParallelism`: override the operator parallelism
@@ 377,8 +377,8 @@ The statistics are computed over vertex degrees generated from `degree.annotate.
`degree.annotate.undirected.VertexDegree`.
#### Usage
Directed and undirected variants are provided. The analytics take a simple graph as input
and output a `Result` with
accessor methods for the computed statistics. The graph ID type must be `Comparable`.
+Directed and undirected variants are provided. The analytics take a simple graph as input
and output an `AnalyticResult`
+with accessor methods for the computed statistics. The graph ID type must be `Comparable`.
* `setIncludeZeroDegreeVertices`: include results for vertices with a degree of zero
* `setParallelism`: override the operator parallelism
@@ 398,8 +398,8 @@ The statistics are computed over edge degrees generated from `degree.annotate.di
`degree.annotate.undirected.EdgeDegreePair` and grouped by vertex.
#### Usage
Directed and undirected variants are provided. The analytics take a simple graph as input
and output a `Result` with
accessor methods for the computed statistics. The graph ID type must be `Comparable`.
+Directed and undirected variants are provided. The analytics take a simple graph as input
and output an `AnalyticResult`
+with accessor methods for the computed statistics. The graph ID type must be `Comparable`.
* `setParallelism`: override the operator parallelism
* `setReduceOnTargetId` (undirected only): the degree can be counted from either the edge
source or target IDs. By default the source IDs are counted. Reducing on target IDs may optimize
the algorithm if the input edge list is sorted by target ID
@@ 416,13 +416,13 @@ influential to each pair of neighbors.
#### Details
The algorithm first annotates each vertex with the inverse of the logarithm of the vertex
degree then joins this score
onto edges by source vertex. Grouping on the source vertex, each pair of neighbors is emitted
with the vertex score.
Grouping on twopaths, the AdamicAdar score is summed.
+Grouping on vertex pairs, the AdamicAdar score is summed.
See the [Jaccard Index](#jaccardindex) library method for a similar algorithm.
#### Usage
The algorithm takes a simple, undirected graph as input and outputs a `DataSet` of tuples
containing two vertex IDs and
the AdamicAdair similarity score. The graph ID type must be `Comparable` and `Copyable`.
+The algorithm takes a simple undirected graph as input and outputs a `DataSet` of `BinaryResult`
containing two vertex
+IDs and the AdamicAdar similarity score. The graph ID type must be `Copyable`.
* `setLittleParallelism`: override the parallelism of operators processing small amounts
of data
* `setMinimumRatio`: filter out AdamicAdar scores less than the given ratio times the average
score
@@ 441,12 +441,12 @@ distinct neighbors is computed by storing the sum of degrees of the
vertex pair
neighbors, which are doublecounted in the sum of degrees.
The algorithm first annotates each edge with the target vertex's degree. Grouping on the
source vertex, each pair of
neighbors is emitted with the degree sum. Grouping on twopaths, the shared neighbors are
counted.
+neighbors is emitted with the degree sum. Grouping on vertex pairs, the shared neighbors
are counted.
#### Usage
The algorithm takes a simple, undirected graph as input and outputs a `DataSet` of tuples
containing two vertex IDs,
+The algorithm takes a simple undirected graph as input and outputs a `DataSet` of tuples
containing two vertex IDs,
the number of shared neighbors, and the number of distinct neighbors. The result class provides
a method to compute the
Jaccard Index score. The graph ID type must be `Comparable` and `Copyable`.
+Jaccard Index score. The graph ID type must be `Copyable`.
* `setLittleParallelism`: override the parallelism of operators processing small amounts
of data
* `setMaximumScore`: filter out Jaccard Index scores greater than or equal to the given maximum
fraction
http://gitwipus.apache.org/repos/asf/flink/blob/cb9e409b/flinklibraries/flinkgelly/src/main/java/org/apache/flink/graph/library/clustering/directed/LocalClusteringCoefficient.java

diff git a/flinklibraries/flinkgelly/src/main/java/org/apache/flink/graph/library/clustering/directed/LocalClusteringCoefficient.java
b/flinklibraries/flinkgelly/src/main/java/org/apache/flink/graph/library/clustering/directed/LocalClusteringCoefficient.java
index cad36b3..ffd4b13 100644
 a/flinklibraries/flinkgelly/src/main/java/org/apache/flink/graph/library/clustering/directed/LocalClusteringCoefficient.java
+++ b/flinklibraries/flinkgelly/src/main/java/org/apache/flink/graph/library/clustering/directed/LocalClusteringCoefficient.java
@@ 44,11 +44,11 @@ import static org.apache.flink.api.common.ExecutionConfig.PARALLELISM_DEFAULT;
* The local clustering coefficient measures the connectedness of each vertex's
* neighborhood. Scores range from 0.0 (no edges between neighbors) to 1.0
* (neighborhood is a clique).
 * <br/>
+ * <p>
* An edge between a vertex's neighbors is a triangle. Counting edges between
* neighbors is equivalent to counting the number of triangles which include
* the vertex.
 * <br/>
+ * <p>
* The input graph must be a simple graph containing no duplicate edges or
* selfloops.
*
http://gitwipus.apache.org/repos/asf/flink/blob/cb9e409b/flinklibraries/flinkgelly/src/main/java/org/apache/flink/graph/library/clustering/directed/TriadicCensus.java

diff git a/flinklibraries/flinkgelly/src/main/java/org/apache/flink/graph/library/clustering/directed/TriadicCensus.java
b/flinklibraries/flinkgelly/src/main/java/org/apache/flink/graph/library/clustering/directed/TriadicCensus.java
index 6c9e7b6..2274e3e 100644
 a/flinklibraries/flinkgelly/src/main/java/org/apache/flink/graph/library/clustering/directed/TriadicCensus.java
+++ b/flinklibraries/flinkgelly/src/main/java/org/apache/flink/graph/library/clustering/directed/TriadicCensus.java
@@ 40,7 +40,7 @@ import static org.apache.flink.api.common.ExecutionConfig.PARALLELISM_DEFAULT;
/**
* A triad is formed by three connected or unconnected vertices in a graph.
* The triadic census counts the occurrences of each type of triad.
 * <br/>
+ * <p>
* http://vlado.fmf.unilj.si/pub/networks/doc/triads/triads.pdf
*
* @param <K> graph ID type
http://gitwipus.apache.org/repos/asf/flink/blob/cb9e409b/flinklibraries/flinkgelly/src/main/java/org/apache/flink/graph/library/clustering/directed/TriangleListing.java

diff git a/flinklibraries/flinkgelly/src/main/java/org/apache/flink/graph/library/clustering/directed/TriangleListing.java
b/flinklibraries/flinkgelly/src/main/java/org/apache/flink/graph/library/clustering/directed/TriangleListing.java
index a1006c4..6b3e2a1 100644
 a/flinklibraries/flinkgelly/src/main/java/org/apache/flink/graph/library/clustering/directed/TriangleListing.java
+++ b/flinklibraries/flinkgelly/src/main/java/org/apache/flink/graph/library/clustering/directed/TriangleListing.java
@@ 51,11 +51,15 @@ import static org.apache.flink.api.common.ExecutionConfig.PARALLELISM_DEFAULT;
/**
* Generates a listing of distinct triangles from the input graph.
 * <br/>
+ * <p>
* A triangle is a 3clique with vertices A, B, and C connected by edges
* (A, B), (A, C), and (B, C).
 * <br/>
+ * <p>
* The input graph must not contain duplicate edges or selfloops.
+ * <p>
+ * This algorithm is similar to the undirected version but also tracks and
+ * computes a bitmask representing the six potential graph edges connecting
+ * the triangle vertices.
*
* @param <K> graph ID type
* @param <VV> vertex value type
http://gitwipus.apache.org/repos/asf/flink/blob/cb9e409b/flinklibraries/flinkgelly/src/main/java/org/apache/flink/graph/library/clustering/undirected/LocalClusteringCoefficient.java

diff git a/flinklibraries/flinkgelly/src/main/java/org/apache/flink/graph/library/clustering/undirected/LocalClusteringCoefficient.java
b/flinklibraries/flinkgelly/src/main/java/org/apache/flink/graph/library/clustering/undirected/LocalClusteringCoefficient.java
index 99e3db9..31ddf45 100644
 a/flinklibraries/flinkgelly/src/main/java/org/apache/flink/graph/library/clustering/undirected/LocalClusteringCoefficient.java
+++ b/flinklibraries/flinkgelly/src/main/java/org/apache/flink/graph/library/clustering/undirected/LocalClusteringCoefficient.java
@@ 44,11 +44,11 @@ import static org.apache.flink.api.common.ExecutionConfig.PARALLELISM_DEFAULT;
* The local clustering coefficient measures the connectedness of each vertex's
* neighborhood. Scores range from 0.0 (no edges between neighbors) to 1.0
* (neighborhood is a clique).
 * <br/>
+ * <p>
* An edge between a vertex's neighbors is a triangle. Counting edges between
* neighbors is equivalent to counting the number of triangles which include
* the vertex.
 * <br/>
+ * <p>
* The input graph must be a simple, undirected graph containing no duplicate
* edges or selfloops.
*
http://gitwipus.apache.org/repos/asf/flink/blob/cb9e409b/flinklibraries/flinkgelly/src/main/java/org/apache/flink/graph/library/clustering/undirected/TriadicCensus.java

diff git a/flinklibraries/flinkgelly/src/main/java/org/apache/flink/graph/library/clustering/undirected/TriadicCensus.java
b/flinklibraries/flinkgelly/src/main/java/org/apache/flink/graph/library/clustering/undirected/TriadicCensus.java
index 3f59d0e..7482af0 100644
 a/flinklibraries/flinkgelly/src/main/java/org/apache/flink/graph/library/clustering/undirected/TriadicCensus.java
+++ b/flinklibraries/flinkgelly/src/main/java/org/apache/flink/graph/library/clustering/undirected/TriadicCensus.java
@@ 38,10 +38,10 @@ import static org.apache.flink.api.common.ExecutionConfig.PARALLELISM_DEFAULT;
/**
* A triad is formed by three connected or unconnected vertices in a graph.
* The triadic census counts the occurrences of each type of triad.
 * <br/>
+ * <p>
* The four types of undirected triads are formed with 0, 1, 2, or 3
* connecting edges.
 * <br/>
+ * <p>
* http://vlado.fmf.unilj.si/pub/networks/doc/triads/triads.pdf
*
* @param <K> graph ID type
http://gitwipus.apache.org/repos/asf/flink/blob/cb9e409b/flinklibraries/flinkgelly/src/main/java/org/apache/flink/graph/library/clustering/undirected/TriangleListing.java

diff git a/flinklibraries/flinkgelly/src/main/java/org/apache/flink/graph/library/clustering/undirected/TriangleListing.java
b/flinklibraries/flinkgelly/src/main/java/org/apache/flink/graph/library/clustering/undirected/TriangleListing.java
index 8850125..09b9a5d 100644
 a/flinklibraries/flinkgelly/src/main/java/org/apache/flink/graph/library/clustering/undirected/TriangleListing.java
+++ b/flinklibraries/flinkgelly/src/main/java/org/apache/flink/graph/library/clustering/undirected/TriangleListing.java
@@ 48,15 +48,16 @@ import static org.apache.flink.api.common.ExecutionConfig.PARALLELISM_DEFAULT;
/**
* Generates a listing of distinct triangles from the input graph.
 * <br/>
+ * <p>
* A triangle is a 3cycle with vertices A, B, and C connected by edges
* (A, B), (A, C), and (B, C).
 * <br/>
+ * <p>
* The input graph must be a simple, undirected graph containing no duplicate
* edges or selfloops.
 * <br/>
 * Algorithm from "Graph Twiddling in a MapReduce World", J. D. Cohen,
 * http://lintool.github.io/UMDcourses/bigdata2015Spring/content/Cohen_2009.pdf
+ * <p>
+ * Algorithm from "Finding, Counting and Listing all Triangles in Large Graphs,
+ * An Experimental Study", Thomas Schank and Dorothea Wagner.
+ * http://i11www.iti.unikarlsruhe.de/extra/publications/swfclt05_t.pdf
*
* @param <K> graph ID type
* @param <VV> vertex value type
@@ 178,7 +179,7 @@ extends GraphAlgorithmWrappingDataSet<K, VV, EV, Tuple3<K, K, K>>
{
/**
* Removes edge values while filtering such that only edges where the
* source vertex ID compares less than the target vertex ID are emitted.
 * <br/>
+ * <p>
* Since the input graph is a simple graph this filter removes exactly half
* of the original edges.
*
@@ 206,7 +207,7 @@ extends GraphAlgorithmWrappingDataSet<K, VV, EV, Tuple3<K, K, K>>
{
* vertex has lower degree are emitted. If the source and target vertex
* degrees are equal then the edge is emitted if the source vertex ID
* compares less than the target vertex ID.
 * <br/>
+ * <p>
* Since the input graph is a simple graph this filter removes exactly half
* of the original edges.
*
http://gitwipus.apache.org/repos/asf/flink/blob/cb9e409b/flinklibraries/flinkgelly/src/main/java/org/apache/flink/graph/library/link_analysis/HITS.java

diff git a/flinklibraries/flinkgelly/src/main/java/org/apache/flink/graph/library/link_analysis/HITS.java
b/flinklibraries/flinkgelly/src/main/java/org/apache/flink/graph/library/link_analysis/HITS.java
index ecc1ad7..f4195f7 100644
 a/flinklibraries/flinkgelly/src/main/java/org/apache/flink/graph/library/link_analysis/HITS.java
+++ b/flinklibraries/flinkgelly/src/main/java/org/apache/flink/graph/library/link_analysis/HITS.java
@@ 53,9 +53,11 @@ import static org.apache.flink.api.common.ExecutionConfig.PARALLELISM_DEFAULT;
* HyperlinkInduced Topic Search computes two interdependent scores for every
* vertex in a directed graph. A good "hub" links to good "authorities" and
* good "authorities" are linked from good "hubs".
 *
+ * <p>
* This algorithm can be configured to terminate either by a limit on the number
* of iterations, a convergence threshold, or both.
+ * <p>
+ * http://www.cs.cornell.edu/home/kleinber/auth.pdf
*
* http://www.cs.cornell.edu/home/kleinber/auth.pdf
*
http://gitwipus.apache.org/repos/asf/flink/blob/cb9e409b/flinklibraries/flinkgelly/src/main/java/org/apache/flink/graph/library/similarity/AdamicAdar.java

diff git a/flinklibraries/flinkgelly/src/main/java/org/apache/flink/graph/library/similarity/AdamicAdar.java
b/flinklibraries/flinkgelly/src/main/java/org/apache/flink/graph/library/similarity/AdamicAdar.java
index 00819e4..a7ba00a 100644
 a/flinklibraries/flinkgelly/src/main/java/org/apache/flink/graph/library/similarity/AdamicAdar.java
+++ b/flinklibraries/flinkgelly/src/main/java/org/apache/flink/graph/library/similarity/AdamicAdar.java
@@ 53,16 +53,16 @@ import static org.apache.flink.api.common.ExecutionConfig.PARALLELISM_DEFAULT;
/**
* http://social.cs.uiuc.edu/class/cs591kgk/friendsadamic.pdf
 * <br/>
+ * <p>
* AdamicAdar measures the similarity between pairs of vertices as the sum of
* the inverse logarithm of degree over shared neighbors. Scores are nonnegative
* and unbounded. A vertex with higher degree has greater overall influence but
* is less influential to each pair of neighbors.
 * <br/>
+ * <p>
* This implementation produces similarity scores for each pair of vertices
* in the graph with at least one shared neighbor; equivalently, this is the
* set of all nonzero AdamicAdar coefficients.
 * <br/>
+ * <p>
* The input graph must be a simple, undirected graph containing no duplicate
* edges or selfloops.
*
http://gitwipus.apache.org/repos/asf/flink/blob/cb9e409b/flinklibraries/flinkgelly/src/main/java/org/apache/flink/graph/library/similarity/JaccardIndex.java

diff git a/flinklibraries/flinkgelly/src/main/java/org/apache/flink/graph/library/similarity/JaccardIndex.java
b/flinklibraries/flinkgelly/src/main/java/org/apache/flink/graph/library/similarity/JaccardIndex.java
index 148d541..11ec73d 100644
 a/flinklibraries/flinkgelly/src/main/java/org/apache/flink/graph/library/similarity/JaccardIndex.java
+++ b/flinklibraries/flinkgelly/src/main/java/org/apache/flink/graph/library/similarity/JaccardIndex.java
@@ 48,11 +48,11 @@ import static org.apache.flink.api.common.ExecutionConfig.PARALLELISM_DEFAULT;
* is computed as the number of shared neighbors divided by the number of
* distinct neighbors. Scores range from 0.0 (no shared neighbors) to 1.0 (all
* neighbors are shared).
 * <br/>
+ * <p>
* This implementation produces similarity scores for each pair of vertices
* in the graph with at least one shared neighbor; equivalently, this is the
* set of all nonzero Jaccard Similarity coefficients.
 * <br/>
+ * <p>
* The input graph must be a simple, undirected graph containing no duplicate
* edges or selfloops.
*
@@ 240,10 +240,10 @@ extends GraphAlgorithmWrappingDataSet<K, VV, EV, Result<K>>
{
* This is the first of three operations implementing a selfjoin to generate
* the full neighbor pairing for each vertex. The number of neighbor pairs
* is (n choose 2) which is quadratic in the vertex degree.
 * <br/>
+ * <p>
* The third operation, {@link GenerateGroupPairs}, processes groups of size
* {@link #groupSize} and emits {@code O(groupSize * deg(vertex))} pairs.
 * <br/>
+ * <p>
* This input to the third operation is still quadratic in the vertex degree.
* Two prior operations, {@link GenerateGroupSpans} and {@link GenerateGroups},
* each emit datasets linear in the vertex degree, with a forced rebalance
@@ 318,7 +318,7 @@ extends GraphAlgorithmWrappingDataSet<K, VV, EV, Result<K>>
{
/**
* Emits the twopath for all neighbor pairs in this group.
 * <br/>
+ * <p>
* The first {@link #groupSize} vertices are emitted pairwise. Following
* vertices are only paired with vertices from this initial group.
*
http://gitwipus.apache.org/repos/asf/flink/blob/cb9e409b/flinklibraries/flinkgelly/src/main/java/org/apache/flink/graph/utils/proxy/OptionalBoolean.java

diff git a/flinklibraries/flinkgelly/src/main/java/org/apache/flink/graph/utils/proxy/OptionalBoolean.java
b/flinklibraries/flinkgelly/src/main/java/org/apache/flink/graph/utils/proxy/OptionalBoolean.java
index 70a1294..7a7208a 100644
 a/flinklibraries/flinkgelly/src/main/java/org/apache/flink/graph/utils/proxy/OptionalBoolean.java
+++ b/flinklibraries/flinkgelly/src/main/java/org/apache/flink/graph/utils/proxy/OptionalBoolean.java
@@ 22,7 +22,7 @@ import org.apache.flink.graph.GraphAlgorithm;
/**
* A multistate boolean.
 * <br/>
+ * <p>
* This class is used by {@link GraphAlgorithm} configuration options to set a
* default value which can be overwritten. The default value is also used when
* algorithm configurations are merged and conflict.
