flink-issues mailing list archives

From fobeligi <...@git.apache.org>
Subject [GitHub] flink pull request #2178: [Flink-1815] Add methods to read and write a Graph...
Date Tue, 28 Jun 2016 21:36:45 GMT
Github user fobeligi commented on a diff in the pull request:

    --- Diff: flink-libraries/flink-gelly/src/main/java/org/apache/flink/graph/Graph.java
    @@ -408,6 +408,79 @@ public static GraphCsvReader fromCsvReader(String edgesPath, ExecutionEnvironmen
    +	 * Creates a graph from an Adjacency List text file with Vertex Key values. Edges will be created automatically.
    +	 *
    +	 * @param filePath a path to an Adjacency List text file with the Vertex data
    +	 * @param context  the execution environment.
    +	 * @return An instance of {@link org.apache.flink.graph.GraphAdjacencyListReader},
    +	 * on which calling methods to specify types of the Vertex ID, Vertex value and Edge value returns a Graph.
    +	 */
    +	public static GraphAdjacencyListReader fromAdjacencyListFile(String filePath, ExecutionEnvironment context) {
    +		return new GraphAdjacencyListReader(filePath, context);
    +	}
    +	/**
    +	 * Writes a graph as an Adjacency List formatted text file in a user specified folder.
    +	 *
    +	 * @param filePath   the path that the Adjacency List formatted text file should be written in
    +	 * @param delimiters the delimiters that separate the different value types in the Adjacency List formatted text
    +	 *                   file. Delimiters should be provided with the following order:
    +	 *                   NEIGHBOR_DELIMITER : separating source from its neighbors
    +	 *                   VERTICES_DELIMITER : separating the different neighbors of a source
    +	 *                   VERTEX_VALUE_DELIMITER: separating the source vertex-id from the vertex value, as well as the
    +	 *                   target vertex-ids from the edge value.
    +	 */
    +	public void writeAsAdjacencyList(String filePath, String... delimiters) {
    +		final String NEIGHBOR_DELIMITER = delimiters.length > 0 ? delimiters[0] : "\t";
    +		final String VERTICES_DELIMITER = delimiters.length > 1 ? delimiters[1] : ",";
    +		final String VERTEX_VALUE_DELIMITER = delimiters.length > 2 ? delimiters[2] : "-";
    +		DataSet<Tuple2<K, VV>> vertices = this.getVerticesAsTuple2();
    +		DataSet<Tuple3<K, K, EV>> edgesNValues = this.getEdgesAsTuple3();
    --- End diff --
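
To make the delimiter roles in the Javadoc above concrete, here is a minimal plain-Java sketch with no Flink dependency. The class `AdjacencyLineFormat` and its methods are hypothetical names used only for illustration, and the line layout (source id and value first, then the neighbors) is one reading of the Javadoc, not the PR's confirmed output format. Emitting only the source part for a vertex without neighbors is also an assumption.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Gelly: illustrates the three delimiter roles.
class AdjacencyLineFormat {

    // Resolve the delimiters positionally, falling back to the defaults from the diff.
    // Each index is guarded by a matching length check before it is read.
    static String[] resolveDelimiters(String... delimiters) {
        String neighbor = delimiters.length > 0 ? delimiters[0] : "\t";
        String vertices = delimiters.length > 1 ? delimiters[1] : ",";
        String vertexValue = delimiters.length > 2 ? delimiters[2] : "-";
        return new String[] {neighbor, vertices, vertexValue};
    }

    // Assumed layout: srcId<VV>srcValue<ND>tgtId<VV>edgeValue(<VD>tgtId<VV>edgeValue)*
    // Each neighbor is a two-element array: {targetId, edgeValue}.
    static String formatLine(String srcId, String srcValue,
                             List<String[]> neighbors, String... delimiters) {
        String[] d = resolveDelimiters(delimiters);
        StringBuilder line = new StringBuilder(srcId).append(d[2]).append(srcValue);
        for (int i = 0; i < neighbors.size(); i++) {
            // First neighbor follows the NEIGHBOR_DELIMITER, later ones the VERTICES_DELIMITER.
            line.append(i == 0 ? d[0] : d[1]);
            line.append(neighbors.get(i)[0]).append(d[2]).append(neighbors.get(i)[1]);
        }
        return line.toString();
    }

    public static void main(String[] args) {
        List<String[]> neighbors = Arrays.asList(
                new String[] {"2", "x"}, new String[] {"3", "y"});
        System.out.println(formatLine("1", "a", neighbors));
    }
}
```

Passing only two delimiters exercises the fallback for the third one, which is why the guard on index 2 has to check for a length greater than 2.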
    As I see now, we don't have to convert the vertex set to a Tuple2 set, so I have already changed that.
    Regarding the edges dataset: to write the Adjacency List file, I apply a coGroup transformation to the Vertex dataset and the EdgesAsTuple3 dataset, where the vertexId equals the source of the edge.
    That way, even when a Vertex is the source of no edges (e.g. it has only incoming edges), I still get its vertexId in the "coGrouped" dataset (I couldn't do that with a join).
    I can't think of a way to use the Edge dataset in a coGroup or similar transformation.
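
The difference between the two transformations can be sketched in plain Java, without any Flink dependency. The class `CoGroupSketch` and both methods are hypothetical and only emulate the semantics on in-memory lists. As a simplification, the coGroup emulation iterates over vertex ids only, whereas a real coGroup would also produce groups for keys that appear only on the edge side.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration: inner join vs. coGroup on vertexId == edge source.
class CoGroupSketch {

    // Inner-join semantics: output exists only for matching (vertex, edge) pairs,
    // so a vertex without outgoing edges never appears in the result.
    static Map<Long, List<Long>> joinOnSource(List<Long> vertexIds, List<long[]> edges) {
        Map<Long, List<Long>> out = new LinkedHashMap<>();
        for (Long v : vertexIds) {
            for (long[] e : edges) {
                if (e[0] == v) {
                    out.computeIfAbsent(v, k -> new ArrayList<>()).add(e[1]);
                }
            }
        }
        return out;
    }

    // coGroup semantics: one group per key, even when the edge side is empty,
    // so a vertex with only incoming edges is kept with an empty neighbor list.
    static Map<Long, List<Long>> coGroupOnSource(List<Long> vertexIds, List<long[]> edges) {
        Map<Long, List<Long>> out = new LinkedHashMap<>();
        for (Long v : vertexIds) {
            List<Long> targets = new ArrayList<>();
            for (long[] e : edges) {
                if (e[0] == v) {
                    targets.add(e[1]);
                }
            }
            out.put(v, targets);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Long> vertexIds = Arrays.asList(1L, 2L, 3L);
        // Each edge is {source, target}; vertex 3 has only incoming edges.
        List<long[]> edges = Arrays.asList(
                new long[] {1, 2}, new long[] {1, 3}, new long[] {2, 3});
        System.out.println("join keys:    " + joinOnSource(vertexIds, edges).keySet());
        System.out.println("coGroup keys: " + coGroupOnSource(vertexIds, edges).keySet());
    }
}
```

In this sketch vertex 3 is missing from the join result but present, with an empty neighbor list, in the coGroup result, which is the property the adjacency list writer relies on.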

    Please let me know if you have any suggestions.
