Hey,
you can preprocess your data, create the vertices and store them to a file,
like you would store any other Flink DataSet, e.g. with writeAsText.
Then, you can create the graph by reading 2 datasets, like this:
DataSet<Vertex<MyPojoNode, NullValue>> vertices =
    env.readTextFile("/path/to/vertices/")... // or your custom reading logic
DataSet<Edge<MyPojoNode, Integer>> edges = ...
Graph<MyPojoNode, NullValue, Integer> graph = Graph.fromDataSet(vertices, edges, env);
Is this what you're looking for?
Also, note that if you have a very large graph, you should avoid using
collect() and fromCollection(), since both move the whole dataset through
the client program instead of keeping it distributed.
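For the "custom reading logic" part, here is a minimal sketch of one way to
turn a POJO vertex into a line of text and back. It assumes a hypothetical
MyPojoNode with only an id and a label (your real class will have other
fields) and a tab-separated format that I picked just for illustration.
VertexLineCodec, toLine and fromLine are made-up names, not part of Flink or
Gelly.

```java
final class VertexLineCodec {

    // Hypothetical stand-in for MyPojoNode; the real fields are not shown in this thread.
    public static class MyPojoNode {
        public long id;
        public String label;

        // Flink's POJO type rules require a public no-argument constructor.
        public MyPojoNode() {}

        public MyPojoNode(long id, String label) {
            this.id = id;
            this.label = label;
        }
    }

    // Serializes one vertex as "<id>\t<label>".
    // Assumption: the label never contains a tab or a newline.
    static String toLine(MyPojoNode v) {
        return v.id + "\t" + v.label;
    }

    // Parses a line produced by toLine back into a MyPojoNode.
    static MyPojoNode fromLine(String line) {
        int tab = line.indexOf('\t');
        if (tab < 0) {
            throw new IllegalArgumentException("Malformed vertex line: " + line);
        }
        long id = Long.parseLong(line.substring(0, tab));
        String label = line.substring(tab + 1);
        return new MyPojoNode(id, label);
    }

    public static void main(String[] args) {
        // Round trip: serialize a vertex, parse it, and compare the serialized forms.
        MyPojoNode original = new MyPojoNode(42L, "user-42");
        String line = toLine(original);
        MyPojoNode restored = fromLine(line);
        System.out.println(line.equals(toLine(restored)));
    }
}
```

You would call toLine inside a map before writeAsText, and fromLine inside a
map after readTextFile, and then wrap each parsed object in a Vertex before
passing the dataset to Graph.fromDataSet.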
Vasia.
On 25 November 2015 at 18:03, Stefanos Antaris <antaris.stefanos@gmail.com>
wrote:
> Hi Vasia,
> my graph object is the following:
> Graph<MyPojoNode, NullValue, Integer> graph = Graph.fromCollection(
> edgeList.collect(), env);
> The vertex itself (the ID) is a POJO, not the vertex value. So the problem
> is: how can I store and retrieve the vertex list?
>
> Thanks,
> Stefanos
> On 25 Nov 2015, at 18:16, Vasiliki Kalavri <vasilikikalavri@gmail.com>
> wrote:
>
> Hi Stefane,
>
> let me know if I understand the problem correctly. The vertex values are
> POJOs that you're somehow inferring from the edge list and this value
> creation is what takes a lot of time? Since a graph is just a set of 2
> datasets (vertices and edges), you could store the values to disk and have
> a custom input format to read them into datasets. Would that work for you?
> Vasia.
>
> On 25 November 2015 at 15:09, Stefanos Antaris <antaris.stefanos@gmail.com>
> wrote:
>> Hi all,
>>
>> I am working on a project with Gelly and I need to create a graph with
>> billions of nodes. Although I have the edge list, each node in the graph
>> needs to be a POJO, and constructing these POJOs takes a long time before
>> the final graph can be created. Is it possible to store the Graph object
>> in a file and load it whenever I want to run an experiment?
>> Thanks,
>> Stefanos
