flink-user mailing list archives

From Vasiliki Kalavri <vasilikikala...@gmail.com>
Subject Re: LDBC Graph Data into Flink
Date Tue, 06 Oct 2015 08:53:02 GMT
Hi Martin,

thanks a lot for sharing! This is a very useful tool.
I only had a quick look, but if we merge label and payload inside a Tuple2,
then it should also be Gelly-compatible :)
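
To make the Tuple2 idea concrete, here is a minimal sketch in plain Java. It
assumes that a Gelly vertex is an (id, value) pair, so the label and payload
would have to travel together as the vertex value. The classes Tuple2 and
Vertex below are simplified stand-ins for the Flink/Gelly types so that the
example runs without Flink on the classpath. LdbcVertex and toGellyVertex are
hypothetical names and not taken from the import tool.

```java
import java.util.Map;

public class LdbcToGelly {

    // Simplified stand-in for Flink's Tuple2 (fields f0 and f1).
    static class Tuple2<A, B> {
        final A f0;
        final B f1;

        Tuple2(A f0, B f1) {
            this.f0 = f0;
            this.f1 = f1;
        }
    }

    // Simplified stand-in for Gelly's Vertex: a pair of (id, value).
    static class Vertex<K, VV> extends Tuple2<K, VV> {
        Vertex(K id, VV value) {
            super(id, value);
        }

        K getId() {
            return f0;
        }

        VV getValue() {
            return f1;
        }
    }

    // Hypothetical shape of an imported LDBC vertex: id, label, payload.
    static class LdbcVertex {
        final long id;
        final String label;
        final Map<String, Object> payload;

        LdbcVertex(long id, String label, Map<String, Object> payload) {
            this.id = id;
            this.label = label;
            this.payload = payload;
        }
    }

    // Merge label and payload into one Tuple2 and use it as the vertex value.
    static Vertex<Long, Tuple2<String, Map<String, Object>>> toGellyVertex(LdbcVertex v) {
        return new Vertex<>(v.id, new Tuple2<>(v.label, v.payload));
    }

    public static void main(String[] args) {
        LdbcVertex person = new LdbcVertex(1L, "person", Map.of("firstName", "Alice"));
        Vertex<Long, Tuple2<String, Map<String, Object>>> gv = toGellyVertex(person);
        System.out.println(gv.getId() + " " + gv.getValue().f0 + " " + gv.getValue().f1);
    }
}
```

Edges could presumably be handled the same way: since a Gelly edge carries
only source, target and a single value, the edge id, label and payload would
all need to go into that value.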


On 6 October 2015 at 10:03, Martin Junghanns <m.junghanns@mailbox.org> wrote:

> Hi all,
> For our benchmarks with Flink, we are using a data generator provided by
> the LDBC project (Linked Data Benchmark Council) [1][2]. The generator uses
> MapReduce to create directed, labeled, attributed graphs that mimic
> properties of real online social networks (e.g., degree distribution,
> diameter). The output is stored in several files, either locally or in HDFS.
> Each file represents a vertex, edge or multi-valued property class.
> I wrote a little tool that parses and transforms the LDBC output into two
> datasets representing vertices and edges. Each vertex has a unique id, a
> label and payload according to the LDBC schema. Each edge has a unique id,
> a label, source and target vertex IDs and also payload according to the
> schema.
> I thought this may be useful for others so I put it on GitHub [3]. It
> currently uses Flink 0.10-SNAPSHOT as it depends on some fixes made in
> there.
> Best,
> Martin
> [1] http://ldbcouncil.org/
> [2] https://github.com/ldbc/ldbc_snb_datagen
> [3] https://github.com/s1ck/ldbc-flink-import
