flink-dev mailing list archives

From Vasiliki Kalavri <vasilikikala...@gmail.com>
Subject Re: Queries regarding RDFs with Flink
Date Tue, 03 Mar 2015 19:32:51 GMT
Hi Flavio,

if you want to use Gelly to model your data as a graph, you can load your
Tuple3s as Edges.
This will make "http://test/John", "Person", "Frank", etc. vertices, and
"type", "name", "knows" edge values.
In that case, you can use filterOnEdges() to get the subgraph that contains
only the relation edges (e.g. "knows").
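
To make the modelling step concrete, here is a minimal plain-Java sketch. It
does not use the Gelly API; the class TripleGraphSketch, its methods, and the
choice of "knows" and "marriedWith" as relation predicates are assumptions
made for illustration only. It shows how each triple becomes an edge from
subject to object with the predicate as edge value, and how a filter on the
edge value keeps only the relation edges.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Plain-Java illustration only; none of these names come from Gelly.
class TripleGraphSketch {

    /** One triple seen as a directed edge: subject -> object, predicate as edge value. */
    static final class Edge {
        final String source;
        final String target;
        final String value;

        Edge(String source, String target, String value) {
            this.source = source;
            this.target = target;
            this.value = value;
        }
    }

    /** The triples of TABLE A and TABLE B from the quoted message. */
    static List<String[]> sampleTriples() {
        return Arrays.asList(
            new String[] {"http://test/John", "type", "Person"},
            new String[] {"http://test/John", "name", "John"},
            new String[] {"http://test/John", "knows", "http://test/Mary"},
            new String[] {"http://test/John", "knows", "http://test/Jerry"},
            new String[] {"http://test/Jerry", "type", "Person"},
            new String[] {"http://test/Jerry", "name", "Jerry"},
            new String[] {"http://test/Jerry", "knows", "http://test/Frank"},
            new String[] {"http://test/Mary", "type", "Person"},
            new String[] {"http://test/Mary", "name", "Mary"},
            new String[] {"http://test/Frank", "type", "Person"},
            new String[] {"http://test/Frank", "name", "Frank"},
            new String[] {"http://test/Frank", "marriedWith", "http://test/Mary"});
    }

    /** Turns (subject, predicate, object) triples into edges. */
    static List<Edge> toEdges(List<String[]> triples) {
        List<Edge> edges = new ArrayList<>();
        for (String[] t : triples) {
            edges.add(new Edge(t[0], t[2], t[1]));
        }
        return edges;
    }

    /** Every distinct subject or object becomes a vertex, literals included. */
    static Set<String> vertices(List<Edge> edges) {
        Set<String> result = new TreeSet<>();
        for (Edge e : edges) {
            result.add(e.source);
            result.add(e.target);
        }
        return result;
    }

    /** Keeps only the edges whose value is one of the given relation predicates. */
    static List<Edge> relationEdges(List<Edge> edges, Set<String> relationPredicates) {
        List<Edge> result = new ArrayList<>();
        for (Edge e : edges) {
            if (relationPredicates.contains(e.value)) {
                result.add(e);
            }
        }
        return result;
    }
}
```

Note that literals such as "Person" and "John" end up as vertices too, which
is why filtering on the edge value is needed before traversing relations.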

Once you have the graph, you could probably use a vertex-centric iteration
to generate the trees.
It seems to me that you need something like a BFS from each vertex. Keep in
mind that this can be a very costly operation in terms of memory and
communication for large graphs.
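
As a single-machine sketch of the "BFS from each vertex" idea, the plain-Java
code below builds the tree for one root by following relation edges and
collecting every triple of each node it reaches. It is not a vertex-centric
iteration and does not use Gelly; the class PersonTreeSketch, the method
treeFor, and the set of relation predicates are hypothetical names chosen for
illustration. It follows all relation edges reachable from the root, which is
an assumption about how far the paths should be expanded.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Plain-Java illustration only; none of these names come from Flink or Gelly.
class PersonTreeSketch {

    /** The triples of TABLE A and TABLE B from the quoted message. */
    static List<String[]> sampleTriples() {
        return Arrays.asList(
            new String[] {"http://test/John", "type", "Person"},
            new String[] {"http://test/John", "name", "John"},
            new String[] {"http://test/John", "knows", "http://test/Mary"},
            new String[] {"http://test/John", "knows", "http://test/Jerry"},
            new String[] {"http://test/Jerry", "type", "Person"},
            new String[] {"http://test/Jerry", "name", "Jerry"},
            new String[] {"http://test/Jerry", "knows", "http://test/Frank"},
            new String[] {"http://test/Mary", "type", "Person"},
            new String[] {"http://test/Mary", "name", "Mary"},
            new String[] {"http://test/Frank", "type", "Person"},
            new String[] {"http://test/Frank", "name", "Frank"},
            new String[] {"http://test/Frank", "marriedWith", "http://test/Mary"});
    }

    /**
     * Breadth-first traversal from root: emits all triples of each visited node
     * and follows only triples whose predicate is a relation.
     */
    static List<String[]> treeFor(String root, List<String[]> triples, Set<String> relations) {
        // Index triples by subject so each node's triples can be looked up directly.
        Map<String, List<String[]>> bySubject = new HashMap<>();
        for (String[] t : triples) {
            bySubject.computeIfAbsent(t[0], k -> new ArrayList<>()).add(t);
        }

        List<String[]> tree = new ArrayList<>();
        Set<String> visited = new HashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        visited.add(root);
        queue.add(root);

        while (!queue.isEmpty()) {
            String node = queue.poll();
            for (String[] t : bySubject.getOrDefault(node, Collections.<String[]>emptyList())) {
                tree.add(t);
                // visited.add returns false for nodes already seen, so cycles terminate.
                if (relations.contains(t[1]) && visited.add(t[2])) {
                    queue.add(t[2]);
                }
            }
        }
        return tree;
    }
}
```

Calling treeFor once per Person root reproduces the cost concern above: every
root runs its own traversal, so overlapping subtrees are visited repeatedly.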

Let me know if you have any questions!

Cheers,
V.

On 3 March 2015 at 09:13, Flavio Pompermaier <pompermaier@okkam.it> wrote:

> I have a nice case of RDF manipulation :)
> Let's say I have the following RDF triples (Tuple3) in two files or tables:
>
> TABLE A:
> http://test/John, type, Person
> http://test/John, name, John
> http://test/John, knows, http://test/Mary
> http://test/John, knows, http://test/Jerry
> http://test/Jerry, type, Person
> http://test/Jerry, name, Jerry
> http://test/Jerry, knows, http://test/Frank
> http://test/Mary, type, Person
> http://test/Mary, name, Mary
>
> TABLE B:
> http://test/Frank, type, Person
> http://test/Frank, name, Frank
> http://test/Frank, marriedWith, http://test/Mary
>
> What is the best way to build Person-rooted trees with each node's data
> properties and some expanded paths like 'Person.knows.marriedWith'?
> Is it better to use the Graph/Gelly APIs, Flink joins, multiple point
> lookups against a key/value store, or something else?
>
> The expected 4 trees should be:
>
> tree 1 (root is John) ------------------
> http://test/John, type, Person
> http://test/John, name, John
> http://test/John, knows, http://test/Mary
> http://test/John, knows, http://test/Jerry
> http://test/Jerry, type, Person
> http://test/Jerry, name, Jerry
> http://test/Jerry, knows, http://test/Frank
> http://test/Mary, type, Person
> http://test/Mary, name, Mary
> http://test/Frank, type, Person
> http://test/Frank, name, Frank
> http://test/Frank, marriedWith, http://test/Mary
>
> tree 2 (root is Jerry) ------------------
> http://test/Jerry, type, Person
> http://test/Jerry, name, Jerry
> http://test/Jerry, knows, http://test/Frank
> http://test/Frank, type, Person
> http://test/Frank, name, Frank
> http://test/Frank, marriedWith, http://test/Mary
> http://test/Mary, type, Person
> http://test/Mary, name, Mary
>
> tree 3 (root is Mary) ------------------
> http://test/Mary, type, Person
> http://test/Mary, name, Mary
>
> tree 4 (root is Frank) ------------------
> http://test/Frank, type, Person
> http://test/Frank, name, Frank
> http://test/Frank, marriedWith, http://test/Mary
> http://test/Mary, type, Person
> http://test/Mary, name, Mary
>
> Thanks in advance,
> Flavio
>
> On Mon, Mar 2, 2015 at 5:04 PM, Stephan Ewen <sewen@apache.org> wrote:
>
> > Hey Santosh!
> >
> > RDF processing often involves either joins or graph-query-like operations
> > (transitive). Flink is fairly good at both types of operations.
> >
> > I would look into the graph examples and the graph API for a start:
> >
> >  - Graph examples:
> >    https://github.com/apache/flink/tree/master/flink-examples/flink-java-examples/src/main/java/org/apache/flink/examples/java/graph
> >  - Graph API:
> >    https://github.com/apache/flink/tree/master/flink-staging/flink-gelly/src/main/java/org/apache/flink/graph
> >
> > If you have a more specific question, I can give you better pointers ;-)
> >
> > Stephan
> >
> >
> > On Fri, Feb 27, 2015 at 4:48 PM, santosh_rajaguru <sanit4u@gmail.com>
> > wrote:
> >
> > > Hello,
> > >
> > > How can Flink be used to process data into RDF and build an
> > > ontology?
> > >
> > > Regards,
> > > Santosh
