flink-dev mailing list archives

From Stephan Ewen <se...@apache.org>
Subject Re: Queries regarding RDFs with Flink
Date Sun, 22 Mar 2015 11:20:38 GMT
Gelly has a section in the docs that should explain the vertex-centric
iterations. Is that not extensive enough?
On 22.03.2015 at 12:04, "Flavio Pompermaier" <pompermaier@okkam.it> wrote:

> Hi Stephan,
> thanks for the response. Unfortunately I'm not familiar with the new Gelly
> APIs and the old Spargel ones (I still don't understand the difference
> actually).
> Do you think it is possible to add such an example to the
> documentation/examples?
>
> Best,
> Flavio
>
>
>
> On Sat, Mar 21, 2015 at 7:48 PM, Stephan Ewen <sewen@apache.org> wrote:
>
> > Hi Flavio!
> >
> > I see initially two ways of doing this:
> >
> > 1) Do a series of joins. You start with your subject and join two or three
> > times on the condition "objects-from-triplets == subject"; each join makes
> > one hop. You can filter the triplets by verb beforehand if you are only
> > interested in a specific relationship.
> >
> > 2) If you want to recursively explode the subgraph (something like all
> > reachable subjects) or do a rather long series of hops, then you should be
> > able to model this nicely as a delta iteration, or as a vertex-centric
> > graph computation. For that, you can use either "Gelly" (the graph library)
> > or the standalone Spargel operator (Giraph-like).
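
To illustrate option 1 without committing to a particular Flink API, here is a minimal plain-Python sketch of the join-per-hop idea. The triples are the example data quoted further down in this thread; the helper names `hop` and `follow_path` are made up for this illustration and are not Flink functions. In Flink, each `hop` would roughly correspond to one join of the current frontier against the verb-filtered triples.

```python
from collections import defaultdict

# Example triples from this thread: (subject, verb, object).
TRIPLES = [
    ("http://test/John", "type", "Person"),
    ("http://test/John", "name", "John"),
    ("http://test/John", "knows", "http://test/Mary"),
    ("http://test/John", "knows", "http://test/Jerry"),
    ("http://test/Jerry", "type", "Person"),
    ("http://test/Jerry", "name", "Jerry"),
    ("http://test/Jerry", "knows", "http://test/Frank"),
    ("http://test/Mary", "type", "Person"),
    ("http://test/Mary", "name", "Mary"),
    ("http://test/Frank", "type", "Person"),
    ("http://test/Frank", "name", "Frank"),
    ("http://test/Frank", "marriedWith", "http://test/Mary"),
]


def hop(frontier, triples, verb):
    """One hop: join frontier nodes with triples on node == subject,
    keeping only triples with the given verb, and return the objects."""
    objects_by_subject = defaultdict(set)
    for subject, v, obj in triples:
        if v == verb:  # filter the verbs before joining
            objects_by_subject[subject].add(obj)
    result = set()
    for node in frontier:
        result |= objects_by_subject.get(node, set())
    return result


def follow_path(root, triples, path):
    """Follow a fixed path of verbs (e.g. knows.marriedWith) from root."""
    frontier = {root}
    for verb in path:
        frontier = hop(frontier, triples, verb)
    return frontier
```

For example, `follow_path("http://test/Jerry", TRIPLES, ["knows", "marriedWith"])` walks Jerry -> Frank -> Mary. A fixed number of joins like this only fits short, known paths; open-ended expansion is the case for option 2.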
> >
> > Does that help with your questions?
> >
> > Greetings,
> > Stephan
> >
> >
> > On Thu, Mar 19, 2015 at 2:57 PM, Flavio Pompermaier <pompermaier@okkam.it> wrote:
> >
> > > Hi to all,
> > > I'm back to this task again :)
> > >
> > > Summarizing again: I have a source dataset that contains RDF "stars"
> > > (SubjectURI, RdfType and a list of RDF triples belonging to this subject,
> > > a.k.a. the "star schema")
> > > and I have to extract some sub-graphs for some RDF types of interest.
> > > As described in the previous email I'd like to expand some root node (if
> > > its type is of interest) and explode some of its path(s).
> > > For example, if I'm interested in the expansion of rdf type Person (as in
> > > the example), I might want to create a mini-graph with all of its triples
> > > plus those obtained by exploding the path(s)
> > > knows.marriedWith and knows.knows.knows.
> > > At the moment I do it with point lookups (gets) from HBase, but I couldn't
> > > tell whether this could be done more efficiently with other strategies in
> > > Flink.
> > > @Vasiliki: you said that I might need "something like a BFS from each
> > > vertex". Do you have an example that could fit my use case? Is it possible
> > > to filter out those vertices I'm interested in?
> > >
> > > Thanks in advance,
> > > Flavio
> > >
> > >
> > > On Tue, Mar 3, 2015 at 8:32 PM, Vasiliki Kalavri <vasilikikalavri@gmail.com> wrote:
> > >
> > > > Hi Flavio,
> > > >
> > > > if you want to use Gelly to model your data as a graph, you can load your
> > > > Tuple3s as Edges.
> > > > This will result in "http://test/John", "Person", "Frank", etc. being
> > > > vertices and "type", "name", "knows" being edge values.
> > > > In the first case, you can use filterOnEdges() to get the subgraph with
> > > > the relation edges.
> > > >
> > > > Once you have the graph, you could probably use a vertex-centric iteration
> > > > to generate the trees.
> > > > It seems to me that you need something like a BFS from each vertex. Keep in
> > > > mind that this can be a very costly operation in terms of memory and
> > > > communication for large graphs.
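
As a rough illustration of the two steps described here (keep only the relation edges, then run a BFS in rounds), the following plain-Python sketch may help. It is not Gelly code: `filter_on_edges` only mimics the idea of filtering a graph by edge value, `reachable` is a hypothetical helper, and the choice of `knows` and `marriedWith` as "relation" verbs is an assumption taken from the example in this thread. Each loop round plays the role of one superstep, in which newly reached vertices notify their out-neighbours.

```python
from collections import defaultdict

# Example triples from this thread, read as edges: (source, edge value, target).
EDGES = [
    ("http://test/John", "type", "Person"),
    ("http://test/John", "name", "John"),
    ("http://test/John", "knows", "http://test/Mary"),
    ("http://test/John", "knows", "http://test/Jerry"),
    ("http://test/Jerry", "type", "Person"),
    ("http://test/Jerry", "name", "Jerry"),
    ("http://test/Jerry", "knows", "http://test/Frank"),
    ("http://test/Mary", "type", "Person"),
    ("http://test/Mary", "name", "Mary"),
    ("http://test/Frank", "type", "Person"),
    ("http://test/Frank", "name", "Frank"),
    ("http://test/Frank", "marriedWith", "http://test/Mary"),
]

# Assumption: these verbs link resources; "type" and "name" are data properties.
RELATION_VERBS = {"knows", "marriedWith"}


def filter_on_edges(edges, keep):
    """Return the subgraph made of the edges for which keep(edge) is true."""
    return [edge for edge in edges if keep(edge)]


def reachable(source, edges):
    """BFS in rounds: returns the reached vertices and the number of rounds
    in which at least one new vertex was discovered."""
    out_neighbours = defaultdict(set)
    for src, _value, dst in edges:
        out_neighbours[src].add(dst)

    reached = {source}
    frontier = {source}
    supersteps = 0
    while frontier:
        # Only newly discovered vertices stay active in the next round.
        new_vertices = set()
        for vertex in frontier:
            new_vertices |= out_neighbours.get(vertex, set())
        new_vertices -= reached
        if new_vertices:
            supersteps += 1
        reached |= new_vertices
        frontier = new_vertices
    return reached, supersteps


relation_edges = filter_on_edges(EDGES, lambda e: e[1] in RELATION_VERBS)
```

Running `reachable` once per root vertex is what makes the "BFS from each vertex" costly on large graphs: the work and the messages grow with the number of roots times the size of each reachable subgraph.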
> > > >
> > > > Let me know if you have any questions!
> > > >
> > > > Cheers,
> > > > V.
> > > >
> > > > On 3 March 2015 at 09:13, Flavio Pompermaier <pompermaier@okkam.it> wrote:
> > > >
> > > > > I have a nice case of RDF manipulation :)
> > > > > Let's say I have the following RDF triples (Tuple3) in two files or
> > > > > tables:
> > > > >
> > > > > TABLE A:
> > > > > http://test/John, type, Person
> > > > > http://test/John, name, John
> > > > > http://test/John, knows, http://test/Mary
> > > > > http://test/John, knows, http://test/Jerry
> > > > > http://test/Jerry, type, Person
> > > > > http://test/Jerry, name, Jerry
> > > > > http://test/Jerry, knows, http://test/Frank
> > > > > http://test/Mary, type, Person
> > > > > http://test/Mary, name, Mary
> > > > >
> > > > > TABLE B:
> > > > > http://test/Frank, type, Person
> > > > > http://test/Frank, name, Frank
> > > > > http://test/Frank, marriedWith, http://test/Mary
> > > > >
> > > > > What is the best way to build up Person-rooted trees with all of each
> > > > > node's data properties and some expanded paths like
> > > > > 'Person.knows.marriedWith'?
> > > > > Is it better to use the Graph/Gelly APIs, Flink joins, multiple point
> > > > > lookups (gets) from a key/value store, or something else?
> > > > >
> > > > > The expected 4 trees should be:
> > > > >
> > > > > tree 1 (root is John) ------------------
> > > > > http://test/John, type, Person
> > > > > http://test/John, name, John
> > > > > http://test/John, knows, http://test/Mary
> > > > > http://test/John, knows, http://test/Jerry
> > > > > http://test/Jerry, type, Person
> > > > > http://test/Jerry, name, Jerry
> > > > > http://test/Jerry, knows, http://test/Frank
> > > > > http://test/Mary, type, Person
> > > > > http://test/Mary, name, Mary
> > > > > http://test/Frank, type, Person
> > > > > http://test/Frank, name, Frank
> > > > > http://test/Frank, marriedWith, http://test/Mary
> > > > >
> > > > > tree 2 (root is Jerry) ------------------
> > > > > http://test/Jerry, type, Person
> > > > > http://test/Jerry, name, Jerry
> > > > > http://test/Jerry, knows, http://test/Frank
> > > > > http://test/Frank, type, Person
> > > > > http://test/Frank, name, Frank
> > > > > http://test/Frank, marriedWith, http://test/Mary
> > > > > http://test/Mary, type, Person
> > > > > http://test/Mary, name, Mary
> > > > >
> > > > > tree 3 (root is Mary) ------------------
> > > > > http://test/Mary, type, Person
> > > > > http://test/Mary, name, Mary
> > > > >
> > > > > tree 4 (root is Frank) ------------------
> > > > > http://test/Frank, type, Person
> > > > > http://test/Frank, name, Frank
> > > > > http://test/Frank, marriedWith, http://test/Mary
> > > > > http://test/Mary, type, Person
> > > > > http://test/Mary, name, Mary
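
The four expected trees above can be reproduced with a small breadth-first expansion. This is only a plain-Python sketch of the logic, not Flink or Gelly code; `build_tree` and the `max_hops` parameter are hypothetical names, and treating `knows` and `marriedWith` as the only verbs to follow is an assumption based on the example data.

```python
from collections import defaultdict, deque

# Tables A and B from the example, merged: (subject, verb, object).
TRIPLES = [
    ("http://test/John", "type", "Person"),
    ("http://test/John", "name", "John"),
    ("http://test/John", "knows", "http://test/Mary"),
    ("http://test/John", "knows", "http://test/Jerry"),
    ("http://test/Jerry", "type", "Person"),
    ("http://test/Jerry", "name", "Jerry"),
    ("http://test/Jerry", "knows", "http://test/Frank"),
    ("http://test/Mary", "type", "Person"),
    ("http://test/Mary", "name", "Mary"),
    ("http://test/Frank", "type", "Person"),
    ("http://test/Frank", "name", "Frank"),
    ("http://test/Frank", "marriedWith", "http://test/Mary"),
]

# Assumption: only these verbs are expanded; all other triples are kept as data.
RELATION_VERBS = frozenset({"knows", "marriedWith"})


def build_tree(root, triples, relation_verbs=RELATION_VERBS, max_hops=None):
    """Collect all triples of root and of every subject reachable from it
    through relation verbs, visiting each subject once (breadth-first)."""
    triples_by_subject = defaultdict(list)
    for triple in triples:
        triples_by_subject[triple[0]].append(triple)

    visited = {root}
    queue = deque([(root, 0)])
    tree = []
    while queue:
        subject, depth = queue.popleft()
        for s, verb, obj in triples_by_subject.get(subject, []):
            tree.append((s, verb, obj))
            within_limit = max_hops is None or depth < max_hops
            if verb in relation_verbs and obj not in visited and within_limit:
                visited.add(obj)
                queue.append((obj, depth + 1))
    return tree


# One tree per subject whose type is Person.
roots = [s for s, verb, obj in TRIPLES if verb == "type" and obj == "Person"]
trees = {root: build_tree(root, TRIPLES) for root in roots}
```

With the example data this yields trees of 12, 8, 2 and 5 triples for John, Jerry, Mary and Frank, matching the listings above (the order of triples inside a tree may differ). The sketch follows every relation verb to any depth; restricting it to specific paths such as knows.marriedWith would need an extra check on the path taken so far.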
> > > > >
> > > > > Thanks in advance,
> > > > > Flavio
> > > > >
> > > > > On Mon, Mar 2, 2015 at 5:04 PM, Stephan Ewen <sewen@apache.org> wrote:
> > > > >
> > > > > > Hey Santosh!
> > > > > >
> > > > > > RDF processing often involves either joins or graph-query-like
> > > > > > (transitive) operations. Flink is fairly good at both types of operations.
> > > > > >
> > > > > > I would look into the graph examples and the graph API for a start:
> > > > > >
> > > > > >  - Graph examples:
> > > > > >    https://github.com/apache/flink/tree/master/flink-examples/flink-java-examples/src/main/java/org/apache/flink/examples/java/graph
> > > > > >  - Graph API:
> > > > > >    https://github.com/apache/flink/tree/master/flink-staging/flink-gelly/src/main/java/org/apache/flink/graph
> > > > > >
> > > > > > If you have a more specific question, I can give you better pointers ;-)
> > > > > >
> > > > > > Stephan
> > > > > >
> > > > > >
> > > > > > On Fri, Feb 27, 2015 at 4:48 PM, santosh_rajaguru <sanit4u@gmail.com> wrote:
> > > > > >
> > > > > > > Hello,
> > > > > > >
> > > > > > > How can Flink be useful for processing data into RDF and building
> > > > > > > the ontology?
> > > > > > >
> > > > > > > Regards,
> > > > > > > Santosh
> > > > > > >
