giraph-user mailing list archives

From Marco Aurelio Barbosa Fagnani Lotz <m.a.b.l...@stu12.qmul.ac.uk>
Subject RE: Dynamic Graphs
Date Fri, 06 Sep 2013 02:16:24 GMT
Hello all,

To answer Mr. Kämpf's question: in my opinion this tool would indeed be very useful,
since many real-world graphs are dynamic.
I have just finished a report on my research on the subject. The report is available at:

https://github.com/MarcoLotz/dynamicGraph/blob/master/LotzReport.pdf?raw=true

A first application that can do this injection already exists and is described in section 2.7.
I am currently working on the minor modifications proposed in the document.
The earlier sections just describe some experiences that I had with Giraph and introduce
the scenario.

Best Regards,
Marco Lotz
________________________________
From: Mirko Kämpf <mirko.kaempf@cloudera.com>
Sent: 25 August 2013 07:55
To: user@giraph.apache.org
Subject: Re: Dynamic Graphs

Good morning Gentlemen,

As far as I understand your thread, you are talking about the same topic I have been thinking
about and working on for some time.
I work on a research project focused on the evolution of networks and on network dynamics in
networks of networks.

My understanding of Marco's question is that he needs to change node properties, or even add
nodes to the graph, while it is being processed. Is that right?

With the WorkerContext we could construct a "Connector" to the outside world, and not just for
loading data from HDFS, which also requires a preprocessing step for the data to be loaded.
I often think about HBase, since all my nodes and edges live there. From HBase it is quite
easy to load new data with a simple "Scan". The WorkerContext could even trigger a Hive
or Pig script that automatically reorganizes or extracts the relevant new links / nodes to
be added to the graph.

Such an approach means that after n supersteps of the Giraph layer an additional utility-step
is executed (triggered via WorkerContext, or any other better-fitting Giraph class; I am not
sure yet where to start). Before such a step, the state of the graph is persisted to allow
fallback or resume. The utility-step can be a processing operation (MR, Mahout) or just a load
operation (from HDFS, HBase), and it allows a kind of clocked data flow directly into a running
Giraph application. I think this is a very important feature for Complex Systems research, as
we have interacting layers which change in parallel. In this picture, the Giraph steps are the
steps of layer A, let's say something that is going on on top of a network, and the utility-step
expresses the changes in the underlying structure. These changes affect the network itself but
are based on the data / properties of the second subsystem, e.g. the agents operating on top
of the network.
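
The clocked flow described above can be sketched as a simple schedule. This is only an
illustration under assumptions: the class and method names are hypothetical, nothing here
calls Giraph, and the persist and utility actions are represented by log entries rather than
real checkpointing or HBase/HDFS loads.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a clocked schedule; it does not use any Giraph classes. */
final class ClockedFlowSketch {

    /**
     * Returns the ordered event log of a run with the given number of supersteps,
     * where the graph state is persisted and a utility-step runs after every n supersteps.
     */
    static List<String> schedule(int supersteps, int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        List<String> events = new ArrayList<>();
        for (int s = 0; s < supersteps; s++) {
            events.add("superstep " + s);
            // A boundary is reached once n supersteps have completed since the last one.
            boolean boundary = (s + 1) % n == 0;
            if (boundary) {
                // Persist first, so the run can fall back or resume if the utility-step fails.
                events.add("checkpoint after superstep " + s);
                // Stand-in for a processing (MR, Mahout) or load (HDFS, HBase) operation.
                events.add("utility-step after superstep " + s);
            }
        }
        return events;
    }
}
```

For example, `ClockedFlowSketch.schedule(4, 2)` yields two supersteps, a checkpoint, a
utility-step, and then the same pattern once more.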

I once created a tool that worked like this, though not at scale, and that was before Giraph
existed. What do you think: is there a need for this kind of extension in the Giraph world?

Have a nice Sunday.

Best wishes
Mirko

--
Mirko Kämpf

Trainer @ Cloudera

tel: +49 176 20 63 51 99
skype: kamir1604
mirko@cloudera.com



On Wed, Aug 21, 2013 at 3:30 PM, Claudio Martella <claudio.martella@gmail.com> wrote:
As I said, the injection of the new vertices/edges would have to be done "manually", hence
without any support of the infrastructure. I'd suggest you implement a WorkerContext class
that supports the reading of a specific file with a specific format (under your control) from
HDFS, and that is accessed by this particular "special" vertex (e.g. based on the vertex ID).

Does this make sense?
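
The file-based approach described above could start from a parser like the following. It is
a minimal sketch under assumptions: the line format (superstep, source ID, target ID,
separated by whitespace, with "#" comments) is invented for illustration, the class names are
hypothetical, and reading the lines from HDFS inside a WorkerContext is left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical parser for a self-defined mutation file; not part of Giraph. */
final class MutationFileSketch {

    /** One edge to be injected into the running graph. */
    static final class EdgeInsert {
        final long source;
        final long target;

        EdgeInsert(long source, long target) {
            this.source = source;
            this.target = target;
        }
    }

    /**
     * Groups edge insertions by the superstep in which they should be injected.
     * Assumed line format: "superstep sourceId targetId"; blank lines and
     * lines starting with '#' are skipped.
     */
    static Map<Long, List<EdgeInsert>> parse(List<String> lines) {
        Map<Long, List<EdgeInsert>> bySuperstep = new TreeMap<>();
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue;
            }
            String[] fields = line.split("\\s+");
            if (fields.length != 3) {
                throw new IllegalArgumentException("Malformed mutation line: " + raw);
            }
            long superstep = Long.parseLong(fields[0]);
            EdgeInsert insert =
                new EdgeInsert(Long.parseLong(fields[1]), Long.parseLong(fields[2]));
            bySuperstep.computeIfAbsent(superstep, k -> new ArrayList<>()).add(insert);
        }
        return bySuperstep;
    }
}
```

The "special" vertex could then look up the current superstep in the returned map and request
the listed mutations.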


On Wed, Aug 21, 2013 at 2:13 PM, Marco Aurelio Barbosa Fagnani Lotz <m.a.b.lotz@stu12.qmul.ac.uk> wrote:
Dear Mr. Martella,

Once the conditions for updating the vertex database are met, what is the best way for the
Injector Vertex to call an input reader again?

I am able to access all the HDFS data, but I guess the vertex would need to have access to
the input splits and also to the vertex input format that I designate. Am I correct? Or is
there a way to simply ask ZooKeeper to create new splits from a given path in DFS and
distribute them to the workers?

Best Regards,
Marco Lotz
________________________________
From: Claudio Martella <claudio.martella@gmail.com>
Sent: 14 August 2013 15:25
To: user@giraph.apache.org
Subject: Re: Dynamic Graphs

Hi Marco,

Giraph currently does not support that. One way of doing this would be to have a specific
(pseudo-)vertex act as the "injector" of the new vertices and edges. For example, it would
read a file from HDFS and call the mutable API during the computation, superstep after superstep.
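
The injector idea can be illustrated with a toy simulation. This is a sketch under loud
assumptions: ToyGraph, InjectorSketch and INJECTOR_ID are hypothetical names, the graph is a
plain in-memory map rather than Giraph's vertex store, and mutations are applied immediately
instead of going through Giraph's mutable API.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy in-memory graph standing in for the real vertex store (hypothetical, not Giraph). */
final class ToyGraph {
    final Map<Long, Set<Long>> adjacency = new HashMap<>();

    void addVertex(long id) {
        adjacency.computeIfAbsent(id, k -> new HashSet<>());
    }

    void addEdge(long source, long target) {
        // Create missing endpoints so that an edge can introduce new vertices.
        addVertex(source);
        addVertex(target);
        adjacency.get(source).add(target);
    }
}

/** Hypothetical per-vertex compute step in which only one reserved ID injects changes. */
final class InjectorSketch {
    /** Assumed reserved ID for the injector pseudo-vertex. */
    static final long INJECTOR_ID = -1L;

    /**
     * Simulates one vertex's work in one superstep. Ordinary vertices do nothing here;
     * the injector applies the edges scheduled for this superstep, each given as
     * a two-element array {source, target}.
     */
    static void compute(long vertexId, long superstep,
                        Map<Long, List<long[]>> pendingBySuperstep, ToyGraph graph) {
        if (vertexId != INJECTOR_ID) {
            return;
        }
        List<long[]> batch =
            pendingBySuperstep.getOrDefault(superstep, Collections.<long[]>emptyList());
        for (long[] edge : batch) {
            graph.addEdge(edge[0], edge[1]);
        }
    }
}
```

In a real job the pending map would be filled from the file on HDFS, and the calls to
`graph.addEdge` would be replaced by the framework's mutation requests.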


On Wed, Aug 14, 2013 at 3:02 PM, Marco Aurelio Barbosa Fagnani Lotz <m.a.b.lotz@stu12.qmul.ac.uk> wrote:
Hello all,

I would like to know whether there is any way to use dynamic graphs with Giraph. By dynamic
I mean graphs that may change while Giraph is computing. The changes are in the input file
and are not caused by the graph computation itself.

Is there any way to analyse such graphs using Giraph? If not, does anyone have an idea or
suggestion on whether the framework could be modified to process them?

Best Regards,
Marco Lotz



--
   Claudio Martella
   claudio.martella@gmail.com