giraph-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Pascal Jäger <>
Subject Re: Network flow problems in Giraph
Date Thu, 07 Nov 2013 16:24:16 GMT

I just stumbled over this conversation (while looking for something different).
I found this algorithm a while ago and implemented it for giraph.

It is pretty much straight forward as you just have to put the map and reduce phase both in
the compute method.
I used aggregators to collect augmenting paths and to calculate the AugmentedEdges array.

Hope that helps :)

Regards Pascal

Great point Claudio, I had considered this in regards to Dijkstra/BFS
before when considering our own shortest paths example, but have not looked
at maxflow myself. That makes a lot of sense. Considering our earlier phone
call, this sort of thing brings up ideas about how to best make clear in
documentation not just what Giraph can do, but how it should do it. Lots to
think about here...

On Sat, Apr 20, 2013 at 1:42 PM, Claudio Martella <> wrote:

> One disclaimer first: not all iterative algorithms fit well to Giraph.
> Specifically to your example, Dijkstra's method for single source shortest
> path does not. The main general reason is that it does not scale in
> parallel, and specifically to Giraph because it requires centralised shared
> data. In fact, it makes more sense to run a less-efficient but massively
> parallelizable algorithm: Breadth-first search (BFS).
> Coming to maxflow problem, you'd have to look for one of the algorithms
> that is iterative and fits easily to a local vertex-centric view (where
> vertices exchange information through messages). Quite honestly, I haven't
> had the need/chance to look at a maxflow problem till now, so I don't have
> a solution for you right now. After a quick look, it appears to me that
> *could* be a
> good starting point.
> On Thu, Apr 18, 2013 at 4:41 PM, Jay Hutfles <> wrote:
>> Actually, I think a max flow problem fits exactly with the batch
>> processing model you describe.  You are given a massive graph (with
>> predefined maximum flows along the edges between vertices), you run a
>> program, it produces an output (i.e. the flow along each edge) and it
>> terminates.  It's not necessarily adapting to any new inputs as it runs.
>>  But it is an iterative process.  Or, at least, many algorithms are.
>> I don't see it as being that different from Djikstra's Method for finding
>> the shortest distance between two nodes on a graph.  Each super step is
>> updating the labels along the graph, and when all notes are labelled as
>> done, the algorithm finishes.  A max flow problem could be implemented
>> likewise, since there are labeling algorithms for determining the max flow
>> along a graph.
>> See, specifically the
>> Solutions section.  I think it should help clarify.
>> On Thu, Apr 18, 2013 at 9:16 AM, André Kelpe <
>>> wrote:
>>> Hi Jay,
>>> this sounds like a continuous operation to me. Giraph is meant for
>>> batch processing of massive graphs, which produces an output after a
>>> successful run. You run a program, it produces an output, it
>>> terminates. From what I understand, a stream processing framework like
>>> storm ( could be a
>>> better fit for this. Please let me know, if I am missing something.
>>> André
>>> 2013/4/18 Jay Hutfles <>:
>>> > I'm new to Giraph, but am interested in its applications to classic
>>> network
>>> > flow problems, specifically max flow or min cost problems.  I've
>>> looked for
>>> > BSP implementations of algorithms for these problems, but I can't find
>>> any
>>> > discussion regarding this online.  Has anyone had luck implementing
>>> such
>>> > problems?
>>> >
>>> >
>>> > The max flow problem seems like it should be adaptable to the BSP
>>> model.
>>> > The flow augmenting algorithm developed by Ford and Fulkerson is
>>> > essentially:
>>> >
>>> > while the graph contains a path over which flow could be increased,
>>> >    increase flow for arcs on the path
>>> >
>>> > Identifying the flow augmenting paths is a simple labeling algorithm,
>>> but
>>> > I'm not sure how I'd implement the "while the graph contains ..."
>>> condition.
>>> > Is that a super step above the labeling algorithm's super steps?
>>> >
>>> >
>>> > And I have no idea how to start the min cost algorithm.  Anyone have
>>> ideas
>>> > for how to formulate this?
>>> >
>>> > Thanks for your time, and for the great work on Giraph!
> --
>    Claudio Martella

View raw message