giraph-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Claudio Martella <claudio.marte...@gmail.com>
Subject Re: Network flow problems in Giraph
Date Sat, 20 Apr 2013 17:42:46 GMT
One disclaimer first: not all iterative algorithms fit well to Giraph.
Specifically to your example, Dijkstra's method for single source shortest
path does not. The main general reason is that it does not scale in
parallel, and specifically to Giraph because it requires centralised shared
data. In fact, it makes more sense to run a less-efficient but massively
parallelizable algorithm: Breadth-first search (BFS).

Coming to maxflow problem, you'd have to look for one of the algorithms
that is iterative and fits easily to a local vertex-centric view (where
vertices exchange information through messages). Quite honestly, I haven't
had the need/chance to look at a maxflow problem till now, so I don't have
a solution for you right now. After a quick look, it appears to me that
http://en.wikipedia.org/wiki/Dinitz_blocking_flow_algorithm *could* be a
good starting point.




On Thu, Apr 18, 2013 at 4:41 PM, Jay Hutfles <jayhutfles@gmail.com> wrote:

> Actually, I think a max flow problem fits exactly with the batch
> processing model you describe.  You are given a massive graph (with
> predefined maximum flows along the edges between vertices), you run a
> program, it produces an output (i.e. the flow along each edge) and it
> terminates.  It's not necessarily adapting to any new inputs as it runs.
>  But it is an iterative process.  Or, at least, many algorithms are.
>
> I don't see it as being that different from Djikstra's Method for finding
> the shortest distance between two nodes on a graph.  Each super step is
> updating the labels along the graph, and when all notes are labelled as
> done, the algorithm finishes.  A max flow problem could be implemented
> likewise, since there are labeling algorithms for determining the max flow
> along a graph.
>
> See http://en.wikipedia.org/wiki/Maximum_flow_problem, specifically the
> Solutions section.  I think it should help clarify.
>
>
>
> On Thu, Apr 18, 2013 at 9:16 AM, André Kelpe <
> efeshundertelf@googlemail.com> wrote:
>
>> Hi Jay,
>>
>> this sounds like a continuous operation to me. Giraph is meant for
>> batch processing of massive graphs, which produces an output after a
>> successful run. You run a program, it produces an output, it
>> terminates. From what I understand, a stream processing framework like
>> storm (https://github.com/nathanmarz/storm/wiki/Rationale) could be a
>> better fit for this. Please let me know, if I am missing something.
>>
>> André
>>
>> 2013/4/18 Jay Hutfles <jayhutfles@gmail.com>:
>> > I'm new to Giraph, but am interested in its applications to classic
>> network
>> > flow problems, specifically max flow or min cost problems.  I've looked
>> for
>> > BSP implementations of algorithms for these problems, but I can't find
>> any
>> > discussion regarding this online.  Has anyone had luck implementing such
>> > problems?
>> >
>> >
>> > The max flow problem seems like it should be adaptable to the BSP model.
>> > The flow augmenting algorithm developed by Ford and Fulkerson is
>> > essentially:
>> >
>> > while the graph contains a path over which flow could be increased,
>> >    increase flow for arcs on the path
>> >
>> > Identifying the flow augmenting paths is a simple labeling algorithm,
>> but
>> > I'm not sure how I'd implement the "while the graph contains ..."
>> condition.
>> > Is that a super step above the labeling algorithm's super steps?
>> >
>> >
>> > And I have no idea how to start the min cost algorithm.  Anyone have
>> ideas
>> > for how to formulate this?
>> >
>> > Thanks for your time, and for the great work on Giraph!
>>
>
>


-- 
   Claudio Martella
   claudio.martella@gmail.com

Mime
View raw message