incubator-giraph-user mailing list archives

From Gianmarco De Francisci Morales <g...@apache.org>
Subject Re: [Announcement] Giraph talk in Berlin on May 29th
Date Mon, 14 May 2012 21:03:48 GMT
Hi,


> It would be good to present users with a couple of non-trivial examples and
> one or two 'real' use cases where Apache Giraph is used for processing large
> graphs.
> Apache Giraph comes with two examples: all shortest paths from a single
> source and PageRank. Google's Pregel paper describes 'bipartite matching'
> and 'semi-clustering'. Is anyone working on implementing these in Giraph?
> Or, what if in the shortest paths example you actually want to know the
> path?
>
I have some toy code (not really well tested) that implements b-matching
(that is, matching with integer capacities on the nodes).
It's a simple greedy method, along the lines of the one described here:
www.vldb.org/pvldb/vol4/p460-morales.pdf

I can share it if you are interested.
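
To give the gist of the greedy idea in the meantime, here is a minimal
sequential sketch in Python. It is not the toy Giraph code mentioned above,
and it is not the distributed algorithm from the paper; it only assumes the
textbook greedy rule (take edges in decreasing weight order while both
endpoints still have capacity left). The function name, the
(u, v, weight) edge format and the capacity dictionary are all made up for
illustration.

```python
def greedy_b_matching(edges, capacity):
    """Greedy weighted b-matching (illustrative sketch, not Giraph code).

    edges:    iterable of (u, v, weight) tuples (hypothetical input format)
    capacity: dict mapping each node to its integer capacity b(node)
    Returns the list of selected edges, heaviest first.
    """
    remaining = dict(capacity)  # copy so the caller's dict is not modified
    matching = []
    # Heaviest edges first; ties are broken by endpoint names so that the
    # result is deterministic.
    for u, v, w in sorted(edges, key=lambda e: (-e[2], e[0], e[1])):
        # Skip self-loops and edges whose endpoints are already saturated.
        if u != v and remaining.get(u, 0) > 0 and remaining.get(v, 0) > 0:
            matching.append((u, v, w))
            remaining[u] -= 1
            remaining[v] -= 1
    return matching


edges = [("a", "x", 5), ("a", "y", 4), ("b", "x", 3), ("b", "y", 1)]
capacity = {"a": 1, "b": 2, "x": 1, "y": 2}
print(greedy_b_matching(edges, capacity))
```

In this small example the edge (a, x) is taken first, which saturates both a
and x, so (a, y) and (b, x) are skipped and only (b, y) can still be added. A
vertex-centric version would have to make the same decisions through messages
exchanged between neighbours across supersteps, which this sketch does not
attempt.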

Cheers,
--
Gianmarco



> It would be great to have examples on more advanced features: custom
> partitioning functions, aggregators, ...
>
> Personally, I'd like to see a side-by-side comparison of Google's Pregel as
> described in their paper and the Giraph implementation (I am particularly
> interested in where they diverge and why).
>
> Another question (or thing I am not so sure about) is about 'capacity
> planning' (sort of...). Given a dataset and an algorithm implemented in
> Giraph, how do you determine how many workers would be needed (in order to
> fit all your graph and messages for each superstep in RAM)?
>
> Last but not least, it seems to me that PageRank is what you use to
> 'benchmark' Giraph, is that the case? If so, sharing a common dataset for
> others to use would be a first step to allow people to compare the
> performance of different software running the very same algorithm, over
> the same data and the same hardware infrastructure.
>
> Paolo
>
> Sebastian Schelter wrote:
> > Hi,
> >
> > I will give a talk titled "Large Scale Graph Processing with Apache
> > Giraph" in Berlin on May 29th. Details are available at:
> >
> > https://www.xing.com/events/gameduell-tech-talk-on-the-topic-large-scale-graph-processing-with-apache-giraph-1092275
> >
> > Best,
> > Sebastian
>
>
