hama-dev mailing list archives

From Thomas Jungblut <thomas.jungb...@googlemail.com>
Subject Re: Online machine learning on top of Hama BSP
Date Sat, 26 May 2012 09:26:12 GMT
Hi Ted,

Please keep this factual; we are not here to start a flame war.
But to correct you, if you take a closer look at the mailing list
statistics [1]:
hama-commits: 1.51 mails per day (AVG)
As opposed to Giraph:
giraph-commits: 0.68 mails per day (AVG)
So, going by commit activity, our development is moving faster than Giraph's.
Also, we work on top of HDFS, so you can combine MapReduce jobs with BSP
jobs easily.
We are just not running inside of MapReduce, and that difference will stop
mattering anyway once YARN has a stable release.
Currently Hama can operate on YARN with its own ApplicationMaster, whereas
Giraph still needs to run on top of MapReduce.

Now to you Sebastian,

> Interesting discussion, which examples do you have in mind that might be
> easier representable in general BSP than in Giraph/Pregel?

Straightforward translations from MPI, for example. One of us is
currently working on an SVM implementation in BSP, which was originally
based on MPI. [2]
We would love to have this contributed to Mahout, but if Ted is not
interested in Hama we will put it in our own modules.
Also there are graph problems that need major supervision like Top-K
Shortest Paths, which cannot be easily expressed with aggregators.
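
To make the "supervision" pattern a bit more concrete, here is a minimal
sketch in plain Python. It is not Hama's API and not a Top-K Shortest Paths
implementation; it only simulates, in a single process, two supersteps in
which every peer sends its local candidates to a master peer, and the master
merges them and broadcasts the global result. The function name
run_bsp_top_k, the master id 0, and the message layout are all assumptions
made for illustration.

```python
import heapq
from collections import defaultdict


def run_bsp_top_k(local_values, k, master=0):
    """Simulate two BSP supersteps in one process.

    Hypothetical helper for illustration only; not part of Hama.
    local_values: one list of candidate costs per peer.
    Returns a dict mapping each peer id to the global k smallest costs.
    """
    num_peers = len(local_values)

    # Superstep 1: every peer sends its k smallest local candidates to the master.
    outbox = defaultdict(list)
    for values in local_values:
        outbox[master].append(heapq.nsmallest(k, values))

    # Barrier: messages sent in this superstep become visible in the next one.
    inbox, outbox = outbox, defaultdict(list)

    # Superstep 2: the master merges all candidates and broadcasts the result.
    merged = heapq.nsmallest(k, (v for msg in inbox[master] for v in msg))
    for peer in range(num_peers):
        outbox[peer].append(merged)

    # Barrier: after it, every peer holds the same global top-k list.
    inbox = outbox
    return {peer: inbox[peer][0] for peer in range(num_peers)}


if __name__ == "__main__":
    print(run_bsp_top_k([[5, 1, 9], [3, 7], [2, 8, 4]], k=3))
```

The point of the sketch is only that one peer makes a global decision
between barriers, which is the kind of coordination that is awkward to
squeeze into a single aggregator value.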

We have benchmarks showing the scalability and maturity of Hama [3] and
would be glad to roll out to several other Apache projects.
BTW it would be cool if we could compare the performance of your k-means in
MapReduce with that of our BSP version, you see the benchmark in [3] as

Actually, that was not why we are here; we wanted to gauge general
interest in real-time recommendation with Hama, since all the ML guys are
here. Even if Ted is a fanboy of Giraph ;)

Regards from Berlin,

[1] http://pulse.apache.org/#incubator.apache.org
[2] http://code.google.com/p/psvm/
[3] http://wiki.apache.org/hama/Benchmarks

2012/5/26 Ted Dunning <ted.dunning@gmail.com>

> On Fri, May 25, 2012 at 11:41 PM, Edward J. Yoon <edwardyoon@apache.org
> >wrote:
> > > Compared with Hama, what's the advantage of giraph? probably
> >
> > probably mature implementation? :D
> >
> Yes.  And very active community.  And recent history of rapid development.
>  And easy compatibility with map-reduce programs.

Thomas Jungblut
Berlin <thomas.jungblut@gmail.com>
