hadoop-common-user mailing list archives

From Steve Loughran <ste...@apache.org>
Subject Re: Parallell maps
Date Fri, 03 Jul 2009 14:39:58 GMT
Mark Kerzner wrote:
> That's awesome information, Marcus.
> I am working on a project which would require a similar architectural
> solution (although unlike you I can't broadcast the details), so that was
> very useful. One thing I can say though is that mine is in no way a
> competitor, being in a different area.
> If I could find out more - would be even better. For example, how do you do
> Page Rank. Although I think that I have seen PageRank algorithm in MR
> somewhere (Google actually playfully revealing the secret), and surely
> Pregel promises this code in 15 lines.

Paolo's ranking code is now checked in to our public SVN repository:


* It may say LGPL on it, but we plan to do a bulk switch of the entire
code from that license to Apache shortly; I'm just keeping every
header consistent until then.

* The build file is standalone.xml; it builds against hadoop 18.4. The
[...] and ivy file build it against my branch of 0.21, which is
currently fixed to SVN_TRUNK of Jun 12-13, pre-fork.

* The code is interesting as it's fairly CPU-intensive for not much
data; the opposite of terasort. I'm thinking of ways to do
power/performance tests on some machines to see what kind of box is
most energy efficient at doing the work.
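
One way to frame "most energy efficient at doing the work" is energy per job: average power draw multiplied by wall-clock time for the same job on each box. The sketch below is only an illustration of that arithmetic. The class name, method names, machine names and all figures are hypothetical, not measurements and not part of the ranking code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class EnergyPerJob {
    /** Energy in joules = average power draw (watts) x wall-clock time (seconds). */
    static double joulesPerJob(double avgWatts, double seconds) {
        return avgWatts * seconds;
    }

    /** Returns the name of the box that used the least energy for the same job. */
    static String mostEfficient(Map<String, double[]> runs) {
        String best = null;
        double bestJoules = Double.POSITIVE_INFINITY;
        for (Map.Entry<String, double[]> e : runs.entrySet()) {
            // Each value is {average watts, seconds}.
            double joules = joulesPerJob(e.getValue()[0], e.getValue()[1]);
            if (joules < bestJoules) {
                bestJoules = joules;
                best = e.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Hypothetical figures for illustration only: {average watts, seconds}.
        Map<String, double[]> runs = new LinkedHashMap<>();
        runs.put("big-box", new double[] {300.0, 600.0});    // 180000 J
        runs.put("small-box", new double[] {80.0, 1800.0});  // 144000 J
        System.out.println(mostEfficient(runs));
    }
}
```

With these made-up numbers the slower box wins on energy even though it takes three times as long, which is the kind of trade-off such a test would be looking for.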

Steve Loughran                  http://www.1060.org/blogxter/publish/5
Author: Ant in Action           http://antbook.org/
